Advancing AI through the discipline of deliberate practice — rigorous evaluation, open science, and the belief that mastery is earned, not given.
We believe AI systems should practice, not just train.
In classical music, a performer doesn't simply learn the notes — they return to difficult passages again and again, refining technique through deliberate, structured repetition. Each practice session targets a specific weakness. Each run-through is measured against a higher standard.
We believe AI development deserves the same discipline. Not just training on vast datasets, but practicing against rigorous benchmarks. Not just scaling parameters, but identifying weaknesses and working to overcome them. The path to intelligence isn't a shortcut; it's a practice room.
The name "Etude" comes from the French "étude," meaning "study." In classical music, an etude is a composition designed to perfect a specific technique through deliberate practice. Chopin's etudes, for instance, are not merely exercises: they are works of art that transform technical challenge into beauty.
We founded Etude AI with this philosophy at our core. We saw a field obsessed with scale but often indifferent to rigor — models measured by benchmarks that no longer challenged them, evaluated by metrics that no longer meant anything. We believed there was a better way.
From Ontario, Canada, we set out to build the tools, frameworks, and benchmarks that would bring deliberate practice to artificial intelligence. Not just bigger models, but better evaluation. Not just more data, but deeper understanding. Every etude we compose is designed to reveal what a model truly knows — and what it still has left to learn.
Three pillars guide everything we build — each one inspired by the discipline of the practice room.
We design benchmarks that probe genuine understanding, not pattern matching. Our evaluations are dynamic, adversarial, and grounded in real-world complexity — because a test that can be gamed teaches nothing.
Inspired by how musicians master their craft, we build frameworks for iterative self-improvement. Systems that identify weaknesses, target them with precision, and measure progress with honesty.
Every benchmark, dataset, and tool we create is open-source and peer-reviewed. Reproducibility is not optional. Science that can't be scrutinized isn't science — it's marketing.
We are a small founding team of researchers and engineers passionate about rigorous AI evaluation. United by the conviction that intelligence is earned through practice, we bring together expertise in machine learning, benchmarking, multimodal systems, and open-source development.
Every member of our team shares a deep respect for craft — the belief that the details matter, that measurement must be honest, and that the best work comes from patient, deliberate effort.
The principles that shape our work, our culture, and the standards we hold ourselves to.
We hold our work to the highest standard. Every claim is tested, every benchmark validated, every result reproducible. Precision is not pedantry — it is respect for the truth.
Our research, code, and data are open by default. We publish our methods so others can build on them, challenge them, and improve them. Science advances through transparency.
Like a musician polishing a phrase until it sings, we care deeply about the quality of our work. The elegance of an evaluation matters as much as its coverage. Details are not incidental — they are the work.
We know what we don't know. We design evaluations that reveal our own blind spots. The hardest part of practice is confronting what still needs work — and we welcome that discomfort.