Turn Eval Results Into a Better Model

Free Lesson

Part of AI Product Engineering

Hosted by Will Brown, Florian Brand, and Hamel Husain

Go deeper with a course

Featured in Lenny’s List

AI Evals For Engineers & PMs

Hamel Husain and Shreya Shankar

279 students

What you'll learn

Build the right environment

Turn your system into an environment in which you can evaluate and train models.

Know when to own a model

Decide when owning and training a model beats renting a closed one.

Close the loop with RL

Use reinforcement learning so the model gets better at your real tasks each round.

Why this topic matters

Environments are the foundation for evaluating and training models and agentic systems. They define the tasks, provide the context and tools, and determine how success is measured. Florian and Will from Prime Intellect will show you how to build effective environments, avoid common pitfalls, and use them to evaluate and train stronger models.
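
To make the three roles of an environment concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the API of the verifiers library or of any Prime Intellect tooling: the names Task, Environment, score, evaluate, and the toy "add" tool are all hypothetical, and exact-match scoring is just one simple choice of success measure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration only; not taken from any real library.
Tool = Callable[[str], str]
Policy = Callable[[str, Dict[str, Tool]], str]


@dataclass
class Task:
    prompt: str    # the context shown to the model
    expected: str  # reference answer used for scoring


@dataclass
class Environment:
    tasks: List[Task]                                    # defines the tasks
    tools: Dict[str, Tool] = field(default_factory=dict)  # provides the tools

    def score(self, task: Task, answer: str) -> float:
        # Success measure (an assumption): exact match after trimming whitespace.
        return 1.0 if answer.strip() == task.expected else 0.0

    def evaluate(self, policy: Policy) -> float:
        # Run the policy on every task and return the mean reward.
        rewards = [self.score(t, policy(t.prompt, self.tools)) for t in self.tasks]
        return sum(rewards) / len(rewards)


def add_tool(expr: str) -> str:
    # Toy tool: adds two integers written as "a+b".
    a, b = expr.split("+")
    return str(int(a) + int(b))


env = Environment(
    tasks=[Task("2+3", "5"), Task("10+7", "17")],
    tools={"add": add_tool},
)


def tool_policy(prompt: str, tools: Dict[str, Tool]) -> str:
    # Stand-in for a model that calls the tool correctly.
    return tools["add"](prompt)


def constant_policy(prompt: str, tools: Dict[str, Tool]) -> str:
    # Stand-in for a weak model that always gives the same answer.
    return "5"


tool_score = env.evaluate(tool_policy)          # solves both tasks
constant_score = env.evaluate(constant_policy)  # solves only the first task
print(tool_score, constant_score)
```

In this sketch the same per-task score that ranks the two stand-in policies could also serve as a reward signal during training, which is why the task set, the tools, and the scoring rule are worth defining together.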

You'll learn from

Will Brown

Research Lead at Prime Intellect

Will Brown is Research Lead at Prime Intellect, where he builds open-source research and infrastructure for agentic reinforcement learning, including the verifiers library. He holds a PhD in algorithmic game theory from Columbia, co-advised by Christos Papadimitriou and Tim Roughgarden, and previously worked in Morgan Stanley's machine learning research group.

Florian Brand

Research Engineer at Prime Intellect

Florian Brand is a Research Engineer at Prime Intellect, where he works on LLM evals. He focuses on the practical side of evaluation: reproducibility, infrastructure, and building signals that models can actually improve on. He also writes and contributes to discussions on open models.

Hamel Husain

ML Engineer with 20+ years of experience

Hamel Husain is an ML Engineer with 20+ years of experience. He has worked with companies such as Airbnb and GitHub, where his work included early LLM research for code understanding that was used by OpenAI. He has also led and contributed to numerous popular open-source machine-learning tools. Hamel is currently an independent developer helping companies with applied evals.
