Cohort-based courses
Guided programs to get real results.
AI Evals For Engineers & PMs
4.7
·4 weeks·Sep 7 – Oct 2
Hamel Husain ML Engineer with 20 years of experience
Shreya Shankar ML Systems & Applied AI Evals Researcher
Beyond Evals: Designing Improvement Flywheels for AI Products
NEW·3 weeks·Jun 6 – Jun 27
Aishwarya Naresh Reganti AI Founder & Advisor to F500s | Ex-AWS
1-day workshops
Short, focused sessions to build specific skills.
Free Lightning Lessons
Interactive sessions to explore new topics.
How to Setup Evals For Agents
·30 minutes·1,611 Students·Watch
Harrison Chase, Hamel Husain, and
Raise Your Technical Bar as an AI-Native PM
·30 minutes·15,831 Students·Watch
Jason P. Yoong and Gayathri Keerthana (GK)
Debug the weird stuff your AI does (in less than 1 hour)
·45 minutes·5,165 Students·Watch
Marily Nika and Hamel Husain
From Automation to Multi-Agent Architectures
·3 lessons·1,351 Students·Watch
Hamza Farooq
Vibe Code Annotation UIs for AI Analytics Evals
·Jun 24·60 minutes·633 Students·Live
Shane Butler
Build Your AI Evals & Analytics Playbook
·30 minutes·509 Students·Watch
Stella Liu and Amy Chen
AI Evals for Product Managers
·60 minutes·2,011 Students·Watch
Anshumani Ruddra
Debug Cursor Agent Failures Before Production
·Jun 10·30 minutes·39 Students·Live
Carmelo Iaria
Evaluation Driven Development for Agentic AI Systems
·45 minutes·586 Students·Watch
Aurimas Griciūnas
Production Grade AI Evals by Braintrust.dev
·30 minutes·490 Students·Watch
Mengying Li
Practical Evaluation Strategies for AI Agents
·45 minutes·471 Students·Watch
Hamza Farooq and Gabriela de Queiroz
Ship a Production Cursor Agent System in 30 Minutes
·Jun 24·30 minutes·93 Students·Live
Carmelo Iaria
Learn Agentic AI: Setting agents metrics and evaluations
·45 minutes·855 Students·Watch
Mahesh Yadav
How to Drive AI Evals Adoption
·30 minutes·325 Students·Watch
Dr Sebastian Fox
Evals for Voice AI: Learnings from Google Evals Team
·30 minutes·242 Students·Watch
Ravin Kumar
Design Evals Users Will Trust
·45 minutes·770 Students·Watch
Aishwarya Naresh Reganti
Evals for Everyone
·3 lessons·2,195 Students·Watch
Setting Eval for AI Agents & Scaling with Auto-Evaluation
·30 minutes·862 Students·Watch
Mahesh Yadav
Modern Information Retrieval Evaluation In The RAG Era
·45 minutes·5,281 Students·Watch
Nandan Thakur, Hamel Husain, and Shreya Shankar
Build Your Own Eval Tools With Notebooks!
·45 minutes·612 Students·Watch
Vincent D. Warmerdam, Hamel Husain, and Shreya Shankar
Part 3: Building Robust Evaluations for AI Agents
·60 minutes·140 Students·Watch
Hamza Farooq and Gabriela de Queiroz
Setting up your first AI eval with a LLM-as-judge
·45 minutes·62 Students·Watch
Madalina Turlea and Catalina Turlea
Collaborative AI Evals with Human Feedback
·30 minutes·113 Students·Watch
Rogério Chaves
Run Eval Loops and Guardrails for Cursor Agents
·May 27·30 minutes·79 Students·Live
Carmelo Iaria
Master Evaluation Techniques for LLM Apps
·30 minutes·412 Students·Watch
Haroon Choudery
Mastering LLM Application Testing
·30 minutes·240 Students·Watch
Hugo Bowne-Anderson and Stefan Krawczyk
Evaluating Agentic AI Applications Beyond Vibe Checks
·45 minutes·1,250 Students·Watch
Aishwarya Naresh Reganti, Kiriti Badam, and Claire Longo
How Evals Made GitHub Copilot Happen
·30 minutes·892 Students·Watch
John Berryman, Shawn Simister, and Hamel Husain
Reliable RAG Agents: Intent-Driven Failure Detection
·60 minutes·298 Students·Watch
Jason Liu and Ben Hylak
Strategies for building self-improving document processing
·60 minutes·429 Students·Watch
Jason Liu and Eli Badgio
The Hidden Signal in Production AI Logs
·60 minutes·172 Students·Watch
Jason Liu and Scott Clark
Calibrate LLM-as-a-judge for Real-world Impact
·45 minutes·205 Students·Watch
Eddie Landesberg
Evaluating AI Agents before Users Break Them
·60 minutes·88 Students·Watch
Aki Wijesundara, PhD, Marc Klingen, and Lotte Verheyden
Automating Evals With Claude Code + Phoenix
·60 minutes·2,354 Students·Watch
Mikyo King and Hamel Husain
How to test and improve your AI agents
·45 minutes·167 Students·Watch
Jacob Bank
Go Beyond AI Evals: Diagnose and Decide
·45 minutes·52 Students·Watch
Rajiv Shah
Understand SHAP (SHapley Additive exPlanations)
·30 minutes·310 Students·Watch
Patrick Hall
Improve reliability of your AI applications
·30 minutes·747 Students·Watch
Shreya Rajpal
Evaluating AI Agents
·45 minutes·1,428 Students·Watch
Amir Feizpour and Samuel Dion-Girardeau
De-Risking LLM Model Switches w Evals & Prompt Optimization
·45 minutes·145 Students·Watch
Amir Feizpour and Hugo Mailhot
Evaluate AI agents with Confidence
·45 minutes·800 Students·Watch
Mahesh Yadav
Error Analysis: The AI Engineer’s Best ROI
·60 minutes·1,514 Students·Watch
Hamel Husain and Shreya Shankar
🛠 Synthetic Data Flywheels: Build Reliable LLM Apps Faster
·30 minutes·187 Students·Watch
Hugo Bowne-Anderson and Stefan Krawczyk
Understanding Embedding Performance through Generative Evals
·60 minutes·1,181 Students·Watch
Jason Liu and Kelly Hong
Online Evals and Production Monitoring
·60 minutes·831 Students·Watch
Jason Liu, Ben Hylak, and Sidhant Bendre
Optimize Structured Data Retrieval With Evals
·45 minutes·843 Students·Watch
Daniel Svonava and Hamel Husain
Scaling Judge-Time Compute for Robust Auto LLM Evaluation
·60 minutes·489 Students·Watch
Jason Liu and Leonard Tang
How OpenAI Customers Use Evals To Build Better AI Products
·30 minutes·1,080 Students·Watch
Jim Blomo and Hamel Husain
Optimize Your Dev Setup For Evals w/ Cursor Rules & MCP
·30 minutes·686 Students·Watch
Isaac Flath, Hamel Husain, and Shreya Shankar
How You Catch Production Hallucinations in Real Time
·60 minutes·504 Students·Watch
Jason Liu and Julia Neagu
Don't Tweak Prompts. Engineer Agents.
·30 minutes·274 Students·Watch
Hugo Bowne-Anderson and Skylar Payne
Synthetic RAG evaluation
·60 minutes·210 Students·Watch
Alexey Grigorev and Doug Turnbull
Stay Ahead in AI: Evaluate Any New LLM in 15 Minutes
·30 minutes·93 Students·Watch
Sherveen Mashayekhi
AI Systems Under Pressure: Red-Team Before You Ship
·60 minutes·802 Students·Watch
Krystal Jackson
Create MCP Tool Evals Before You Ship
·45 minutes·282 Students·Watch
Emmanuel Paraskakis
How to test AI when you don't have any data yet
·45 minutes·23 Students·Watch
Madalina Turlea and Catalina Turlea
Scale Evals Without the Chaos
·45 minutes·248 Students·Watch
Aishwarya Naresh Reganti
Evals in Action With Arize
·45 minutes·200 Students·Watch
Laurie Voss
