Engineering

AI Evals

Cohort-based courses

Guided programs to get real results.

AI Evals For Engineers & PMs

4.7·4 weeks·Sep 5 – Oct 3
Hamel Husain ML Engineer with 25 years of experience
Shreya Shankar ML Systems & Applied AI Evals Researcher

Production-Ready Systems with LLMs and Agents: An Intensive for Engineers

NEW·6 weeks·Jul 13 – Aug 23
Ehsan Gazar Principal Engineer | AI & System Design

Hardcore Agentic Engineering for builders who ship

NEW·3 weeks·Aug 3 – Aug 21
Greg Ceccarelli Founder, SpecStory • Ex-CPO, Pluralsight

+ Sean Johnson and Jake Levirne

AI Evals and Analytics Playbook

5.0·3 weeks·Jun 22 – Jul 13
Stella Liu Head of AI Applied Science
Amy Chen Cofounder, AI Evals & Analytics

Beyond Evals: Designing Improvement Flywheels for AI Products

NEW·3 weeks·Sep 19 – Oct 10
Aishwarya Naresh Reganti AI Founder & Advisor to F500s | Ex-AWS
Kiriti Badam Applied AI @ OpenAI Codex | Ex-Google

Build a Software Factory: Hands-off agentic coding for experienced engineers

3 weeks·Jul 14 – Aug 6
Matt Wynne Cucumber co-founder, BDD pioneer

+ Aldric Giacomoni, Jeremy Lightsmith, and David Laing

1-day workshops

Short, focused sessions to build specific skills.

Building Multi-Agent Forecasting Systems

NEW·7 hours·Jun 27
Stefan Jansen Author, ML for Trading · Applied AI

Debug Your AI Product: Private Team Workshop

Jun 30
Hamel Husain ML Engineer with 20+ years experience
Shreya Shankar ML Systems Researcher

AI Trust at Scale. From Evals to Governance

NEW·3 hours·Jul 11
Subha Shetty Fractional Chief Product and AI Officer

Free Lightning Lessons

Interactive sessions to explore new topics.
  • Mastering Agentic RAG & AI Evals

    Watch·60 minutes·1,146 Students
    Dr. Ryan Ahmed, Ph.D., MBA and Kukesh Kodess
  • Ship a Production Cursor Agent System in 30 Minutes

    Live·Jun 24·30 minutes·164 Students
    Carmelo Iaria
  • How to Setup Evals For Agents

    Watch·30 minutes·1,848 Students
    Harrison Chase, Hamel Husain, and
  • Pressure-test any AI analysis

    Live·Jun 24·60 minutes·781 Students
    Shane Butler, Sravya Madipalli, and Hai Guan
  • Raise Your Technical Bar as an AI-Native PM

    Watch·30 minutes·15,903 Students
    Jason P. Yoong and Gayathri Keerthana (GK)
  • Build Multi-Agent Systems You Can Audit

    Live·Jun 24·30 minutes·37 Students
    Stefan Jansen
  • From trading idea to validated strategy

    Live·Jun 24·30 minutes·21 Students
    Stefan Jansen
  • The New Frontier of AI Search

    Live·Jun 30·75 minutes·16 Students
    Trey Grainger and Doug Turnbull
  • AI Evals for Product Managers

    Watch·60 minutes·2,042 Students
    Anshumani Ruddra
  • Modern Information Retrieval Evaluation In The RAG Era

    Watch·45 minutes·5,365 Students
    Nandan Thakur, Hamel Husain, and Shreya Shankar
  • Build Your AI Evals & Analytics Playbook

    Watch·30 minutes·532 Students
    Stella Liu and Amy Chen
  • Design Evals Users Will Trust

    Watch·45 minutes·781 Students
    Aishwarya Naresh Reganti
  • Evals in Action With Arize

    Watch·45 minutes·210 Students
    Laurie Voss
  • Setting Eval for AI Agents & Scaling with Auto-Evaluation

    Watch·30 minutes·866 Students
    Mahesh Yadav
  • Evaluate AI agents with Confidence

    Watch·45 minutes·803 Students
    Mahesh Yadav
  • Production Grade AI Evals by Braintrust.dev

    Watch·30 minutes·509 Students
    Mengying Li
  • Scale Evals Without the Chaos

    Watch·45 minutes·255 Students
    Aishwarya Naresh Reganti
  • Collaborative AI Evals with Human Feedback

    Watch·30 minutes·120 Students
    Rogério Chaves
  • Debug the weird stuff your AI does (in less than 1 hour)

    Watch·45 minutes·5,171 Students
    Marily Nika and Hamel Husain
  • Automating Evals With Claude Code + Phoenix

    Watch·60 minutes·2,362 Students
    Mikyo King and Hamel Husain
  • Evals for Everyone

    Watch·3 lessons·2,202 Students
    Aishwarya & Kiriti
  • Evaluating AI Agents

    Watch·45 minutes·1,434 Students
    Amir Feizpour and Samuel Dion-Girardeau
  • From Automation to Multi-Agent Architectures

    Watch·3 lessons·1,361 Students
    Hamza Farooq
  • Build Your Own Eval Tools With Notebooks!

    Watch·45 minutes·615 Students
    Vincent D. Warmerdam, Hamel Husain, and Shreya Shankar
  • Evaluation Driven Development for Agentic AI Systems

    Watch·45 minutes·591 Students
    Aurimas Griciūnas
  • Strategies for building self-improving document processing

    Watch·60 minutes·431 Students
    Jason Liu and Eli Badgio
  • How to Drive AI Evals Adoption

    Watch·30 minutes·333 Students
    Dr Sebastian Fox
  • Stay Ahead in AI: Evaluate Any New LLM in 15 Minutes

    Watch·30 minutes·95 Students
    Sherveen Mashayekhi
  • Evaluating AI Agents before Users Break Them

    Watch·60 minutes·89 Students
    Aki Wijesundara, PhD, Marc Klingen, and Lotte Verheyden
  • Run Eval Loops and Guardrails for Cursor Agents

    Watch·30 minutes·88 Students
    Carmelo Iaria
  • Setting up your first AI eval with a LLM-as-judge

    Watch·45 minutes·68 Students
    Madalina Turlea and Catalina Turlea
  • Go Beyond AI Evals: Diagnose and Decide

    Watch·45 minutes·55 Students
    Rajiv Shah
  • Debug Cursor Agent Failures Before Production

    Watch·30 minutes·46 Students
    Carmelo Iaria
  • How to test AI when you don't have any data yet

    Watch·45 minutes·26 Students
    Madalina Turlea and Catalina Turlea
  • Error Analysis: The AI Engineer’s Best ROI

    Watch·60 minutes·1,518 Students
    Hamel Husain and Shreya Shankar
  • Evaluating Agentic AI Applications Beyond Vibe Checks

    Watch·45 minutes·1,255 Students
    Aishwarya Naresh Reganti, Kiriti Badam, and Claire Longo
  • Understanding Embedding Performance through Generative Evals

    Watch·60 minutes·1,181 Students
    Jason Liu and Kelly Hong
  • How OpenAI Customers Use Evals To Build Better AI Products

    Watch·30 minutes·1,083 Students
    Jim Blomo and Hamel Husain
  • How Evals Made GitHub Copilot Happen

    Watch·30 minutes·893 Students
    John Berryman, Shawn Simister, and Hamel Husain
  • Learn Agentic AI: Setting agents metrics and evaluations

    Watch·45 minutes·861 Students
    Mahesh Yadav
  • Optimize Structured Data Retrieval With Evals

    Watch·45 minutes·843 Students
    Daniel Svonava and Hamel Husain
  • Online Evals and Production Monitoring

    Watch·60 minutes·831 Students
    Jason Liu, Ben Hylak, and Sidhant Bendre
  • AI Systems Under Pressure: Red-Team Before You Ship

    Watch·60 minutes·803 Students
    Krystal Jackson
  • Improve reliability of your AI applications

    Watch·30 minutes·747 Students
    Shreya Rajpal
  • Optimize Your Dev Setup For Evals w/ Cursor Rules & MCP

    Watch·30 minutes·689 Students
    Isaac Flath, Hamel Husain, and Shreya Shankar
  • How You Catch Production Hallucinations in Real Time

    Watch·60 minutes·505 Students
    Jason Liu and Julia Neagu
  • Scaling Judge-Time Compute for Robust Auto LLM Evaluation

    Watch·60 minutes·489 Students
    Jason Liu and Leonard Tang
  • Practical Evaluation Strategies for AI Agents

    Watch·45 minutes·476 Students
    Hamza Farooq and Gabriela de Queiroz
  • Master Evaluation Techniques for LLM Apps

    Watch·30 minutes·413 Students
    Haroon Choudery
  • Understand SHAP (SHapley Additive exPlanations)

    Watch·30 minutes·311 Students
    Patrick Hall
  • Reliable RAG Agents: Intent-Driven Failure Detection

    Watch·60 minutes·298 Students
    Jason Liu and Ben Hylak
  • Create MCP Tool Evals Before You Ship

    Watch·45 minutes·283 Students
    Emmanuel Paraskakis
  • Don't Tweak Prompts. Engineer Agents.

    Watch·30 minutes·274 Students
    Hugo Bowne-Anderson and Skylar Payne
  • Evals for Voice AI: Learnings from Google Evals Team

    Watch·30 minutes·246 Students
    Ravin Kumar
  • Mastering LLM Application Testing

    Watch·30 minutes·240 Students
    Hugo Bowne-Anderson and Stefan Krawczyk
  • Synthetic RAG evaluation

    Watch·60 minutes·214 Students
    Alexey Grigorev and Doug Turnbull
  • Calibrate LLM-as-a-judge for Real-world Impact

    Watch·45 minutes·209 Students
    Eddie Landesberg
  • 🛠 Synthetic Data Flywheels: Build Reliable LLM Apps Faster

    Watch·30 minutes·187 Students
    Hugo Bowne-Anderson and Stefan Krawczyk
  • The Hidden Signal in Production AI Logs

    Watch·60 minutes·173 Students
    Jason Liu and Scott Clark
  • How to test and improve your AI agents

    Watch·45 minutes·167 Students
    Jacob Bank
  • De-Risking LLM Model Switches w Evals & Prompt Optimization

    Watch·45 minutes·145 Students
    Amir Feizpour and Hugo Mailhot
  • Part 3: Building Robust Evaluations for AI Agents

    Watch·60 minutes·144 Students
    Hamza Farooq and Gabriela de Queiroz


Contact support: support@maven.com


© 2026 Maven Learning, Inc.