AI Evals and Analytics Playbook

Stella Liu

Head of AI Applied Science

Amy Chen

Cofounder, AI Evals & Analytics

Learn How to Evaluate GenAI Products Confidently in 3 Weeks

Building AI is easier than ever.
Building AI that is reliable, safe, and trusted is not.

AI evaluations help teams measure quality, detect failures, and build guardrails so that their GenAI systems hold up in the real world.

🔹

Who this course is for

Data scientists, QA/QE professionals, and PMs who:

  • have no prior background in AI evals, or

  • have tried LLM-as-a-judge but want a clearer framework.

🔹

How the course works

  • Self-paced lectures you can watch anytime

  • Hands-on exercises using a pre-built GenAI product

  • Dedicated office hours for questions and guidance

🔹

What makes this course different

  • Playbook-driven approach

  • Evaluation-driven development

  • Product-focused

  • Hands-on experience

  • Flexible learning + expert support

🔹

Curious about the ideas behind this course?

You can explore related topics in the Data Science × AI Substack, where we share perspectives on evaluating real GenAI products.

What you’ll learn

Learn how to evaluate, ship, and monitor GenAI products that perform reliably in production.

  • Design qualitative evaluation workflows for AI systems

  • Implement automated quantitative evaluations at scale

  • Use online evals and analytics to inform iteration, rollback, and roadmap decisions

  • Diagnose how GenAI systems fail and where risks emerge

  • Design guardrails and evaluation strategies before release

  • Monitor AI behavior in production to detect and manage risks

  • Gain hands-on experience running an AI evals project

  • Learn how to design AI evals metrics that drive decisions

  • Apply Evaluation-Driven Development (EDD) to guide product iteration during development

Learn directly from Stella & Amy

Stella Liu

Head of AI Applied Science

Shopify
Carvana
Harvard University
Amy Chen

Cofounder at AI Evals & Analytics

System1
Uptake
Seasalt.ai
UCLA
University of Washington

Who this course is for

  • Product Leaders

    Building GenAI products and looking for a clear playbook for evaluation frameworks, success metrics, and shipping with confidence.

  • Data Scientists

    Leveling up with AI product skills: AI evals, LLM-as-a-judge, and metric design for GenAI systems.

  • QA and Test Engineers

    Adapting traditional testing practices to GenAI systems.

What's included

Live sessions

Learn directly from Stella Liu & Amy Chen in a real-time, interactive format.

Your First AI Evals and Analytics Playbook

Create your first AI Evals playbook and apply it to your current projects.

Glossary Sheet

Master the terminology with clear definitions and practical examples for every key concept in AI Evals.

Lifetime access

Go back to course content and recordings whenever you need to.

Community of peers

Stay accountable and share insights with like-minded professionals.

Certificate of completion

Share your new skills with your employer or on LinkedIn.

Maven Guarantee

This course is backed by the Maven Guarantee. Students are eligible for a full refund up until the halfway point of the course.

Course syllabus

3 live sessions • 66 lessons

Week 1

Apr 20—Apr 26

    Module 0: What Is AI Evals?

    1 item

    Module 1: Getting Buy-in

    6 items

    Module 2: Framework Overview

    6 items

    💡 Topic: Evaluation Driven Development

    1 item

    Module 3: Evaluation Methodologies

    5 items

    Module 4: Qualitative Evaluations

    4 items

    Week 1 Live Session - Q&A, Meet and Share

    Sat 4/25, 6:00 PM–7:00 PM (UTC)

Week 2

Apr 27—May 3

    Module 5: Quantitative Evals Fundamentals

    8 items

    Module 6: Quantitative Evals - Test Set Generation

    8 items

    Module 7: Quantitative Evals In Action

    5 items

    Workshop with Truesight

    1 item

    Week 2 Live Session - Q&A, Meet and Share

    Sat 5/2, 6:00 PM–7:00 PM (UTC)

    💡 Topics

    7 items

Schedule

Live sessions

4 hrs / week

    • Sat, Apr 25

      6:00 PM—7:00 PM (UTC)

    • Sat, May 2

      6:00 PM—7:00 PM (UTC)

    • Sat, May 9

      6:00 PM—7:00 PM (UTC)

Projects

1 hr / week

Async content

1 hr / week

$1,399 USD · Apr 20–May 11