AI Evals and Analytics Playbook

Stella Liu

Head of Applied Science, AI Evals

Amy Chen

Co-founder, AI Evals and Analytics

Build Trust for Your AI Product With Evaluation and Analytics

👉 Why This Matters

In a landscape saturated with AI marketing claims, trust and proof are what set great products apart.

If you are building products in healthcare, legal, education, or any other highly specialized industry, you know you cannot rely on “vibe checks.” You need evidence and defensible proof to ensure safety, accuracy, compliance, and reliability.

AI systems are inherently probabilistic. What truly differentiates your product is how well you evaluate, monitor, and iteratively improve your AI system.

👉 The Solution

Pre-release evaluations and post-release analytics give you the evidence you need to validate performance, guide iteration, and earn user trust.

👉 About This Course

This is NOT a coding course.

It is a methodology and framework course for product and data leaders building AI products that will outlast the AI hype.

You will apply the battle-tested AI Evals & Analytics Playbook to your own real project, combining:

  • Automated evaluations for scalability

  • Human reviews for nuance

  • Analytics for real-world reliability

By the end of this course, you will have an actionable, organization-ready evaluation and analytics plan tailored to your own AI product.

What you’ll learn

Build trust and proof for your AI product through pre-release testing, in-production monitoring, and analytics.

  • 4 sessions (2 hours each) over 2 weeks

  • Build your first AI Evals and Analytics playbook for your current product use case

  • Create clear ownership: who writes rubrics, validates metrics, and holds veto power

  • Keep Legal, Trust & Safety, and SMEs aligned without evaluation becoming a bottleneck

  • Run sniff tests and build quantitative evaluation pipelines

  • Design experiments that prove your AI beats the baseline

  • Know when to use human labels vs. LLM-as-a-judge (see the sketch after this list)

  • Handle stochastic outputs and subjective quality with proper experiment design

  • Choose the right methodology, set sample sizes, and define guardrails that protect business metrics

  • Set up leading indicators (retry rates, confidence scores) and lagging metrics (CSAT, cost)

  • Build escalation procedures and run structured post-launch reviews at 15, 30, and 60 days
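
To make the LLM-as-a-judge idea concrete, here is a minimal sketch of what one automated grading step can look like. The course itself stays code-free; the OpenAI client, the gpt-4o-mini judge model, and the toy rubric below are illustrative assumptions, not prescribed tooling:

    # Illustrative LLM-as-a-judge sketch. Assumptions: the `openai` Python
    # package, an OPENAI_API_KEY in the environment, "gpt-4o-mini" as the
    # judge model, and a toy 1-5 quality rubric.
    from openai import OpenAI

    client = OpenAI()

    RUBRIC = (
        "Score the ANSWER to the QUESTION from 1 (poor) to 5 (excellent) "
        "for factual accuracy and helpfulness. Reply with the number only."
    )

    def judge(question: str, answer: str) -> int:
        """Ask the judge model for a 1-5 quality score."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed judge model; swap in your own
            temperature=0,        # keep grading as repeatable as possible
            messages=[
                {"role": "system", "content": RUBRIC},
                {"role": "user", "content": f"QUESTION: {question}\nANSWER: {answer}"},
            ],
        )
        return int(resp.choices[0].message.content.strip())

    # Sanity check on one example; in practice, run the judge over a small
    # human-labeled set first and only scale it up once the two agree.
    print(judge("What is 2 + 2?", "4"))

Before trusting a judge like this at scale, you would validate it against a sample of human labels; deciding when that trade-off pays off is one of the questions this course covers.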

Learn directly from Stella & Amy

Stella Liu

Head of Applied Science, AI Evals

Shopify · Carvana · Harvard University · Arizona State University

Amy Chen

Co-founder, AI Evals and Analytics

System1 · Uptake · Seasalt.ai · UCLA · University of Washington

Who this course is for

  • Product Leaders

    Building AI products and looking for a clear playbook for evaluation frameworks, success metrics, and shipping with confidence.

  • Data Leaders

    Redefining team scope and structure in the AI era, aligning evaluation, analytics, and accountability.

  • Seasoned Data Scientists

    Leveling up with AI product skills: learn AI evals and LLM-as-a-judge, and design AI-specific performance metrics.

What's included

Live sessions

Learn directly from Stella Liu & Amy Chen in a real-time, interactive format.

Your First AI Evals and Analytics Playbook

Create your first AI Evals playbook and apply it to your current projects.

Glossary Sheet

Master the terminology with clear definitions and practical examples for every key concept in AI Evals.

Lifetime access

Go back to course content and recordings whenever you need to.

Community of peers

Stay accountable and share insights with like-minded professionals.

Certificate of completion

Share your new skills with your employer or on LinkedIn.

Maven Guarantee

This course is backed by the Maven Guarantee. Students are eligible for a full refund up until the halfway point of the course.

Course syllabus

4 live sessions • 7 lessons

Week 1

Jan 17—Jan 18

    Getting Ready for Session 1 (1 item)

    Session 1: AI Evaluations - Why & What
    Sat 1/17, 7:00 PM—9:00 PM (UTC)

    Getting Ready for Session 2 (1 item)

    Session 2: How to Evaluate AI Products
    Sun 1/18, 7:00 PM—9:00 PM (UTC)

Week 2

Jan 19—Jan 25

    Getting Ready for Session 3 (1 item)

    Session 3: Pre-release Evals Cont. + Experiments
    Sat 1/24, 7:00 PM—9:00 PM (UTC)

    Getting Ready for Session 4 (1 item)

    Session 4: Monitoring & Analytics + Evals Tools
    Sun 1/25, 7:00 PM—9:00 PM (UTC)

Free lesson

Build Your AI Evals & Analytics Playbook

Understand What AI Evals Are and Why They Matter Now

AI Evals are part of your PRD and help you ship products confidently and safely.

Learn the Core AI Evals Framework

Learn how to define metrics, test model quality, and align stakeholders from Legal to Trust & Safety.

Get Started with Your Evals Playbook

Understand how to draft your first evals plan and begin building with evals best practices.

Schedule

Live sessions

4 hrs / week

    • Sat, Jan 17

      7:00 PM—9:00 PM (UTC)

    • Sun, Jan 18

      7:00 PM—9:00 PM (UTC)

    • Sat, Jan 24

      7:00 PM—9:00 PM (UTC)

    • Sun, Jan 25

      7:00 PM—9:00 PM (UTC)

Projects

1 hr / week

Async content

1 hr / week

Save 25% until Monday

$2,250 USD · Jan 17—Feb 3 · 2 cohorts