AI Evals for PMs Certification

Marily Nika, Ph.D AI/ML

GenAI Product Builder @ Google, ex-Meta

Mark Cramer

Sr. ML Product Manager, ex-Meta

+ George Zoto and Diego Granados

Eliminate uncertainty in shipping AI features

“Does it work?”... “Is it good enough?”... “Can we ship it?”...

How do you answer these questions for AI products? You’re responsible for “running evals,” but what does that mean?

How do you choose the right metrics, interpret fuzzy results, and make a confident decision?

This course gives you a framework to do just that.

  • Map user value to evaluation (eval) objectives so your metrics aren’t abstract. Define success, then translate it into measurable criteria.

  • Choose metrics you can actually maintain: capability, safety, UX friction, latency, cost, and business outcomes such as “does this reduce support tickets or increase activation?”

  • Set ship/no-ship thresholds you can defend to leadership.

  • Build lightweight workflows that work in real teams: human review where it matters, automation where it lasts, documentation that drives decisions.

  • Consider domain constraints (e.g., healthcare safety) and know what to avoid: silent failures, misleading proxy metrics, and tests that don’t reflect production.

  • Tie everything to ROI: impact vs unit cost, eval coverage vs reliability, and the minimum viable monitoring you need post-launch.

Experience AI evals through a case-based approach with a real AI product that we evaluate together.

What you’ll learn

Develop a critical skill for product managers who lead or contribute to AI products.

  • Learn a repeatable framework for deciding when an AI feature is ready to launch.

  • Tie decisions to user value, business goals, and measurable evaluation criteria.

  • Turn fuzzy product goals into concrete eval objectives and measurable success criteria.

  • Define “good enough” in plain language before choosing metrics or tools.

  • Use a PM-friendly menu of metrics to avoid misleading proxies and anchor on business value.

  • Balance capability, latency, UX friction, and cost without being an ML engineer.

  • Create ship/no-ship thresholds tied to KPIs, risk, and user impact.

  • Know when to stop tweaking prompts and when launch should be paused.

  • Learn what to automate, what to review manually, and how to design sustainable processes.

  • Produce datasets, golden examples, and error taxonomies your team can reuse.

  • Understand risks in sensitive domains like healthcare and finance.

  • Avoid silent failures, weak proxies, and tests that don’t reflect production.

Learn directly from expert instructors

Marily Nika, Ph.D AI/ML

GenAI Product Builder @ Google, ex-Meta

Mark Cramer

Sr. ML Product Manager @ Meta

Meta, Stanford University, MIT

George Zoto

Senior NLP/AI Scientist

SHI International, National Association of REALTORS

Diego Granados

Product Manager AI&ML @ Google

Google, LinkedIn, Microsoft, Cisco

Who this course is for

  • PMs leading AI features, growth, or platform initiatives

  • PMs who partner with ML teams and want to set evaluation standards

  • PMs who need to make clear “ship or hold” calls without doing the engineering

What's included

Live sessions

Learn directly from your instructors in a real-time, interactive format.

Lifetime access

Go back to course content and recordings whenever you need to.

Community of peers

Stay accountable and share insights with like-minded professionals.

Certificate of completion

Share your new skills with your employer or on LinkedIn.

Maven Guarantee

This course is backed by the Maven Guarantee. Students are eligible for a full refund up until the halfway point of the course.

Course syllabus

Week 1

Dec 4–Dec 7

    AI Evals for PMs - AI Evals and You

    Thu 12/4, 5:00 PM–6:30 PM (UTC)

Week 2

Dec 8–Dec 14

    AI Evals for PMs - Foundations, Scoping, and Datasets

    Tue 12/9, 5:00 PM–6:30 PM (UTC)

    AI Evals for PMs - Metrics, Thresholds, and Ship-readiness

    Thu 12/11, 5:00 PM–6:30 PM (UTC)

Schedule

Live sessions

3 hrs / week

    • Thu, Dec 4, 5:00 PM–6:30 PM (UTC)

    • Tue, Dec 9, 5:00 PM–6:30 PM (UTC)

    • Thu, Dec 11, 5:00 PM–6:30 PM (UTC)

Projects

2 hrs / week

Async content

1 hr / week

Save 25% until Monday

$2,000 USD

Dec 4–Dec 19