Design Evals Users Will Trust

Hosted by Aishwarya Naresh Reganti

In this video

What you'll learn

Distinguish model evals from product evals

Learn why benchmark results that look impressive on paper often fail to predict production performance, and what to measure instead

Design your evaluation framework

Apply a three-component structure (reference datasets, metrics, and scoring methods) that scales with your AI product

Create reference datasets before launch

Build evaluation datasets that catch real failure modes, not just the obvious ones

Why this topic matters

Most AI teams ship products that pass benchmarks but fail users. The gap between "model works" and "product works" is where careers and products stall. This session gives you the systematic approach used by production AI teams at companies like OpenAI and Google to evaluate what actually matters, so you catch problems before your users do.

You'll learn from

Aishwarya Naresh Reganti

AI Founder & Advisor to F500 Leaders

Aishwarya Naresh Reganti is an Applied Science Tech Lead who leads initiatives to develop and deploy production-ready generative AI solutions for enterprise clients. With over 9 years of experience in machine learning, she has published more than 35 research papers at top-tier AI conferences, including NeurIPS, AAAI, and CVPR.

Aishwarya has taught professional courses on generative AI at renowned institutions like MIT and Oxford. She has also designed free courses that have reached over 8,000 students globally and have formed the foundation for several academic programs and industry training curricula.

Recognized as one of the most prominent voices in enterprise AI, she has over 95,000 professionals following her on LinkedIn and is frequently invited to speak at leading conferences and events, including TEDx, MLOps World, and ReWork.

Aishwarya actively collaborates with leading research professors and advises organizations on strategy, helping them use AI effectively to address complex business challenges.

Go deeper with a course

Beyond Evals: Designing Improvement Flywheels for AI Products
Aishwarya Naresh Reganti