Building AI-Native Products

Setting Evals for AI Agents & Scaling with Auto-Evaluation

Hosted by Mahesh Yadav

Fri, Jun 6, 2025

4:00 PM UTC (30 minutes)

Virtual (Zoom)

Free to join

Go deeper with a course

Agentic AI Product Management Certification
Mahesh Yadav

What you'll learn

How to evaluate non-deterministic outputs

Go beyond accuracy—learn practical ways to measure AI behavior when outcomes vary.
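For a concrete flavor of what this can look like in practice, here is a minimal sketch (not taken from the session itself) that samples the same prompt several times and reports a pass rate against a criterion instead of asserting one exact output. The names call_model and meets_criterion are hypothetical placeholders for your own model client and quality check.

```python
# Minimal sketch: score a non-deterministic model by sampling the same
# prompt several times and measuring a pass rate against a criterion,
# rather than asserting a single exact expected string.

def call_model(prompt: str) -> str:
    # Placeholder: wrap your actual LLM API call here.
    return "Our refund policy allows returns within 30 days."

def meets_criterion(output: str) -> bool:
    # Illustrative criterion: the answer must mention the refund policy.
    return "refund" in output.lower()

def pass_rate(prompt: str, n_samples: int = 10) -> float:
    outputs = [call_model(prompt) for _ in range(n_samples)]
    return sum(meets_criterion(o) for o in outputs) / n_samples

print(pass_rate("What is your refund policy?"))
```

A pass rate across many samples (say, at least 0.9) is a more honest quality signal for a non-deterministic system than a single pass/fail assertion.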

How to set success targets to launch

Define MVP-grade evaluation criteria to reduce risk and increase team alignment.
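As a rough illustration of what a launch gate can look like (the metric names and thresholds below are assumptions for the sketch, not the course's recommendations), evaluation results can be compared against explicit targets agreed with the team before shipping.

```python
# Minimal sketch of an MVP launch gate: evaluation results are checked
# against explicit success targets. Metrics and thresholds are illustrative.

LAUNCH_TARGETS = {
    "task_success_rate": 0.85,    # share of test cases judged correct
    "harmful_output_rate": 0.01,  # must stay at or below this ceiling
}

def ready_to_launch(results: dict[str, float]) -> bool:
    ok = results["task_success_rate"] >= LAUNCH_TARGETS["task_success_rate"]
    safe = results["harmful_output_rate"] <= LAUNCH_TARGETS["harmful_output_rate"]
    return ok and safe

print(ready_to_launch({"task_success_rate": 0.91, "harmful_output_rate": 0.004}))  # True
```

Writing the targets down as data, rather than leaving them implicit, is what creates the team alignment the session describes.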

How to scale evaluation using auto-evaluators

Use tools like OpenAI function calls, prompt-based scoring, and test suites to automate quality checks.
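As one hedged example of prompt-based scoring, the sketch below uses a grading model as an automated judge. It assumes the OpenAI Python SDK; the model name and rubric are illustrative stand-ins, not the session's prescribed setup.

```python
# Minimal sketch of a prompt-based auto-evaluator ("LLM as judge"):
# a grading model scores another model's answer on a 1-5 scale.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """Rate the ANSWER to the QUESTION for correctness and
helpfulness on a 1-5 scale. Reply with only the number.

QUESTION: {question}
ANSWER: {answer}"""

def auto_eval(question: str, answer: str) -> int:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any capable judge model works
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(question=question, answer=answer),
        }],
    )
    return int(response.choices[0].message.content.strip())
```

Run the judge over a test suite on every release and track the average score to catch quality regressions automatically.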

Why this topic matters

AI outputs are unpredictable, making traditional testing unreliable. Without clear evaluation, teams can't iterate or launch confidently. Auto-evaluators enable scalable, automated feedback to track quality, reduce risk, and align stakeholders. This is essential for shipping reliable, production-ready AI products.

You'll learn from

Mahesh Yadav

Ex-GenAI Product Lead at MAANG Firms | AI PM Coach | 10k+ Alumni

Mahesh has 20 years of experience building products on AI teams at Meta, Microsoft, and AWS. He has worked across all layers of the AI stack, from AI chips to LLMs, and has a deep understanding of how companies use AI agents to ship value to customers. His work on AI has been featured at the Nvidia GTC conference, at Microsoft Build, and on Meta blogs.

His mentorship has helped many students build real-world products and careers in the Agentic AI PM space.

Whether you're a hobbyist or a professional looking to get a grasp of GenAI Product Management, feel free to join our channels for more sessions like this.


Previously at

Google
Amazon Web Services
Meta
Microsoft
