Agent Evals 101: Best Practices for AI Product Teams

Hosted by Vinay Goel and Ken Kutyn

Tue, Jun 16, 2026

4:00 PM UTC (45 minutes)

Virtual (Zoom)

Free to join

What you'll learn

Why agent evals can't wait

Skip evals and you'll scale problems, not performance. Learn what breaks and how to spot it early.

A starting eval framework for AI product teams

Walk away with a practical structure you can apply to agent-driven workflows this week.

A 3-to-6-month maturity path for your eval practice

See how teams evolve from basic checks to a systematic, scalable measurement process.

Why this topic matters

Agent evals are the difference between AI features that ship with confidence and ones that quietly degrade user experience at scale. Teams that build this muscle early will move faster, catch failures before users do, and have the data to make the case for what to build next. This session will give you a practical system for measuring AI agent performance.

You'll learn from

Vinay Goel

Staff AI Engineer at Amplitude

Vinay builds the foundational AI platforms that empower internal innovation and help define the future of AI analytics.

Ken Kutyn

Solutions Engineer at Amplitude with 10+ years in experimentation & AI evals.

Amplitude
Included Health
Optimizely
Oracle