Tue, Jun 16, 2026
4:00 PM UTC (45 minutes)
Virtual (Zoom)
Free to join
What you'll learn
Why agent evals can't wait
Skip evals and you'll scale problems, not performance. See what breaks and how to spot it early.
A starting eval framework for AI product teams
Walk away with a practical structure you can apply to agent-driven workflows this week.
A 3-6 month maturity path for your eval practice
See how teams evolve from basic checks to a systematic, scalable measurement process.
Why this topic matters
Agent evals are the difference between AI features that ship with confidence and ones that quietly degrade user experience at scale. Teams that build this muscle early will move faster, catch failures before users do, and have the data to make the case for what to build next. This session will give you a practical system for measuring AI agent performance.
You'll learn from
Vinay Goel
Staff AI Engineer at Amplitude
Vinay is a Staff AI Engineer at Amplitude. He builds the foundational AI platforms that empower internal innovation and help define the future of AI analytics.
Ken Kutyn
Solutions Engineer at Amplitude
Ken is a Solutions Engineer at Amplitude with 10+ years in experimentation and AI evals.