Evaluating Agentic AI Applications Beyond Vibe Checks

Free Lesson

Part of The AI Evaluation Handbook

Hosted by Aishwarya Naresh Reganti, Kiriti Badam, and Claire Longo

1,258 students

In this video

What you'll learn

Why Agentic AI Evaluation?

Traditional evaluation methods fall short for agentic systems because agents act autonomously and behave non-deterministically.

Types of Evaluations for Agentic Systems

We’ll break down key evaluation types needed for robust, real-world agent behavior.

Live Demo: Evaluating with Comet Opik

Watch a hands-on demo using Comet Opik to run, monitor, and visualize agent evaluations in a production-like setup.

Why this topic matters

As agentic applications become more widespread, evaluating their effectiveness is more critical than ever. Join us to learn how to run meaningful evaluations—plus a live demo using Comet Opik, an open-source platform built for this purpose.

You'll learn from

Aishwarya Naresh Reganti

Tech Lead @ AWS | AI Lecturer & Advisor

Aishwarya Naresh Reganti is an Applied Science Tech Lead at the AWS Generative AI Innovation Center (GenAIIC), where she leads initiatives to develop and deploy production-ready generative AI solutions for AWS clients. With over 9 years of experience in machine learning, she has published more than 35 research papers at top-tier AI conferences, including NeurIPS, AAAI, and CVPR.

Aishwarya has taught professional courses on generative AI at renowned institutions like MIT and Oxford. She has also designed courses that have reached over 10,000 students globally and have formed the foundation for several academic programs and industry training curricula.

Recognized as one of the most prominent voices in enterprise AI, with over 90,000 professionals following her on LinkedIn, she is a sought-after thought leader frequently invited to speak at leading conferences and events, including TEDx, MLOps World, and ReWork.

Aishwarya actively collaborates with leading research professors and advises organizations on strategy, enabling them to harness AI effectively to address complex business challenges.

Kiriti Badam

Member of Technical Staff @ OpenAI | Ex-Google

Kiriti Badam is a member of the technical staff at OpenAI, with over a decade of experience designing high-impact enterprise AI systems. He specializes in AI-centric infrastructure, with deep expertise in large-scale compute, data engineering, and storage systems.

Prior to OpenAI, Kiriti was a founding engineer at Kumo.ai, a Forbes AI 50 startup, where he led the development of infrastructure that enabled training hundreds of models daily—driving significant ARR growth for enterprise clients.

Kiriti brings a rare blend of startup agility and enterprise-scale depth, having worked at companies like Google, Samsung, Databricks, and Kumo.ai. At Google Ads, he built globally distributed key-value stores that powered ad systems generating tens of billions in annual revenue.

He holds a Master’s degree from Carnegie Mellon University and a Bachelor’s from IIT Madras, where his research focused on advanced storage systems and distributed databases for AI workloads. A sought-after mentor and advisor, Kiriti helps startups and organizations design scalable AI infrastructure, reach product-market fit, and build long-term product strategy.

Claire Longo

Lead AI Researcher @ Comet

Claire Longo is an AI leader with over a decade of experience in Data Science, Machine Learning, and MLOps. From coding in R as a Statistician at DOE National Laboratories to building recommender systems at Trunk Club, she has navigated the evolving AI landscape—teaching herself Python and ML along the way. She has led cross-functional AI teams at Twilio, Opendoor, and Arize AI, focusing on the engineering best practices required to bring AI models from ideation to production at scale. Currently at Comet, Claire researches AI trends and shares best practices with the developer community. She holds a Bachelor’s in Applied Mathematics and a Master’s in Statistics from The University of New Mexico.

Go deeper with a course

Featured in Lenny’s List

Building Agentic AI Applications with a Problem-First Approach