Evaluating AI Agents before Users Break Them

Hosted by Aki Wijesundara, PhD, Marc Klingen, and Lotte Verheyden

What you'll learn

How to tell if an AI agent is actually helping users

Learn what “working” actually means from a product perspective.

What signals show problems early

Spot early signs of confusion, inconsistency, or user trust issues.

What to check before shipping or expanding an agent

Use a simple checklist to decide ship, fix, or stop.

How to talk about agent performance with your team

Ask the right questions without needing deep technical detail.

How to iterate toward production using Langfuse insights

Turn observed behavior into concrete improvements and safer deployments.

Why this topic matters

AI agents rarely fail loudly. More often they degrade slowly, behave inconsistently, or quietly erode user trust. Many product teams lack a clear way to evaluate agent behavior before users are affected. This session covers practical evaluation frameworks product managers can use to understand agent behavior, spot risks early, and make confident product decisions without relying solely on engineering intuition.
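To make the "ship, fix, or stop" checklist concrete, here is a minimal sketch of how a team might turn evaluation results (e.g. pass/fail scores exported from an observability tool such as Langfuse) into a release decision. The field names, thresholds, and `decide` function are illustrative assumptions for this sketch, not part of any Langfuse API.

```python
# Hypothetical sketch of a "ship / fix / stop" gate over agent eval results.
# Thresholds and record fields are illustrative assumptions, not a real API.

def decide(results: list[dict], ship_at: float = 0.9, stop_below: float = 0.6) -> str:
    """Each result is {"task": str, "passed": bool}. Returns "ship", "fix", or "stop"."""
    if not results:
        return "stop"  # no evidence yet: do not ship
    pass_rate = sum(r["passed"] for r in results) / len(results)
    if pass_rate >= ship_at:
        return "ship"   # high pass rate: safe to expand
    if pass_rate < stop_below:
        return "stop"   # failing badly: pause and rethink
    return "fix"        # in between: iterate before shipping

runs = [
    {"task": "refund request", "passed": True},
    {"task": "order lookup", "passed": True},
    {"task": "policy question", "passed": False},
]
print(decide(runs))  # 2/3 ≈ 0.67 → "fix"
```

The point is not the specific thresholds but that the decision rule is explicit and shared, so product and engineering argue about the numbers rather than about gut feel.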

You'll learn from

Aki Wijesundara, PhD

AI Founder | Educator | Google AI Accelerator Alum

Aki Wijesundara is an AI leader with a PhD in Machine Learning and extensive experience mentoring startups at Google’s AI Accelerator. With a career spanning both research and applied AI, Aki has taught 5,000+ students worldwide how to design and deploy production-ready AI systems.

He has worked across cutting-edge areas of applied AI, from LangChain and RAG pipelines to observability and large-scale deployment. As a researcher and educator, Aki bridges the gap between theory and practice, making complex systems approachable and actionable for engineers, founders, and product leaders.

Aki is also a frequent speaker and advisor to organizations adopting AI, helping them transition from experimentation to production at scale.

Career highlights

  • Ex–Google AI Accelerator researcher focused on responsible AI and applied ML.
  • PhD in AI & Cognitive Systems with published research across top universities.
  • Former researcher with teams affiliated with MIT, University of Oxford, & King’s College London.
  • Co-founder of Snapdrum — delivered AI systems for finance, education, and healthcare.
  • Built and deployed AI product pipelines used by PMs, startups, and enterprise teams.
  • Instructor for multiple AI builder programs, helping 500+ professionals ship AI features fast.


Marc Klingen

Co-Founder & CEO of Langfuse

Marc Klingen is a technology entrepreneur, engineer, and the Co-Founder & CEO of Langfuse, an open-source LLM engineering platform that helps developers build, debug, monitor, and improve large language model (LLM) applications with observability, prompt management, evaluations, and analytics at scale. Langfuse was part of Y Combinator’s Winter 2023 batch and has seen rapid adoption among developers and enterprise teams.

At Langfuse, Marc leads product vision, engineering, and go-to-market strategy, bringing together full-stack development, product leadership, and business intelligence experience from both large organizations and early-stage environments.

Under his leadership, Langfuse has grown into one of the most widely adopted LLM observability platforms globally — used by thousands of developers and trusted by major companies. In January 2026, Langfuse was acquired by ClickHouse, expanding its reach and integration into broader data infrastructure stacks for AI engineering tools.

Marc holds a Master’s degree in Management and Computer Science from the Technical University of Munich, graduating within the top 1% of his class. His background spans product development, engineering, and AI tooling — equipping teams to build resilient, production-ready AI systems. 

Career highlights

  • Co-Founder & CEO of Langfuse — built a leading open-source LLM engineering platform used globally.
  • Led Langfuse through Y Combinator’s Winter 2023 batch and a successful acquisition by ClickHouse in 2026.
  • Experienced in full-stack engineering, product strategy, and developer tooling across startup and established tech environments.
  • Master’s in Management and Computer Science, Technical University of Munich — graduated in the top 1% of his cohort. 

Lotte Verheyden

Developer Relations at Langfuse

Lotte Verheyden is a Developer Relations specialist at Langfuse, the open-source LLM engineering platform that helps teams build, monitor, evaluate, and improve production-grade large language model (LLM) applications. Langfuse provides observability, prompt management, evaluations, and analytics tooling that accelerates real-world LLM workflows.

At Langfuse, Lotte’s work focuses on bridging the gap between the product and the developer community — helping users understand, adopt, and make the most of Langfuse’s capabilities through documentation, community engagement, and developer outreach. Her role includes supporting educational resources and fostering active participation in the open-source ecosystem around the platform.

Lotte is passionate about empowering developers and technical audiences to build robust, scalable AI systems using modern best practices in LLM engineering. She frequently contributes to Langfuse’s documentation and community touchpoints, helping teams adopt observability, prompt management, and evaluation workflows that improve iterative development.

Career highlights

  • Developer Relations at Langfuse, working to grow community engagement and product adoption for an open-source LLM engineering platform.
  • Supports documentation, developer workflows, and community events to drive deeper technical understanding and usability of Langfuse tooling.
  • Advocates for open-source collaboration and developer-centric product experiences in the evolving AI engineering ecosystem. 

Go deeper with a course

AI Builder Bootcamp for Product People
Aki Wijesundara, PhD and Manu Jayawardana