Making LLM Agents Observable & Debuggable

Hosted by Hugo Bowne-Anderson and Claire Longo

1,509 students


What you'll learn

How to debug and monitor agent behavior in real time

LLM agents fail silently, hallucinate, and drift. Learn to catch these issues early with output checks, trace logs, and metrics.

Work with human annotations and LLM-as-a-judge

Use humans and LLMs to evaluate outputs with real-world workflows and practical examples you can apply immediately.

Use MCP to level up your vibe coding with telemetry

Give your IDE eyes and ears using Opik MCP to add telemetry and metrics, so you can spot and fix AI issues fast.

Start building today with open-source cookbooks

Get hands-on examples that work across LLMs and agent frameworks—apply these methods in your stack right away.

Why this topic matters

As LLM agents take on complex tasks—long chats, memory, multi-step tools—traditional model evals fall short. Failures go undetected, costing time, trust, and money. Opik is an open-source platform that brings observability to agents: test behavior, trace actions, and improve performance continuously. Learn how to debug smarter and ship more reliable AI systems.

You'll learn from

Hugo Bowne-Anderson

Podcaster, Educator, DS & ML expert

Hugo Bowne-Anderson is an independent data and AI consultant with extensive experience in the tech industry. He is the host of the industry podcast Vanishing Gradients, where he explores cutting-edge developments in data science and artificial intelligence. As a data scientist, educator, evangelist, content marketer, and strategist, Hugo has worked with leading companies in the field. His past roles include Head of Developer Relations at Outerbounds, a company committed to building infrastructure for machine learning applications, and positions at Coiled and DataCamp, where he focused on scaling data science and online education respectively. Hugo's teaching experience ranges from institutions like Yale University and Cold Spring Harbor Laboratory to conferences such as SciPy, PyCon, and ODSC. He has also worked with organizations like Data Carpentry to promote data literacy. His impact on data science education is significant: he has developed over 30 courses on the DataCamp platform that have reached more than 3 million learners worldwide. Hugo also created and hosted the popular weekly data industry podcast DataFramed for two years. Committed to democratizing data skills and access to data science tools, Hugo advocates for open source software for both individuals and enterprises.

Claire Longo

AI Researcher at Comet | Mathematician | Startup Advisor | ex-Arize AI 📈

Claire is an AI leader, an MLOps and data science practitioner, and an advocate and mentor for women in the AI industry. She started her career as a statistician and worked her way up to a data scientist role at Trunk Club, where she specialized in building personalization algorithms and outfit recommenders. After personally feeling the challenges of bringing models to production, she focused her career on MLOps and LLMOps best practices. She moved to Twilio and then Opendoor, where she built and led engineering and platform teams focused on researching and deploying models at scale. She holds a master’s in statistics and a bachelor’s in applied mathematics.