Evals in Action With Arize
Hosted by Laurie Voss
In this video
What you'll learn
Build your first LLM-as-a-Judge evaluator
Write an eval that detects hallucinations in under 10 minutes using Arize Phoenix's built-in templates and your own custom criteria.
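Here's a minimal sketch of that first judge, using Phoenix's llm_classify helper and its built-in hallucination template. Exact argument names can shift between Phoenix releases, and the sample rows are invented for illustration:

import pandas as pd
from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

# Each row pairs a question, the retrieved context, and the model's answer;
# the column names match the variables the hallucination template expects.
df = pd.DataFrame({
    "input": ["When was npm, Inc. founded?"],
    "reference": ["npm, Inc. was founded in 2014."],
    "output": ["npm, Inc. was founded in 2009."],  # contradicts the reference
})

results = llm_classify(
    df,
    model=OpenAIModel(model="gpt-4o"),  # the judge model
    template=HALLUCINATION_PROMPT_TEMPLATE,
    rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),  # constrain the judge's labels
    provide_explanation=True,  # ask the judge to justify each label
)
print(results[["label", "explanation"]])

To apply your own custom criteria, swap the built-in template for a prompt of your own that uses the same output rails.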
Trace your AI system end-to-end
Add observability to any LLM application so you can see exactly what's happening at every step, from input to output.
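A sketch of what that instrumentation looks like with Phoenix's OpenTelemetry helper and the OpenInference instrumentor for OpenAI. It assumes pip install arize-phoenix-otel openinference-instrumentation-openai openai plus a Phoenix collector running locally; the project name is hypothetical:

from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor
from phoenix.otel import register

# Point OpenTelemetry at the local Phoenix collector.
tracer_provider = register(project_name="evals-demo")

# Auto-instrument every OpenAI call so prompts, completions, latency,
# and token counts show up as spans in the Phoenix UI.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

client = OpenAI()
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is an eval?"}],
)
# Open the Phoenix UI (http://localhost:6006 by default) to inspect the trace.

The same pattern extends to other stacks: OpenInference ships instrumentors for LangChain, LlamaIndex, and other frameworks, so you instrument once and every step of the application gets traced.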
Choose the right evaluator for each failure mode
Learn when to use code-based checks, LLM judges, or human annotations based on what you're trying to catch.
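As a taste of the first category, here is a sketch of two deterministic code-based checks; the functions and pass/fail rules are illustrative, not from the session. The rule of thumb: use code when pass/fail is crisp (format, length, required fields), an LLM judge for fuzzy criteria like faithfulness or tone, and human annotation for the ambiguous cases both miss.

import json

def is_valid_json(output: str) -> bool:
    # Deterministic structural check: no judge model needed.
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def cites_a_source(output: str) -> bool:
    # Deterministic required-substring rule (illustrative).
    return "http" in output or "[source]" in output.lower()

checks = {"valid_json": is_valid_json, "cites_source": cites_a_source}
output = '{"answer": "See https://docs.arize.com", "confidence": 0.9}'
print({name: check(output) for name, check in checks.items()})
# {'valid_json': True, 'cites_source': True}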
Why this topic matters
You've learned why evals matter and what to measure. Now you need to actually build them. Most teams get stuck here because the gap between "understanding evals" and "shipping evals" feels enormous. This hands-on session bridges that gap with live code, real tools, and templates you can steal. You'll leave with working evaluators, not just concepts.
You'll learn from
Laurie Voss
Head of DevRel at Arize; co-founder of npm, Inc.
Laurie Voss is Head of Developer Relations at Arize AI, where he helps teams build better AI applications through observability and evaluation. Previously he was VP of Developer Relations at LlamaIndex and Senior Data Analyst at Netlify, and he co-founded npm, Inc. (acquired by GitHub), where he served as COO and CTO. With 20+ years in developer tools and data analysis, Laurie brings a practical, code-first approach to AI evaluation.
Go deeper with a course
Featured in Lenny’s List
Building Agentic AI Applications with a Problem-First Approach
Aishwarya Naresh Reganti and Kiriti Badam
AI Founder | Lecturer | Advisor | Researcher · Applied AI Lead | AI Advisor | Ex-Google