
Stella Liu
AI Evals Researcher & Practitioner | Head of AI Applied Science
This session is designed for AI product leaders, data scientists, and engineers who are building or improving an AI evaluation program.
It may be especially helpful if you are:
Starting an AI eval practice or team
Trying to align product, engineering, data science, and leadership
Designing an evaluation strategy for an AI product
Selecting metrics, datasets, tools, or frameworks
Figuring out how to move from vibe checking to a structured evaluation process that guides AI development
Bring your most important AI eval questions. Depending on your needs, we can discuss:
AI eval strategy and roadmap
Team structure, roles, and resourcing
Stakeholder alignment and executive communication
Evaluation criteria, metrics, and rubrics
Test-set and dataset development
LLM-as-a-judge and human evaluation
RAG, agent, and multi-turn evaluation
Eval tooling and implementation architecture
Production monitoring and continuous evaluation
Diagnosing failures and prioritizing improvements
By the end of the session, you will have:
Clearer direction on your highest-priority eval challenges
Practical recommendations tailored to your product and organization
Actionable next steps for your team
Relevant frameworks, examples, or resources to support implementation
This is a working session, not a generic presentation. We will focus on your specific questions and context.
$500 USD
One hour of expert guidance on AI eval strategy, team setup, stakeholder alignment, and implementation.