Build Custom Annotation UIs for AI Evals

Hosted by Shane Butler

Wed, Feb 11, 2026

8:00 PM UTC (30 minutes)

Virtual (Zoom)

Free to join


Go deeper with a course

AI Evals for Product Development
Shane Butler
View syllabus

What you'll learn

Identify where annotation fits in the AI evaluation lifecycle

Know when you need annotations for ground truth, production review, and continuous discovery.

Design annotation tasks that match the evaluator’s workload

Choose the right interface patterns so reviewers can judge quality without fighting the UI.

Define a lightweight “custom UI” blueprint that scales

Specify what the UI must show, what data it reads and writes, and how to standardize it so the work is repeatable (see the sketch below).
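
To make the blueprint idea concrete, here is a minimal, hypothetical sketch of the kind of data contract a custom annotation UI might read and write. It is not material from the session; every type and field name (AnnotationTask, AnnotationResult, AnnotationStore, and their properties) is an illustrative assumption, not a prescribed schema.

```ts
// Hypothetical data contract for a custom annotation UI (illustrative only).

// What the UI shows the reviewer: the model input/output plus just enough context.
interface AnnotationTask {
  taskId: string;
  input: string;          // e.g. the user prompt or document excerpt
  modelOutput: string;    // the AI response under review
  context?: string;       // minimal surrounding context the reviewer needs
}

// What the UI writes back: a standardized label so results compare across reviewers.
interface AnnotationResult {
  taskId: string;
  reviewerId: string;
  verdict: "pass" | "fail";
  failureModes: string[]; // tags drawn from a shared taxonomy
  notes?: string;         // free-text rationale
  reviewedAt: string;     // ISO 8601 timestamp
}

// A thin read/write boundary the UI depends on; storage details are left open.
interface AnnotationStore {
  nextTask(queue: string): Promise<AnnotationTask | null>;
  saveResult(result: AnnotationResult): Promise<void>;
}
```

Keeping the contract this small is the point: once the task, the label, and the read/write boundary are fixed, the same UI pattern can be reused across products without redesigning the review process each time.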

Why this topic matters

Human annotation is a core dependency for AI evaluation, but generic tools force reviewers into spreadsheets and trace views that destroy context. That raises cognitive load, slows reviews, and degrades label quality. Custom, task-specific annotation UIs make evaluation faster, more reliable, and scalable across products.

You'll learn from

Shane Butler

Principal Data Scientist, AI Evaluations at Ontra

Shane Butler is a Principal Data Scientist at Ontra, where he leads evaluation strategy for AI product development in the legal tech domain. He has more than ten years of experience in product data science and causal inference, with prior roles at Stripe, Nextdoor, and PwC. His current work focuses on practical, end-to-end methods for evaluating AI features in production. Shane is also the co-host of the AI podcast Data Neighbor, where he interviews product, data, and engineering leaders who are pioneering the next generation of data science and analytics in an AI-driven landscape.

Previously at Stripe, Nextdoor, PwC

Sign up to join this lesson

By continuing, you agree to Maven's Terms and Privacy Policy.