Latency First: How to Actually Make RAG & Agents Fast

Hosted by Jason Liu and Aarush Sah

Wed, Jun 4, 2025

5:00 PM UTC (1 hour)

Virtual (Zoom)

Free to join

59 students

What you'll learn

Measure RAG & Agent Performance Effectively

Students will learn to distinguish between time to first token (TTFT), tokens per second (TPS), and step latency when benchmarking AI systems.
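As a rough illustration of how the first two metrics differ, here is a minimal Python sketch that times any iterable of streamed tokens. It is not taken from the session: the function name `measure_stream`, the injectable `clock` parameter, and the choice to compute TPS over the decode phase only (tokens after the first, divided by the time since the first token arrived) are all assumptions made for this example. Other conventions divide all tokens by total time.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and report TTFT, TPS, and total latency.

    Hypothetical helper: `tokens` can be any iterable that yields
    tokens as they arrive, such as a streaming model response.
    """
    start = clock()
    first_token_at: Optional[float] = None
    count = 0

    for _ in tokens:
        now = clock()
        if first_token_at is None:
            # The first token marks the end of the "waiting" phase.
            first_token_at = now
        count += 1

    end = clock()

    if first_token_at is None:
        # The stream produced nothing, so TTFT is undefined.
        return {"ttft_s": None, "tps": 0.0, "total_s": end - start, "tokens": 0}

    decode_time = end - first_token_at
    # Assumed convention: throughput is measured after the first token.
    tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0

    return {
        "ttft_s": first_token_at - start,
        "tps": tps,
        "total_s": end - start,
        "tokens": count,
    }


if __name__ == "__main__":
    def fake_stream():
        # Simulated model: slow to start, then steady output.
        time.sleep(0.2)
        for word in ["fast", "answers", "win"]:
            yield word
            time.sleep(0.05)

    print(measure_stream(fake_stream()))
```

Step latency can be approximated the same way, by reading the clock before and after each stage of a pipeline (retrieval, reranking, generation) instead of around each token.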

Identify Latency Bottlenecks in AI Pipelines

Students will learn to diagnose slowdown points in RAG workflows and multi-step agents.

Apply Practical Optimization Techniques

Students will master stack-agnostic strategies to reduce response times while maintaining high-quality AI outputs.

Why this topic matters

Latency is the silent killer of AI adoption. Users abandon systems that make them wait, regardless of accuracy. By mastering performance optimization, you'll deliver solutions people actually use, overcome the primary barrier to production AI success, and develop a professional edge that distinguishes you in a market fixated on capability rather than usability.

You'll learn from

Jason Liu

Consultant at the intersection of Information Retrieval and AI

Jason has built search and recommendation systems for the past 6 years. In the last year, he has consulted for and advised dozens of startups on improving their RAG systems. He is the creator of the Instructor Python library.

Aarush Sah

Head of Evals, Groq

Worked with

Groq
Stitch Fix
Meta
University of Waterloo
New York University
