Going Further: Late Interaction Beats Single Vector Limits

Hosted by Antoine Chaffin, Hamel Husain, and Shreya Shankar

4,804 students

In this video

What you'll learn

Understand the limitations of single vector models

Explore why traditional single vector approaches fall short on the challenges of modern search applications and evaluati

Discover multi-vector models to overcome these limitations

Learn how multi-vector architectures solve the fundamental problems of single vector systems and deliver better results.

Train and use cutting-edge multi-vector models with PyLate

Build expertise with the PyLate library through examples to train and evaluate your own state-of-the-art models.

Why this topic matters

Single vector search is the standard for RAG pipelines, but it struggles in real-world applications because of poor out-of-domain generalization and weak long-context handling. Multi-vector models overcome these limitations and show strong performance on modern retrieval tasks, including reasoning-intensive retrieval. PyLate makes switching easy with a sentence-transformers-like syntax.
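To make the contrast concrete, here is a minimal, dependency-free sketch of the two scoring schemes. It is not PyLate code and not the presenters' implementation: the 2-d "token embeddings" are made up for illustration, mean pooling stands in for whatever pooling a real single vector model uses, and the helper names (normalize, mean_pool, single_vector_score, maxsim_score) are hypothetical. The late interaction score follows the MaxSim idea used by ColBERT-style models: each query token is matched to its most similar document token, and those maxima are summed.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_pool(token_vecs):
    """Collapse per-token vectors into one vector by averaging each dimension."""
    n = len(token_vecs)
    return [sum(col) / n for col in zip(*token_vecs)]


def single_vector_score(query_tokens, doc_tokens):
    """Cosine similarity between the pooled query and the pooled document."""
    q = normalize(mean_pool(query_tokens))
    d = normalize(mean_pool(doc_tokens))
    return dot(q, d)


def maxsim_score(query_tokens, doc_tokens):
    """Late interaction: sum, over query tokens, of the best-matching doc token."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


# Toy data (assumed, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
# doc_a contains an exact match for each query token, plus an unrelated token.
doc_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
# doc_b has a single token that only resembles the *average* of the query.
doc_b = [[1.0, 1.0]]

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name,
          "single-vector:", round(single_vector_score(query, doc), 3),
          "maxsim:", round(maxsim_score(query, doc), 3))
```

In this toy setup, pooling lets the unrelated token in doc_a pull the pooled vector away from the query, so the single vector score prefers doc_b, while MaxSim still credits doc_a for matching every query token. Real models behave in far more nuanced ways; the sketch only illustrates the mechanism.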

You'll learn from

Antoine Chaffin

R&D Machine Learning Engineer at LightOn

Antoine is an R&D Machine Learning Engineer currently working at LightOn. During his thesis, he explored guiding generative models to create better synthetic data and train multimodal retrieval models to fight misinformation.

Since joining LightOn, he has focused on Information Retrieval, notably by co-leading the ModernBERT project and co-creating PyLate, a library to train and experiment with multi-vector retrieval, which led to state-of-the-art models such as GTE-ModernColBERT and Reason-ModernColBERT. Antoine also continues to work on multimodal projects, notably through the creation of OCR-free retrieval pipelines and visual document rerankers such as MonoQwen.

Hamel Husain

ML Engineer with 20 years of experience.

Hamel is a machine learning engineer with over 20 years of experience. He has worked with innovative companies such as Airbnb and GitHub, where his work included early LLM research on code understanding that was used by OpenAI. He has also led and contributed to numerous popular open-source machine-learning tools. Hamel is currently an independent consultant helping companies build AI products.

Shreya Shankar

ML Systems Researcher Making AI Evaluation Work in Practice

Shreya is an experienced ML Engineer who is currently a PhD candidate in computer science at UC Berkeley, where she builds systems that help people use AI to work with data effectively. Her research focuses on developing practical tools and frameworks for building reliable ML systems, with recent groundbreaking work on LLM evaluation and data quality. She has published influential papers on evaluating and aligning LLM systems, including "Who Validates the Validators?" which explores how to systematically align LLM evaluations with human preferences.

Prior to her PhD, Shreya worked as an ML engineer in industry and completed her BS and MS in computer science at Stanford. Her work appears in top data management and HCI venues including SIGMOD, VLDB, and UIST. She is currently supported by the NDSEG Fellowship and has collaborated extensively with major tech companies and startups to deploy her research in production environments. Her recent projects like DocETL and SPADE demonstrate her ability to bridge theoretical frameworks with practical implementations that help developers build more reliable AI systems.
