Scaling Late Interaction to Billions of Documents
Hosted by Marek Galovic, Hamel Husain, and Isaac Flath
Thu, Jul 9, 2026
6:00 PM UTC (45 minutes)
Virtual (Zoom)
Free to join
Go deeper with a course

What you'll learn
Why single-vector retrieval loses detail
What late interaction changes
Why late interaction is expensive
How we scaled it to billions of documents
How production constraints shape retrieval
Why this topic matters
You'll learn from
Marek Galovic
CEO, Co-Founder @TopK. ex-Pinecone, ex-Shopify
Marek is the CEO and co-founder of TopK, an AI-native search engine. Before founding TopK, Marek led data/control plane engineering teams at Pinecone and worked on fraud detection and financial forecasting at Shopify. He holds a degree in computer science and artificial intelligence from CTU Prague, where he researched game theory and adversarial machine learning algorithms applied to computer security (published at NeurIPS).
Hamel Husain
ML Engineer with 20+ years of experience
Hamel Husain is an ML Engineer with over 20 years of experience. He has worked with innovative companies such as Airbnb and GitHub, where his work included early LLM research for code understanding that was used by OpenAI. He has also led and contributed to numerous popular open-source machine-learning tools. Hamel is currently an independent consultant helping companies build AI products.
Isaac Flath
AI product engineer, 10 years of experience in AI.
I’m an AI and product engineer building systems that work with private knowledge and support real workflows. I’ve taught people how to use AI, from a Boot.dev RAG course to live courses on AI-assisted development. I’ve also helped teams improve AI products, tools, and workflows from AnkiHub (collaborative learning tools) and SpecStory (agentic software) to enterprise companies like Travel + Leisure and General Mills.
These days I focus on context-first AI systems. In practice, that means helping teams see and improve the parts of the system that decide what the system can use: retrieval, memory, tool use, evals, traces, harnesses, and the product interface around them. I help teams find where the process bottlenecks, whether the problem is search, agent behavior, workflow design, or the human interface, and then fix that layer.