From Text-RAG to Vision-RAG

Hosted by Jason Liu and Nils Reimers

Wed, Oct 22, 2025

5:00 PM UTC (1 hour)

Virtual (Zoom)

Free to join

What you'll learn

Build End-to-End Vision RAG Systems

Learn to architect complete multimodal RAG pipelines that process both text and visual content effectively.

Implement Vision Embeddings & LLMs

Master techniques to integrate cutting-edge vision models for understanding charts, graphs, and images.

Bridge Text-Visual Modality Gaps

Develop strategies to seamlessly combine textual and visual information retrieval for enterprise applications.

Why this topic matters

Much enterprise data is visual (charts, diagrams, infographics), but text-only RAG systems miss this information. Vision-RAG makes that content retrievable, extending what AI applications can answer from. Learning this emerging field prepares you for work with enterprises that need multimodal data solutions.

You'll learn from

Jason Liu

Consultant at the intersection of Information Retrieval and AI

Jason has built search and recommendation systems for the past 6 years. In the last year he has consulted for and advised a dozen startups on improving their RAG systems. He is the creator of the Instructor Python library.

Nils Reimers

VP AI Search, Cohere.com

Nils started training his first neural network 20 years ago and has been a pioneer in the field of deep learning. At Cohere, he is in charge of training foundational models for finding the right information for AI systems, creating several of the world's best AI retrieval models.

Worked with

Cohere
Stitch Fix
Meta
University of Waterloo
New York University
