Systematically Improving RAG Applications

4.7 (29 ratings) · 6 Weeks · Cohort-based Course

Follow a repeatable process to continually evaluate and improve your RAG application

Instructor Clients

Stitch Fix
Meta
Google

Course overview

Acquire the skills to confidently improve and iterate on RAG applications

Are you struggling to scale your RAG application beyond the prototype stage?


Feeling overwhelmed by competing priorities and limited resources?


In just 6 weeks, you'll learn to:


* Optimize search quality and latency

* Design robust feedback loops for continuous improvement

* Implement data-driven strategies for maximum impact


Why now?


- RAG is becoming essential for competitive AI integration

- Focus is shifting from basic implementation to performance optimization

- Core principles of effective RAG systems are crystallizing

- Early adopters gain significant market advantages

- Instructor's real-world experience provides immediately applicable insights


Won't a lot change by February?


- Course focuses on enduring principles, not just current tools

- You'll learn to evaluate and integrate new technologies rapidly

- Strategies taught focus on ongoing system optimization

- By February, you'll be positioned to leverage new developments immediately

- Course content will be updated to reflect any significant changes

- Skills developed are foundational and will remain relevant


Why does my team need this?


- Align your team on RAG best practices

- Save months of trial and error

- Build scalable systems that prevent future rewrites

- Foster a data-driven culture of continuous improvement

- Bridge gaps between technical and business teams

- Gain competitive edge in AI implementation

- Develop skills to justify AI investments to leadership

- Learn from real-world case studies across industries

- Connect with professionals facing similar challenges

- Acquire skills applicable to all AI initiatives, not just RAG


About the Instructor


Jason Liu is a machine learning engineer and data scientist with 8 years of experience building recommendation systems and multimodal semantic search products at Stitch Fix. He currently runs his own consulting studio, where he has worked with many companies, training teams to build RAG solutions across private equity, financial services, construction, web crawling, sales, marketing, and personal agents, with a focus on:


* Designing self-improving AI systems with robust feedback loops, creating valuable data flywheels.

* Developing and implementing Vision/Text based search systems that integrate seamlessly into existing product ecosystems.

* Crafting evaluation frameworks and fine-tuning algorithms for search and recommendation systems.

* Making strategic AI research bets and evaluating vendors to drive innovation and scalable growth.



Week 0: Fundamental Biases in AI Engineering

Recognizing and Mitigating Blind Spots in Development


* Understand common biases in AI system development, such as intervention bias and absence blindness

* Learn techniques to identify and mitigate these biases in RAG systems

* Develop strategies for comprehensive system evaluation and blind spot detection

* Explore case studies of bias-related failures in AI systems and their remedies


Week 1: Kickstarting the Data Flywheel

Leveraging Synthetic Data for Evals and Data Augmentation


* Implement the RAG System Inference Flywheel concept

* Create robust evaluation pipelines using synthetic data

* Develop scalable datasets for continuous system improvement

* Master techniques for fast, iterative improvement cycles using precision and recall metrics

* Distinguish between leading and lagging metrics to set actionable goals
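As a rough illustration of the "precision and recall metrics" mentioned above, here is a minimal sketch of a retrieval eval over a synthetic question set. It assumes each synthetic question was generated from a known chunk, so that chunk counts as the relevant document. The names `labels`, `run`, and `evaluate` are hypothetical and not taken from the course materials.

```python
from typing import Dict, List, Set, Tuple


def precision_recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Return (precision@k, recall@k) for a single query."""
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(run: Dict[str, List[str]], labels: Dict[str, Set[str]], k: int) -> Dict[str, float]:
    """Average precision@k and recall@k over every labeled query."""
    scores = [precision_recall_at_k(run.get(q, []), rel, k) for q, rel in labels.items()]
    if not scores:
        return {"precision@k": 0.0, "recall@k": 0.0}
    n = len(scores)
    return {
        "precision@k": sum(p for p, _ in scores) / n,
        "recall@k": sum(r for _, r in scores) / n,
    }


# Hypothetical synthetic eval set: query ID -> IDs of the chunks it was generated from.
labels = {"q1": {"doc_a"}, "q2": {"doc_b", "doc_c"}}
# Hypothetical retriever output: query ID -> ranked chunk IDs.
run = {"q1": ["doc_a", "doc_x"], "q2": ["doc_x", "doc_b"]}

metrics = evaluate(run, labels, k=2)
print(metrics)
```

Because the check is just set membership over IDs, it runs in milliseconds and can be repeated after every retrieval change, much like a unit test.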


Week 2: Fine-tuning Search and Hard Negative Mining

Optimizing Representations for Precision and Recall


* Understand the principles of representation learning for search

* Implement techniques for hard negative mining to improve search quality

* Develop strategies for fine-tuning embedding models and re-rankers

* Learn to balance precision and recall in search optimization

* Explore advanced techniques like contrastive learning for search improvement
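To make "hard negative mining" concrete, the sketch below shows one common form of the idea: for a query, pick the documents the current embedding model scores highest that are not labeled relevant. The toy two-dimensional vectors, document IDs, and the `mine_hard_negatives` helper are illustrative assumptions, not the course's implementation.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mine_hard_negatives(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    positives: Set[str],
    n: int,
) -> List[str]:
    """Return the n non-relevant documents most similar to the query."""
    candidates = [
        (cosine(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
        if doc_id not in positives  # skip documents already labeled relevant
    ]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in candidates[:n]]


# Toy embeddings (hypothetical); real vectors would come from an embedding model.
doc_vecs = {
    "refund_policy": [1.0, 0.0],
    "return_shipping": [0.9, 0.1],
    "careers_page": [0.0, 1.0],
}
query_vec = [1.0, 0.05]

hard_negatives = mine_hard_negatives(query_vec, doc_vecs, positives={"refund_policy"}, n=1)
print(hard_negatives)
```

The resulting (query, positive, hard negative) triplets are the kind of training data typically used when fine-tuning an embedding model or re-ranker with a contrastive objective.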


Week 3: Decomposing Query Types and Identifying Bottlenecks

Prioritizing Investments for Maximum Impact


* Implement classification systems for query segmentation

* Conduct thorough bottleneck analysis in RAG systems

* Apply data-driven approaches to detect concept drift and adapt systems dynamically

* Understand inventory vs capability segments and develop strategies for unsupervised topic discovery

* Create comprehensive dashboards for visualizing query patterns and system performance
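One way to picture query segmentation and bottleneck analysis is to tag each query with a segment, then compare each segment's share of traffic with its retrieval success rate. The sketch below uses hypothetical keyword rules in `classify`; a production system might use an LLM or a trained classifier instead. The segment names and sample queries are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def classify(query: str) -> str:
    """Assign a query to a segment using simple, hypothetical keyword rules."""
    text = query.lower()
    if any(word in text for word in ("compare", "versus", " vs ")):
        return "comparison"
    if any(word in text for word in ("when", "date", "deadline")):
        return "time_sensitive"
    return "general"


def segment_report(results: List[Tuple[str, bool]]) -> Dict[str, Dict[str, float]]:
    """Summarize (query, retrieval_succeeded) pairs by segment."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [query count, success count]
    for query, ok in results:
        bucket = totals[classify(query)]
        bucket[0] += 1
        bucket[1] += int(ok)
    n = len(results)
    return {
        seg: {"share": count / n, "success_rate": wins / count}
        for seg, (count, wins) in totals.items()
    }


# Hypothetical eval results: did retrieval surface a relevant chunk for each query?
results = [
    ("When is the filing deadline?", False),
    ("Compare plan A versus plan B", True),
    ("What date does the contract end?", False),
    ("How do I reset my password?", True),
]

report = segment_report(results)
worst = min(report, key=lambda seg: report[seg]["success_rate"])
print(report)
print("lowest success rate:", worst)
```

A segment with a large share and a low success rate is a natural candidate for the next investment, which is the prioritization question this week is about.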


Week 4: Navigating Multimodal RAG

Tailoring Approaches for Documents, Images, and Tables


* Design and implement search strategies for diverse content types (images, documents, tables, text-to-SQL)

* Develop metrics to measure and improve search quality across different modalities

* Learn techniques for extracting structured data from various content types

* Implement specialized indexing and retrieval methods for each content type

* Balance trade-offs between generalized and specialized approaches in multimodal RAG


Week 5: Efficient Routers and Index Fusion

Designing Scalable Systems for Complex Queries


* Build intelligent routers for multi-index RAG systems

* Implement sophisticated query understanding techniques

* Develop strategies for efficient index fusion and result aggregation

* Evaluate and balance trade-offs between search architectures for latency, cost, and accuracy
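For index fusion, one widely used baseline is reciprocal rank fusion (RRF), which merges ranked lists from several indices using only each document's rank. The sketch below is a generic RRF implementation, not necessarily the approach taught in the course. The index names and document IDs are hypothetical, and `k=60` is the constant commonly used with RRF.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores the sum of 1 / (k + rank) across lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two specialized indices for the same query.
text_index = ["doc_a", "doc_b", "doc_c"]
table_index = ["doc_b", "doc_d"]

fused = reciprocal_rank_fusion([text_index, table_index])
print(fused)
```

Because RRF ignores raw scores, it avoids having to calibrate similarity values across indices that use different models or scoring functions. The trade-off is that it discards any information about how confident each index was.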


Week 6: Enhancing User Experience and Feedback Loops

Strategies for Latency Perception and Continuous Improvement


* Design RAG products that effectively collect user feedback

* Implement streaming strategies to improve perceived latency

* Create intuitive UI components for citations and user interaction

* Develop strategies for handling negative examples and continuously improving performance

* Implement validators and monologue techniques to enhance response quality
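As one possible reading of "validators" in the context of citations, the sketch below checks that every citation marker in a generated answer points to a chunk that was actually retrieved. The `[n]` citation format, the `chunks` mapping, and the `find_invalid_citations` helper are assumptions made for illustration.

```python
import re
from typing import Dict, List


def find_invalid_citations(answer: str, chunks: Dict[int, str]) -> List[int]:
    """Return cited chunk numbers that are missing from the retrieved chunks."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return sorted(n for n in cited if n not in chunks)


# Hypothetical retrieved chunks, keyed by the number shown to the model.
chunks = {
    1: "Refunds are issued within 14 days.",
    2: "Shipping is free over $50.",
}
# Hypothetical model answer that cites one real chunk and one that does not exist.
answer = "Refunds take up to 14 days [1]. Returns are free [3]."

invalid = find_invalid_citations(answer, chunks)
print(invalid)
```

A check like this can run before the answer is shown to the user. A non-empty result could trigger a retry or strip the unsupported sentence, depending on the product.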


This comprehensive playbook will enable you to deliver consultant-level value, leading your team to results through structured experimentation.


💡 COURSE PREREQUISITES

You should NOT take this course if:

* You work on non-software products (e.g. hardware, pharmaceuticals, deep climate tech, defense tech, etc.)

* You have not tried to build a RAG application before; this course is about improving systems as they move from prototype to production


Don't do it alone - be part of a small cohort of other teams shipping real applications.


You pay only after your application is approved.


Get these free bonuses (over $1500 in value):

• $500 Cohere credits (Jason uses Cohere rerankers in every single RAG product he has built or advised)

• $200 LanceDB credits and free access to Lance Cloud

• $500 in Modal Labs credits (useful for experimenting with embedding fine-tuning)

• 6 months free Notion AI Plus (get experience with more RAG products)

• 3 months Braintrust access ($250 value)


🚀 Limited Spots Available 🚀


Our small-group cohorts fill up fast. Don't wait to level up your RAG skills.


Remember, you only pay after your application is approved. We're so confident in the value of this course that we offer a full refund if you don't see meaningful improvements in your processes within 5 weeks.


This Course Is For You If You Are

01

An Engineering or Product leader looking to improve an existing RAG system MVP

02

Solving problems like poor retrieval, unreliable outputs or unhappy customers with your existing application

03

Ready to lead your team in building a data flywheel so you can leverage feedback

By the end of this course, participants will be able to

Implement a systematic approach to developing and improving RAG applications using the Data and Evals Flywheel methodology.


Design and execute fast, unit test-like evaluations to assess retrieval capabilities, including precision and recall metrics.


Generate and utilize synthetic data for rapid evaluation and iteration of RAG systems.


Apply fine-tuning strategies for embedding models and implement hard negative mining techniques to enhance search relevance.


Classify different types of queries and conduct bottleneck analysis to identify performance limitations in RAG systems.

Differentiate between limited inventory and limited capabilities issues, and develop strategies to address both.


Design and implement specialized indices for various data types, including documents, images, tables, and SQL databases.

Apply synthetic text chunk generation and summarization techniques to improve retrieval performance across different modalities.

Develop efficient query routing systems and implement effective index fusion strategies for complex RAG setups.

Evaluate the performance of both query routing and individual indices separately to optimize overall system performance.

Design and integrate both explicit and implicit feedback mechanisms to drive continuous system improvement.

This course includes

16 interactive live sessions

Lifetime access to course materials

In-depth lessons

Direct access to instructor

Projects to apply learnings

Guided feedback & reflection

Private community of peers

Course certificate upon completion

Maven Satisfaction Guarantee

This course is backed by Maven’s guarantee. You can receive a full refund within 14 days after the course ends, provided you meet the completion criteria in our refund policy.

Course syllabus

Week 1

Feb 4—Feb 9

    Intro to the Playbook + RAG Evaluation
    Tue, Feb 4, 6:00 PM - 7:00 PM (UTC)

    Breakout + Office Hours (Optional)
    Wed, Feb 5, 6:00 PM - 7:00 PM (UTC)

    Guest Speaker Session
    Thu, Feb 6, 6:00 PM - 7:00 PM (UTC)

Week 2

Feb 10—Feb 16

    Identifying Areas of Improvement
    Tue, Feb 11, 6:00 PM - 7:00 PM (UTC)

    Breakout + Office Hours (Optional)
    Wed, Feb 12, 6:00 PM - 7:00 PM (UTC)

    Guest Speaker Session
    Thu, Feb 13, 6:00 PM - 7:00 PM (UTC)

Week 3

Feb 17—Feb 23

    IR Keys + Non-Text Data
    Tue, Feb 18, 6:00 PM - 7:00 PM (UTC)

    Breakout + Office Hours (Optional)
    Wed, Feb 19, 6:00 PM - 7:00 PM (UTC)

    Guest Speaker Session
    Thu, Feb 20, 6:00 PM - 7:00 PM (UTC)

Week 4

Feb 24—Mar 2

    Routing Queries
    Tue, Feb 25, 6:00 PM - 7:00 PM (UTC)

    Breakout + Office Hours (Optional)
    Wed, Feb 26, 6:00 PM - 7:00 PM (UTC)

    Guest Speaker Session
    Thu, Feb 27, 6:00 PM - 7:00 PM (UTC)

Week 5

Mar 3—Mar 9

    Representations
    Tue, Mar 4, 6:00 PM - 7:00 PM (UTC)

    Breakout + Office Hours (Optional)
    Wed, Mar 5, 6:00 PM - 7:00 PM (UTC)

    Guest Speaker Session
    Thu, Mar 6, 6:00 PM - 7:00 PM (UTC)

Week 6

Mar 10—Mar 13

    Breakout + Office Hours (Optional)
    Wed, Mar 12, 5:00 PM - 6:00 PM (UTC)

Post-course

    Product Design (5 items)
    Rejecting work (3 items)
    Intro To The Playbook (2 items)
    RAG Evaluation (4 items)
    Synthetic Data (1 item)
    Identifying Areas of Improvement (4 items)
    Production Monitoring and Analysis (0 items)
    Improving Retrieval (4 items)
    Tables and Non-Text Data (2 items)
    Routing Queries (4 items)
    Representations (0 items)
    Synthetic Text Chunks (3 items)


Meet your instructor

Jason Liu

Jason has built search and recommendation systems for the past 6 years. He has consulted for and advised dozens of startups in the last year to improve their RAG systems. He is the creator of the Instructor Python library.

Dan Becker


Dan has worked in AI since 2011, when he finished 2nd (out of 1350+ teams) in a Kaggle competition with a $500k prize. He contributed code to TensorFlow as a data scientist at Google and he has taught online deep learning courses to over 250k people. Dan has advised AI projects for 6 companies in the Fortune 100.


Join an upcoming cohort

Systematically Improving RAG Applications

Cohort 2

$1,650

Dates

Feb 4—Mar 13, 2025

Application Deadline

Feb 1, 2025
Get reimbursed

Bulk purchases

Course Schedule Each Week

  • Tuesday: Workshops

    1:00 - 2:00PM ET

    Workshops covering each step of the playbook and helping you build process improvements in your RAG application

  • Wednesday: Office Hours + Breakout Sessions

    1:00 - 2:00PM ET

    The first half hour is interactive breakout sessions, and the closing half hour is Q&A

  • Thursday: Guest Speakers

    1:00 - 2:00PM ET

    Guest instructors covering key topics in both innovative theory and practical applications in RAG system development.

Learning is better with cohorts


Active hands-on learning

This course builds on live workshops and hands-on projects

Interactive and project-based

You’ll be interacting with other learners through breakout rooms and project teams

Learn with a cohort of peers

Join a community of like-minded people who want to learn and grow alongside you

Frequently Asked Questions

What happens if I can’t make a live session?

I work full-time, what is the expected time commitment?

What’s the refund policy?

Is this course suitable for experienced machine learning researchers or statisticians?

Can I get reimbursed by my company?

