9 Days · Cohort-based Course
End-to-end: orchestrate and deploy agentic Retrieval-Augmented Generation with LangGraph, FastAPI, and a React frontend in 2 weeks.
Course overview
The Real-World AI Engineering Roadblocks You Face Today
👋 Prototype → Production Gap — Moving from a notebook demo to a secure, observable, multi-tenant service requires orchestration, evals, guardrails, and ops most teams lack.
👋 “Easy RAG” vs “Reliable RAG” — Anyone can retrieve-then-generate; making answers faithful, fresh, fast, and cost-controlled under real traffic is the hard part.
👋 Framework Overload — The ecosystem is noisy; you need clear criteria (maturity, extensibility, latency, cost) and reference patterns to choose confidently.
👋 It’s Software Engineering First — Success hinges on clean interfaces, tests, typed configs, tracing, CI/CD, and change management—not just prompts and models.
👋 From Laptop to 1M Users — Scaling demands streaming, batching, caching, autoscaling, and SLOs, or your p95 latency explodes and costs spiral.
How this course will help you
✅ Ship a real Agentic RAG app, not a demo — Stand up an end-to-end stack (LangGraph → FastAPI → React) that runs locally today and deploys via a clean, fork-and-ship monorepo.
✅ Make retrieval dependable, not lucky — Adopt schema-aware chunking, strong dense embeddings with sensible metadata filters, and context packing with citations so answers stay faithful, fresh, and concise.
✅ Harden agentic workflows — Design a typed LangGraph state and build nodes for rewrite → retrieve → rerank → synthesize → cite → safety-check, with retries and timeouts so plans don’t loop or stall.
✅ Scale the experience, not the headaches — Enable server-streaming in FastAPI, cap top-k, trim context budgets, and add early-exit rules; deploy with autoscaling so you can serve real traffic without infra fuss.
✅ See enough to fix things fast — Bake in structured logs (no vendor tracing), per-step timing counters, and UI breadcrumbs/citations to follow query → context → answer and spot common failure patterns quickly.
✅ Choose frameworks with confidence — Follow an opinionated reference architecture plus a simple choice rubric (maturity, extensibility, latency, cost, swap effort) so you know when to stick—and how to swap components without rewrites.
✅ Write maintainable RAG code — Use clean module boundaries (ingest / retrieve / rerank / synthesize), typed configs (Pydantic Settings), and sensible secrets/env management so your team can extend it safely.
You’ll walk away with
✨ A running Agentic RAG app (LangGraph + FastAPI + React) in a fork-and-ship monorepo.
✨ An ingestion/indexing pipeline with metadata, hybrid retrieval, and optional re-ranking.
✨ A chat UI with citations, source previews, and conversation memory that behaves.
✨ Deploy scripts and env templates to go live right after class.
✨ A framework choice memo + adapters to swap models/vector stores without starting over.
Bottom line: this isn’t a vitamin, it’s a blueprint you can put in production.
01
ML/LLM engineers ready to ship Agentic RAG: LangGraph + FastAPI + React, GCP deploy, citations, history-safe chat.
02
Full-stack/platform engineers adding RAG fast: opinionated stack, streaming FastAPI, React UI with citations, GCP deploy.
03
Founder/solutions architect proving value: grounded chat over your repos/docs, agentic flows, deploy to production.
Python (basics): The backend and LangGraph nodes are Python; basics help you focus on RAG patterns instead of syntax or environment setup.
JavaScript (basics): Used for the minimal web UI and examples; JS fluency helps you send requests, handle streaming responses, and add simple interactions.
Orchestrate complex RAG pipelines with LangGraph and OpenAI API
Build a typed LangGraph that routes rewrite → retrieve → rerank → synthesize → cite → self-check with retries, timeouts, early-exit rules, and real tool calls—exposed as a clean HTTP API.
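To make the node flow concrete, here is a minimal sketch of a typed LangGraph state wired into a shortened rewrite → retrieve → synthesize pipeline. The node bodies are hypothetical stubs (not the course's implementation), and the rerank, cite, and self-check steps are omitted for brevity.

```python
# Minimal sketch: typed state + linear LangGraph pipeline. Node bodies are stubs.
from typing import List, TypedDict

from langgraph.graph import StateGraph, END


class RAGState(TypedDict):
    question: str   # original user query
    rewritten: str  # query after rewriting
    docs: List[str] # retrieved context chunks
    answer: str     # final, cited answer


def rewrite(state: RAGState) -> dict:
    # Placeholder: a real node would call an LLM to rewrite the query.
    return {"rewritten": state["question"].strip()}


def retrieve(state: RAGState) -> dict:
    # Placeholder: a real node would query a vector store with metadata filters.
    return {"docs": [f"stub chunk for: {state['rewritten']}"]}


def synthesize(state: RAGState) -> dict:
    # Placeholder: a real node would pack context, call an LLM, and enforce citations.
    return {"answer": f"Answer based on {len(state['docs'])} chunk(s) [1]"}


graph = StateGraph(RAGState)
graph.add_node("rewrite", rewrite)
graph.add_node("retrieve", retrieve)
graph.add_node("synthesize", synthesize)
graph.set_entry_point("rewrite")
graph.add_edge("rewrite", "retrieve")
graph.add_edge("retrieve", "synthesize")
graph.add_edge("synthesize", END)

app = graph.compile()
result = app.invoke({"question": "How do I deploy the API?", "rewritten": "", "docs": [], "answer": ""})
print(result["answer"])
```

Each node returns a partial state update that LangGraph merges into the typed state, which is what makes retries, timeouts, and conditional early exits easy to bolt on per node.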
Build scalable asynchronous applications with FastAPI
Ship async FastAPI endpoints with server-streaming, well-typed request/response models, input validation, and sensible timeouts—ready to run locally and deploy to production.
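For a flavor of what that looks like, here is a small illustrative sketch (not the course repo) of an async FastAPI endpoint with a typed, validated request model and a server-streamed response; the fake token generator stands in for the real LangGraph call.

```python
# Minimal sketch: async FastAPI endpoint with a typed request model and streaming response.
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel, Field

app = FastAPI()


class ChatRequest(BaseModel):
    question: str = Field(..., min_length=1, max_length=2000)
    top_k: int = Field(5, ge=1, le=20)  # cap top-k at the API boundary


async def fake_token_stream(question: str):
    # Placeholder generator: a real app would stream LLM/LangGraph output here.
    for token in ["This ", "is ", "a ", "streamed ", "answer."]:
        yield token
        await asyncio.sleep(0.05)


@app.post("/chat")
async def chat(req: ChatRequest):
    return StreamingResponse(fake_token_stream(req.question), media_type="text/plain")
```

Streaming keeps perceived latency low because the UI can render tokens as they arrive instead of waiting for the full answer.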
Implement chatbot interfaces with React
Create a chat UI that shows citations and source previews, lets users scope queries, preserves safe chat history, and handles transient API errors gracefully.
Mitigate hallucinations with LLM judges, structured output, and context engineering
Cut errors via schema-aware chunking, dedupe and budgeted context packing, plus lightweight LLM checks and schema-constrained outputs to verify claims and enforce citations before responding.
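For the structured-output piece, a minimal sketch along these lines is shown below; the CitedAnswer schema and the validation rules are illustrative assumptions, not the course's exact code.

```python
# Minimal sketch: schema-constrained answer plus a citation check before responding.
from typing import List

from pydantic import BaseModel


class CitedAnswer(BaseModel):
    answer: str
    citations: List[int]  # indices into the retrieved chunks


def validate(raw_json: str, num_chunks: int) -> CitedAnswer:
    parsed = CitedAnswer.model_validate_json(raw_json)
    if not parsed.citations:
        raise ValueError("Refusing to answer without citations")
    if any(i < 0 or i >= num_chunks for i in parsed.citations):
        raise ValueError("Citation points outside the retrieved context")
    return parsed
```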
Design effective LLM prompts for high-level control over generation output
Write prompts that steer behavior: system prompts, task decomposition, Pydantic/JSON-schema constraints, and clear rules for tone, citations, and safe refusals.
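One illustrative way (not the course's exact prompt) to encode such rules is a short system prompt kept alongside the schema constraints:

```python
# Illustrative system prompt encoding tone, citation, and refusal rules.
SYSTEM_PROMPT = """You are a documentation assistant.
Rules:
- Answer only from the provided context chunks.
- Cite every claim with the chunk index in square brackets, e.g. [2].
- If the context does not contain the answer, say so and ask for clarification.
- Keep answers under 150 words; use a neutral, technical tone."""
```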
Develop end-to-end RAG applications using software engineering best practices
Produce a maintainable codebase: clean module boundaries (ingest/retrieve/rerank/synthesize), typed configs, secrets/env management, reproducible local dev, and deploy that mirrors local.
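As an example of the typed-config idea, here is a minimal sketch using pydantic-settings; the field names and env prefix are assumptions, not the course repo's actual settings.

```python
# Minimal sketch: typed configuration with pydantic-settings; values come from env vars / .env.
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_prefix="RAG_")

    openai_api_key: str  # read from RAG_OPENAI_API_KEY; never hard-coded
    vector_store_url: str = "http://localhost:6333"
    top_k: int = 5
    request_timeout_s: float = 30.0


settings = Settings()  # raises a validation error if required values are missing
```

Because values are loaded from the environment and validated at startup, secrets stay out of the repo and misconfiguration fails fast instead of at the first request.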
Live sessions
Learn directly from Damien Benveniste in a real-time, interactive format.
Lifetime access
Go back to course content and recordings whenever you need to.
Community of peers
Stay accountable and share insights with like-minded professionals.
Certificate of completion
Share your new skills with your employer or on LinkedIn.
Maven Guarantee
This course is backed by the Maven Guarantee. Students are eligible for a full refund up until the halfway point of the course.
Live session dates: Sep 27, Sep 28, Oct 4, and Oct 5
Welcome, my name is Damien Benveniste! After a Ph.D. in theoretical Physics, I started my career in Machine Learning more than 10 years ago.
I have been a Data Scientist, Machine Learning Engineer, and Software Engineer. I have led Machine Learning projects across diverse industry sectors such as AdTech, market research, financial advising, cloud management, online retail, marketing, credit score modeling, data storage, healthcare, and energy valuation. Previously, I was a Machine Learning Tech Lead at Meta, working on automating model optimization at scale for Ads ranking.
I am now training the next generation of Machine Learning engineers.
Join an upcoming cohort
Cohort 1
$349
Dates
Payment Deadline
4-6 hours per week
Tuesdays & Thursdays
1:00pm - 2:00pm EST
Weekly projects
2 hours per week
Active hands-on learning
This course builds on live workshops and hands-on projects
Interactive and project-based
You’ll be interacting with other learners through breakout rooms and project teams
Learn with a cohort of peers
Join a community of like-minded people who want to learn and grow alongside you