Production-Ready Agent Engineering: From MCP to RL

New · 3 Weeks · Cohort-based Course

Gain the key skills for designing effective agents and optimizing their performance. Dive deep into evaluations, tools, MCP, and RL.

Course overview

Why should you take this course?

Modern teams are under pressure to ship LLM agents that actually work in production, but it's difficult to cut through the noise and determine which techniques deliver. There's an ever-growing number of "agent frameworks" promising great results, yet their abstractions are opaque and difficult to optimize. Blog posts and one-off repos explain pieces of the puzzle, but AI is moving faster than ever.


Many engineers struggle to:

- Choose the right agent pattern for their use case

- Incorporate reliable tool use into agentic workflows

- Evaluate where and why agents fail

- Deploy agents that balance intelligence, cost, and latency

- Understand when and how to improve agent performance with finetuning and RL


We keep hearing that 2025 is the Year of the Agents. Everyone’s talking about MCP, A2A, and GRPO, but no one seems to agree on when you should use them. Agentic interactions are becoming table-stakes consumer features, and investors are eager to see that you’re keeping up with the times.


Popular agent products like Deep Research, Devin, and Manus are built by companies who don’t want to share their tricks. Open-source alternatives often underperform or are complex to understand and adapt. Textbooks don’t exist yet, and sifting through every new paper is basically a full-time job. The latest API models can make for powerful agents, but costs get out of control quickly. Few people outside of the big AI labs have hands-on expertise in optimizing LLM agents using reinforcement learning. Will and Kyle happen to be two of them.


---


What to expect:


Beyond core principles, this course emphasizes hands-on practice for building production-ready agents, including:

- How to integrate MCP tools for popular services like Notion, Linear, and Slack into your agent applications

- How to build your own MCP servers for custom APIs and data

- How to scaffold and prompt agents for complex tool workflows

- How to evaluate and interactively refine agents with human-in-the-loop prompting

- How to use rule-based and LLM-based evaluations as reward signals for RL or synthetic data filtering

- How to use GRPO to train agents that outperform models like o3 at a fraction of the cost
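To make the "evaluations as reward signals" idea above concrete, here is a minimal sketch of a rule-based scorer that doubles as an RL reward and as a filter for synthetic SFT data. The rollout fields (`answer`, `tool_calls`), the citation-tag format, and the tool-call budget are all hypothetical placeholders, not part of the course materials:

```python
import re

# Illustrative rule-based reward for scoring agent rollouts.
# The rollout shape and the three rules below are hypothetical;
# real checks depend on your task, output format, and tool budget.
def reward(rollout: dict) -> float:
    """Score a rollout in [0, 1] from simple, verifiable rules."""
    points = 0
    # Rule 1: the agent produced a final answer at all.
    if rollout.get("answer"):
        points += 4
    # Rule 2: the answer follows a required format (here, a citation tag).
    if re.search(r"\[source:.+\]", rollout.get("answer", "")):
        points += 3
    # Rule 3: the agent stayed within its tool-call budget.
    if len(rollout.get("tool_calls", [])) <= 5:
        points += 3
    return points / 10

def filter_rollouts(rollouts: list[dict], threshold: float = 0.7) -> list[dict]:
    """Keep only rollouts scoring high enough to use as SFT data."""
    return [r for r in rollouts if reward(r) >= threshold]
```

The same scalar can be handed to an RL trainer as the reward signal, or thresholded to select high-quality rollouts for supervised finetuning.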


The course will have two lectures per week for 3 weeks, plus additional office hours sessions (see schedule below). Lecture videos will be available to watch asynchronously, and we'll also have a Discord chat for offline discussions.


Lectures will incorporate live coding/prompting with tools like Cursor, Claude Code, and Jupyter notebooks. Familiarity with Python, high-level AI/ML concepts, and LLM APIs is assumed.


---


Course schedule:


Lecture 1 (6/17)

Agent Patterns and Principles

- ReAct, MemGPT, Agentic RAG, Multi-Agent (A2A)

- Hands-on demos with HF smolagents + other frameworks


Lecture 2 (6/19)

Model Context Protocol: When and Why

- Client/Server architectures for tool calls

- Approaches to auth

- Hands-on agentic MCP flow demos with Claude Desktop, Claude Code, etc.


Lecture 3 (6/24)

Evals for Agents

- Extending eval techniques to agentic workflows

- Rule-based vs LLM-as-judge

- Filtering rollouts for synthetic data collection

- Brief demo of SFT on filtered rollouts


Lecture 4 (6/26)

Reinforcement Learning for Busy Engineers

- Crash course in RL fundamentals without the math

- GRPO vs DPO vs PPO

- Demo of GRPO for training a reasoning model (via HF TRL)


Lecture 5 (7/1)

Formulating Business Problems as RL Tasks

- How to think about reward/rubric design for real-world tasks

- Environment = Tasks + Tools + Verifiers

- Walkthrough of problem formulation for email search (via ART)


Lecture 6 (7/3)

Training Agents with GRPO

- Deep dive into RL experimentation for agent workflows (via ART)

- Broader ecosystem: other RL trainers + integrations with existing agent/tool frameworks (smolagents, MCP)


This is the course for you if you're:

1. A Senior SWE turned AI Engineer at a Series D SaaS company who's eager to replace brittle pipelines with highly-optimized agents

2. A Founder + CTO of a Series A startup who wants to offer a best-in-class agentic AI experience to discerning customers

3. A Technical Director at a Fortune 500 company responsible for evaluating the best approaches and vendors for agentic AI solutions

What you’ll get out of this course

Understand key concepts and patterns underlying modern LLM agents, and how to choose the right approach for your use case


Build portable, reliable tools for your agents and data using Model Context Protocol (MCP)


Implement your own research agents, incorporating custom format instructions and data access


Learn the fundamentals of Reinforcement Learning (RL) and how it applies to agents


Formulate your agentic tasks as RL problems, with evaluation metrics that enable learning from reward feedback

Use RL algorithms like Group-Relative Policy Optimization (GRPO) to train agents that outperform frontier models on your tasks

Gain a holistic understanding of modern principles and techniques for designing production-ready agents and optimizing them with RL

This course includes

9 interactive live sessions

Lifetime access to course materials

6 in-depth lessons

Direct access to instructors

Projects to apply learnings

Guided feedback & reflection

Private community of peers

Course certificate upon completion

Maven Satisfaction Guarantee

This course is backed by Maven’s guarantee. You can receive a full refund within 14 days after the course ends, provided you meet the completion criteria in our refund policy.

Course syllabus

Week 1

Jun 16—Jun 22

    Agent Patterns and Principles

    • Lesson 1: Tue 6/17, 9:00 PM—10:30 PM (UTC)

    Model Context Protocol: When and Why

    • Lesson 2: Thu 6/19, 9:00 PM—10:30 PM (UTC)

    Office Hours (Will Brown)

    • Fri 6/20, 7:00 PM—8:00 PM (UTC)

Week 2

Jun 23—Jun 29

    Evals for Agents

    • Lesson 3: Tue 6/24, 9:00 PM—10:30 PM (UTC)

    Reinforcement Learning for Busy Engineers

    • Lesson 4: Thu 6/26, 9:00 PM—10:30 PM (UTC)

    Office Hours (Kyle Corbitt)

    • Fri 6/27, 7:00 PM—8:00 PM (UTC)

Week 3

Jun 30—Jul 4

    Formulating Business Problems as RL Tasks

    • Lesson 5: Tue 7/1, 9:00 PM—10:30 PM (UTC)

    Office Hours (Will Brown)

    • Wed 7/2, 7:00 PM—8:00 PM (UTC)

    Training Agents with GRPO

    • Lesson 6: Thu 7/3, 9:00 PM—10:30 PM (UTC)

Meet your instructors

Will Brown


Will is a Research Lead at Prime Intellect, working on advancing the frontier of open-source agentic RL. He was previously a Machine Learning Researcher at Morgan Stanley and an Applied Scientist at AWS, and completed a PhD in Computer Science at Columbia University focused on multi-agent learning.

Kyle Corbitt


Kyle is the CTO of OpenPipe, the RL post-training company. Through OpenPipe, he has helped dozens of companies of all sizes train custom models optimized for their tasks. He has previous ML experience at Y Combinator and Google.


Join an upcoming cohort

Production-Ready Agent Engineering: From MCP to RL

Cohort 1

$1,000

Dates

June 16—July 4, 2025

Payment Deadline

June 15, 2025
Get reimbursed

Course schedule

4-6 hours per week

  • Tuesdays & Thursdays

    5:00pm - 6:30pm ET


  • June 17 - July 3

    2x weekly lectures and at least 1x weekly office hours with instructors

  • Weekly projects

    2 hours per week

    Take-home exercises for more hands-on exposure to the week's topics

