Building Multi-Agent Forecasting Systems

Stefan Jansen

Author, ML for Trading · Applied AI

Build a multi-agent forecasting system you can audit — in one day.

The most useful thing you can do with agents right now isn’t a chatbot — it’s a multi-agent research pipeline that produces a calibrated answer and shows its work. Bridgewater’s AIA Forecaster is the clearest published blueprint for that pattern. In one day you’ll stand up an open, paper-faithful replication, run it on real prediction-market questions (read-only), and learn the parts that generalize to any agentic research system: parallel search agents, a supervisor that reconciles them, honest calibration, and the production scaffolding — traces, reproducibility, evaluation — that makes the whole thing trustworthy.

You leave with a working repo you understand and can modify, not slides.

What you’ll learn

Build, evaluate, and operate a paper-faithful multi-agent forecasting pipeline — and know what it tells you.

  • Walk through the paper-to-implementation map, then clone the repo, run a deterministic replay, and read the output before lunch.

  • Stand up each pipeline stage in your own session and inspect every step — search queries, findings, supervisor moves — in the trace UI.

  • Modify a config profile and rerun the same questions to feel how aggregation choices change the answer.

  • Generate calibration curves and Brier scores on a real set of resolved questions and read what they say honestly.

  • Run side-by-side ablations across config profiles and see which components actually move accuracy — and which sound smart but don’t.

  • Take the calibration habit back to whatever model or agent system you work on next.

  • Tour the per-forecast trace drill-down and reproducible config snapshots — the scaffolding that makes the system inspectable later.

  • Wire a read-only market connector (Polymarket, Kalshi) and run a daily cycle end-to-end — pull → forecast → track — on your own machine.

  • Leave with a repo you understand, a CLI you can extend, and a working pattern you can transplant into your own domain.
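For a sense of what the calibration and scoring items above involve, here is a minimal sketch of a Brier score and a binned calibration table in plain Python. It is not taken from the workshop repo: the function names, the bin count, and the six sample forecasts are all made up for illustration.

```python
from collections import defaultdict


def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, n_bins=5):
    """Group forecasts into equal-width bins.

    Returns one (mean forecast, observed frequency, count) row per
    non-empty bin, ordered from the lowest bin to the highest.
    """
    bins = defaultdict(list)
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(y for _, y in pairs) / len(pairs)
        rows.append((mean_p, freq, len(pairs)))
    return rows


# Hypothetical forecasts and resolutions, for illustration only.
probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
outcomes = [1, 1, 0, 0, 1, 0]

score = brier_score(probs, outcomes)
rows = calibration_table(probs, outcomes)
print(f"Brier score: {score:.4f}")
for mean_p, freq, count in rows:
    print(f"forecast ~{mean_p:.2f}  observed {freq:.2f}  n={count}")
```

A well-calibrated forecaster shows observed frequencies close to the mean forecast in each bin. With only a handful of questions per bin, as here, the table is mostly noise, which is one reason to read a calibration curve alongside its sample counts.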

Workshop agenda

  • [Lecture] The AIA Forecaster blueprint

    What Bridgewater built, why parallel agents + a supervisor + calibration beat a single model, and where the open implementation diverges from the paper.

  • [Hands-on] Setup & first forecast

    Clone the repo, run uv sync, and run a deterministic replay with no keys and no network. Read the output and locate where each stage stores its trace.

  • [Lecture] Stages 1–2: question rewriting & research agents

    How the system reformulates an ambiguous question into precise sub-questions, then dispatches parallel agents to gather and structure evidence.

  • [Hands-on] Run the research agents

    Run a local Ollama pass on real prediction-market questions, then open the dashboard to inspect each agent’s queries, findings, and stated uncertainties.

  • [Lecture] Stages 3–4: aggregation & the supervisor

    Neyman extremization versus mean and median aggregation, and how the supervisor’s clarifying searches reconcile disagreement when agents diverge.

  • [Hands-on] Ablation lab

    Swap aggregation methods and toggle the supervisor via config profiles. Run the same markets across profiles and compare outcomes side by side.

  • [Lecture] Calibration & evaluation

    Platt scaling, Brier scores, and calibration curves. Why a confident-but-wrong agent is worse than a humble one — and what the curve actually tells you.

  • [Hands-on] Calibrate & score

    Generate a calibration curve on the workshop dataset, read it honestly, and decide which profile from the ablation lab is the one you would ship.

  • [Lecture] Productionizing & where this generalizes

    Daily runner, config snapshots, read-only connectors (Polymarket, Kalshi, Manifold, Metaculus). What changes live — and what to lift into your own agent work.

  • [Hands-on] End-to-end daily cycle

    Wire a read-only connector and run the full pull → forecast → track loop. Leave with the exact command you would cron tomorrow.
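The aggregation choices named in the agenda can be sketched in a few lines. The example below is a generic illustration, not the repo's code: it compares mean and median pooling with a simple log-odds extremization, where the `factor` parameter, the method names, and the four agent forecasts are hypothetical. The extremization shown is the plain "scale the log-odds" form and is not necessarily the variant covered in the workshop.

```python
import math
from statistics import mean, median


def logit(p, eps=1e-6):
    """Log-odds of p, clipped away from 0 and 1 to keep the log finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def extremize(p, factor):
    """Scale the log-odds of p; factor > 1 pushes p away from 0.5."""
    return sigmoid(factor * logit(p))


def aggregate(probs, method="mean", factor=1.5):
    """Pool per-agent probabilities into a single forecast."""
    if not probs:
        raise ValueError("need at least one forecast")
    if method == "mean":
        return mean(probs)
    if method == "median":
        return median(probs)
    if method == "extremized_mean":
        return extremize(mean(probs), factor)
    raise ValueError(f"unknown method: {method}")


# Hypothetical forecasts from four research agents on one question.
agent_forecasts = [0.62, 0.70, 0.55, 0.81]

for method in ("mean", "median", "extremized_mean"):
    print(f"{method:16s} {aggregate(agent_forecasts, method):.3f}")
```

Scaling log-odds by a constant has the same functional form as Platt scaling with a zero intercept, so the factor can be fit on resolved questions rather than picked by hand. Rerunning the same questions under each method is the kind of comparison the ablation lab describes.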

Learn directly from Stefan

Stefan Jansen

Author, ML for Trading (Ch. 24: agents) · Founder, Applied AI

Who this workshop is for

  • ML/AI engineers shipping multi-agent or agentic-research systems who want a real, evaluated reference to study and modify.

  • Quant researchers and data scientists weighing whether agentic forecasting belongs in their stack, and what to test before it does.

  • Technical PMs and leads judging agent systems on traces, calibration, and ablations rather than vibes.

What's included

Live sessions

Learn directly from Stefan Jansen in a real-time, interactive format.

6 hours live, hands-on with Stefan

Real-time guidance through every stage of the AIA Forecaster pipeline: lecture, lab, debug. You build alongside the author of Machine Learning for Trading rather than watching slides.

The full aia-forecaster repo

A working multi-agent system: CLI, dashboard, configuration profiles, read-only Polymarket / Kalshi / Manifold / Metaculus connectors, and 95+ tests. Cloneable today, shippable tomorrow.

11 chapter-companion notebooks

Each workshop block maps to one notebook from Chapter 24 of Machine Learning for Trading (3rd ed., 2026). Run the whole stack offline via deterministic replay — no keys, no network required.

Maven Guarantee

Your purchase is backed by the Maven Guarantee.

Maven for Teams

Reimbursement

Get your company to pay

Everything L&D needs: email template, receipts, and certificate of completion.

Team discount

Learn with your teammates

Save 20%+ when 2 or more teammates enroll in the same cohort.

Private cohort

Run a cohort for your org

A dedicated cohort with a custom schedule and curriculum, tailored to your team.

$650 USD · Jun 27