Generate, Break, Fix: Distributed Systems in the AI Era

Marcin Grzejszczak

Java Champion · OSS maintainer · Microservices expert

AI generates code fast. It generates distributed systems failures even faster.

AI is rewriting how backend teams build services. The developers who thrive won't be the ones who generate code fastest - they'll be the ones who catch what AI gets wrong.

Every AI coding agent - Copilot, Cursor, Claude Code - makes the same distributed systems mistakes. Hardcoded service URLs. No circuit breakers. Missing trace propagation. No timeouts.
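To make those mistakes concrete, here is a minimal Java sketch - the service name, port, and env var are hypothetical - contrasting the call an AI agent typically produces with a hardened version built on nothing but the JDK's HttpClient:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client for an "inventory" service; names and ports are illustrative.
public class InventoryClient {

    // What AI agents typically generate: hardcoded URL, no timeouts.
    // Looks professional, compiles, and hangs forever when the service stalls.
    static String fetchNaive() throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/inventory"))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Hardened: the base URL comes from configuration (env var here, service
    // discovery in a real system) and every call has a bounded deadline.
    static String fetchHardened() throws Exception {
        String baseUrl = System.getenv().getOrDefault("INVENTORY_URL", "http://localhost:8081");
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))  // fail fast on dead hosts
                .build();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/inventory"))
                .timeout(Duration.ofSeconds(3))         // per-request deadline
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The hardened version still isn't production-ready (it needs retries, a circuit breaker, and trace propagation), but it no longer hangs forever when the downstream service stalls.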

The gap between "it compiles" and "it survives production" is growing, because AI makes bad code look professional. And no tutorial or YouTube video puts you in front of actual broken infrastructure to practice on.

This workshop does. You'll generate real multi-service systems with AI tools, then debug them against real failures - latency, flaky services, cascading outages. Any language, any framework.

You'll diagnose issues, build missing pieces, and write AI instruction files that encode your architecture standards. You'll leave with a production-readiness checklist, a battle-tested rules file, and the instinct to spot what AI misses - before your users do.

Please fill out the survey to help me prepare the perfect syllabus!

What you’ll learn

Diagnose, fix, and prevent the distributed systems failures that every AI coding agent creates - in any language.

  • Debug real failures - injected latency, flaky service discovery, and breaking infrastructure

  • Pinpoint root causes across multi-service systems instead of guessing which service broke

  • Practice with the same failure modes that take down production - latency spikes, partitions, pool exhaustion

  • Implement circuit breakers, retries with backoff, bulkheads, and timeouts in your preferred language (see the resilience sketch after this list)

  • Validate each pattern by breaking your system again - prove resilience instead of hoping for it

  • Spot the resilience mistakes AI coding agents tend to make and fix them systematically

  • Add distributed traces, structured logging, and custom metrics across service boundaries

  • Use Prometheus, Grafana, and Zipkin to see every request flow through your entire system

  • Fix the trace propagation gaps AI agents consistently miss in async communication

  • Build agent rules that make your AI agent follow team-wide patterns automatically

  • Iteratively refine rules based on real failures - each session makes your AI output better

  • Take home a battle-tested rules file you can apply to any project immediately

  • Write contract tests that catch interface mismatches between services before deployment

  • Run chaos scenarios that expose resilience gaps before your users find them

  • Build a repeatable workflow: generate, test, break, fix, ship

  • Apply universal patterns - service discovery, resilience, observability - regardless of framework

  • Work alongside developers using different stacks and see the same problems appear everywhere

  • Become the go-to engineer who evaluates AI output no matter what language it's written in
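To ground the resilience items above, here is a minimal, JDK-only sketch of retry with exponential backoff - all names and numbers are illustrative, not the workshop's reference solution. Circuit breakers, bulkheads, and timeouts wrap remote calls in the same decorator style:

```java
import java.util.concurrent.Callable;

public class Retries {

    // Retry with exponential backoff - a pattern AI-generated clients routinely omit.
    // Sleeps 200ms, then 400ms, then 800ms between attempts before giving up.
    static <T> T withBackoff(Callable<T> call, int maxAttempts) throws Exception {
        long delayMs = 200;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // retry budget exhausted - surface the failure
                }
                Thread.sleep(delayMs);
                delayMs *= 2; // exponential backoff; real systems also add jitter
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated flaky dependency: fails twice, then succeeds.
        int[] calls = {0};
        String result = withBackoff(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("503 from downstream");
            return "ok";
        }, 3);
        System.out.println(result); // prints "ok" on the third attempt
    }
}
```

One point worth underlining: retries only help when the wrapped call is itself bounded by a timeout; otherwise each retry just stacks another hang.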

Learn directly from Marcin

Marcin Grzejszczak

Java Champion · Creator of Spring Cloud Sleuth & Spring Cloud Contract

~10 years as a maintainer of the Spring Cloud and Micrometer OSS libraries
Broadcom Inc. · VMware · SpringCentral

Who this course is for

  • Mid-level backend developer adopting AI coding tools who wants to know what to trust and what to fix before shipping to production.

  • Senior engineer or tech lead responsible for system reliability whose team already generates code with AI but ships it with gaps.

  • Developer working across multiple languages and stacks who wants distributed systems skills that transcend any single framework.

What's included


Live sessions

Learn directly from Marcin Grzejszczak in a real-time, interactive format.

6 live hands-on sessions (2h each)

Every session follows the same loop: generate code with AI, hit real infrastructure problems, diagnose the failure, fix it. 70%+ of the time is you working on real systems - not watching someone else code.

Real broken infrastructure to debug

Injected latency, flaky service discovery, poisoned database connections, cascading outages. You don't simulate failures - you experience them against real infrastructure the instructor controls.
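As a hint of how latency injection works under the hood, here is a toy TCP proxy in plain Java - purely illustrative, with hypothetical ports; real chaos tooling (Toxiproxy, for example) adds jitter, bandwidth limits, and connection resets:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Toy latency-injecting TCP proxy: clients connect to :9090, traffic is
// forwarded to the real service on :8081 after a fixed delay.
public class LatencyProxy {

    public static void main(String[] args) throws Exception {
        long delayMs = 2_000; // every request now takes 2s longer - do your timeouts fire?
        try (ServerSocket server = new ServerSocket(9090)) {
            while (true) {
                Socket client = server.accept();
                Socket backend = new Socket("localhost", 8081);
                pipe(client.getInputStream(), backend.getOutputStream(), delayMs);
                pipe(backend.getInputStream(), client.getOutputStream(), 0);
            }
        }
    }

    // Copies one direction of the stream on its own thread, optionally delayed.
    // Simplified: a real proxy would also close both sockets when either side ends.
    static void pipe(InputStream in, OutputStream out, long delayMs) {
        new Thread(() -> {
            try {
                if (delayMs > 0) Thread.sleep(delayMs);
                in.transferTo(out);
            } catch (Exception ignored) {
                // connection closed - acceptable for a demo
            }
        }).start();
    }
}
```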

Any language, any framework, any AI tool

Use whatever stack you work in daily - Java, Python, Go, TypeScript, Kotlin. Bring whichever AI coding agent you prefer - Claude Code, Cursor, Copilot. The distributed systems problems are identical regardless of your choices.

Battle-tested AI rules file you take home

Throughout the course you iteratively build a set of agent rules that encodes architecture standards for resilience, observability, and service communication. By session 6 it's been refined against real failures - ready to use on your next project.

Docker Compose observability stack

Pre-built Docker Compose with Prometheus, Grafana, and Zipkin/Tempo - ready to plug into your generated services. No time wasted configuring infrastructure. You focus on making your services observable, not setting up tooling.

Contract testing templates

Templates for validating service-to-service interactions so AI-generated APIs stay compatible. Catch mismatches between services before deployment - especially critical when different team members (or AI tools) generate different services.
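The course templates themselves aren't reproduced here, but the core idea fits in a few lines of plain Java: assert that the provider still returns every field this consumer reads. The endpoint, port, and field names below are hypothetical; real setups use a tool like Spring Cloud Contract or Pact to generate stubs and provider-side tests from shared contract files:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;

public class OrderContractCheck {

    // The fields this consumer actually reads from GET /orders/{id}.
    static final List<String> REQUIRED_FIELDS = List.of("\"id\"", "\"status\"", "\"totalCents\"");

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpResponse<String> response = client.send(
                HttpRequest.newBuilder()
                        .uri(URI.create("http://localhost:8082/orders/42")) // hypothetical provider
                        .timeout(Duration.ofSeconds(3))
                        .build(),
                HttpResponse.BodyHandlers.ofString());

        // Fail fast if the provider dropped or renamed a field the consumer needs -
        // exactly the mismatch two independently generated services can introduce.
        for (String field : REQUIRED_FIELDS) {
            if (!response.body().contains(field)) {
                throw new AssertionError("Contract broken: missing field " + field);
            }
        }
        System.out.println("Provider still satisfies the consumer contract");
    }
}
```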

Production-readiness checklist

A concrete rubric covering health checks, graceful shutdown, resource limits, resilience patterns, observability, and security basics. Use it to evaluate any AI-generated distributed system - during the workshop and back at work.
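Two of those checklist items - a health check and graceful shutdown - can be sketched with nothing but the JDK's built-in HttpServer (port and payload are illustrative):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class HealthyService {

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        // Health check endpoint: lets an orchestrator know the process is alive.
        server.createContext("/health", exchange -> {
            byte[] body = "{\"status\":\"UP\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        // Graceful shutdown: stop accepting traffic and drain in-flight
        // requests (up to 5 seconds here) before the JVM exits on SIGTERM.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> server.stop(5)));

        server.start();
        System.out.println("Listening on :8080 - try GET /health");
    }
}
```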

Small cohort for real collaboration

Small group size so you get direct instructor feedback, learn from peers using different stacks, and have real discussions about the failures you encounter. In the final session, teams swap systems and try to break each other's work.

Instructor who built the tools AI gets wrong

Marcin created Spring Cloud Sleuth (now Micrometer Tracing) and Spring Cloud Contract - the observability and contract testing tools AI agents consistently misconfigure. You're learning from the person who knows exactly where AI fails and why.

Reusable generate-break-fix workflow

Walk away with a repeatable process for AI-assisted development: generate with AI, validate with contracts, test with chaos, fix the gaps, refine your rules. A workflow you'll use every time you ship AI-generated code.

Maven Guarantee

This course is backed by the Maven Guarantee. Students are eligible for a full refund up until the halfway point of the course.


Schedule

Live sessions · 6 hrs / week
Projects · 6 hrs / week
Async content · 1 hr / week

$999 USD

Jun 2 - Jun 11

Enroll