5 Weeks
·Cohort-based Course
Voice AI is about to have its own ChatGPT moment. Learn how to build applications that listen and act.
Course overview
Latency budgets, desktop permissions, SIP rules, plus the AI minefield—hallucinations, accent-biased transcriptions, token-burst rate limits, and model drift—turn “just add voice” into months of painful debugging even for senior engineers. WebRTC jitter ruins timing, loose prompts spark off-brand replies, and a single bad transcript cascades into wrong actions and angry users.
Over four weeks you’ll build three voice products—desktop, browser, and telephony—side-by-side with the maintainers of Vapi, Pipecat, LiveKit, and Whisper, plus Ivan and Nicolay, who’ve shipped these systems in production. You’ll master the shared scaffolding (stream segmentation, prompt stitching, cost/latency meters) and the AI guardrails (real-time validation, confidence scoring, speculative decoding) that keep voice assistants responsive, factual, and customer-safe across every channel and model.
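To make "confidence scoring" concrete, here is a minimal sketch of a confidence gate that stops one bad transcript from triggering a wrong action. It is an illustration, not the course's implementation: the `TranscribedWord` shape, the `gateTranscript` name, and both thresholds are assumptions, and real ASR providers report confidence in different formats.

```typescript
// Hypothetical types and names, for illustration only.
interface TranscribedWord {
  text: string;
  confidence: number; // assumed to be in the range 0..1
}

type Decision = "act" | "confirm" | "reprompt";

/**
 * Decide what to do with a transcript based on its weakest word.
 * The thresholds are assumed defaults; tune them against your own data.
 */
function gateTranscript(
  words: TranscribedWord[],
  actThreshold = 0.85,
  confirmThreshold = 0.6,
): Decision {
  // Nothing was heard: ask the user to repeat.
  if (words.length === 0) return "reprompt";

  // One badly recognized word is enough to change the meaning,
  // so gate on the minimum rather than the average.
  const lowest = Math.min(...words.map((w) => w.confidence));

  if (lowest >= actThreshold) return "act";
  if (lowest >= confirmThreshold) return "confirm";
  return "reprompt";
}
```

Gating on the weakest word is a deliberately conservative choice: an assistant that asks "Did you say Thursday?" costs a second of the user's time, while one that silently books the wrong day costs their trust.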
But don't just take our word for it that Voice AI is here to stay and having its "ChatGPT moment".
"Humans interact with businesses in many ways, but one way hasn't changed much in almost 100 years—and that's phone calls. Today, over a trillion calls exist between a business and a customer....new voice models and conversational LLMs are now incredibly good ... startups are ... making voice AI bots that are indistinguishable from humans." - Gustaf Alströmer, YC -- in a call for Voice AI startups
"For enterprises, AI directly replaces human labor with technology. It’s cheaper, faster, more reliable — and often outperforms humans. Voice agents also allow businesses to be available to their customers 24/7 to answer questions, schedule appointments, or complete purchases...For consumers, we believe voice will be the first — and perhaps the primary — way people interact with AI." - Olivia Moore, a16z -- AI voice in consumer
There are already meeting bots that talk to you over Zoom, language coaches that help you learn Spanish in a web app, and ambient assistants that sit on your laptop, listen, and help out when you need input. The surface area of "voice UX" will keep growing: into cars, AR glasses, smart speakers, and other areas we haven't even considered yet.
Each channel breaks in its own way—latency on the web, permissions on desktop, SIP rules on telephony. One demo can’t teach you all of that.
That's why we build three separate voice AI applications in this course: a web app, a native macOS app, and a telephony app.
1. Native macOS Meeting Assistant – records your mic locally, takes live notes, pushes tasks to Notion, and pings you in Slack before deadlines. Learn how the most successful app to date (Granola) does it. The problem it solves: manual note-taking in back-to-back calls burns 6 h/week, and important information still slips.
2. Web-based AI Sales Coach – simulates tough customers, scores every response, and shows real-time coaching tips without breaking flow. Learn how to update the UX live based on an ongoing conversation. The problem it solves: new reps take 6 months to hit quota, and live coaching is expensive.
3. Telephony Booking Bot – calls clients, confirms appointments, handles DTMF/silence, and writes results straight into your CRM. Learn how to place calls reliably and handle diverse accents. The problem it solves: staff spend hours calling clients, and the no-show rate is ~45 %.
Why you care as a student
- These metrics resonate with CTOs, PMs, and investors—your demo isn’t a toy.
- Each channel teaches a different “gotcha”: OS sandbox, browser jitter, telephony regs. Master once, reuse forever.
- Portfolio proof: three repos that shout “I can ship voice products anywhere users speak.”
After that, the next interface is just more plumbing.
The tools you’ll learn
Vapi – voice routing without IVR hell
Pipecat – low-latency audio transforms
LiveKit – WebRTC that survives bad networks
Whisper/ElevenLabs/AssemblyAI – fast, accurate transcription
OpenAI & Gemini Realtime – millisecond-level reasoning
Hands-on workshops with the engineers who wrote these libraries
Exclusive Access: Connect directly with the engineers building these tools through dedicated workshops and Q&A sessions. Learn from those who know these technologies best.
- LiveKit founder workshop: learn about WebRTC and how to make networking a breeze.
- OpenAI real-time API creators: learn how best to prompt real-time models.
- More workshops with speakers from ElevenLabs, AssemblyAI, Vapi, and Pipecat will be added soon.
Prerequisites (read this)
- Comfortable in TypeScript/JavaScript (async/await, streams, React or similar)
- Basic REST & WebSocket chops
- Familiarity with Git and command-line tooling
If you’ve never shipped production code, this bootcamp will overwhelm you.
01
The “Build-It-Now” CTO racing to add voice; needs production-ready blueprints, cost controls, and multi-channel code now.
02
Software & AI Engineer – General software & AI engineers exploring voice; seek hands-on repos to learn streaming audio, LLM prompts,...
Ship 3 real voice products in 4 weeks
By Demo Day you’ll have a macOS meeting assistant, a WebRTC sales-coach webapp, and a Twilio/Vapi booking bot running on your own account—ready to show a boss, investor, or client.
Save 24+ engineering hours on “figuring it out”
We hand you working repos, infra scripts, and latency / cost benchmarks. Ship voice features 2–3 × faster than starting cold.
Quantifiable business impact you can brag about
Hands-on with the maintainers
Live coding + AMA sessions with:
Plug-and-play test & guardrail suite
Automated latency alerts, hallucination detectors, and ASR-accuracy checks you can drop straight into any future voice project—so bugs surface in CI, not in prod.
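As a rough idea of what such checks can look like, the sketch below computes word error rate (WER) for an ASR-accuracy check and a nearest-rank percentile for a latency alert. It is not the course's suite: the function names are hypothetical, and the tokenizer is an ASCII-only simplification that drops accented letters, so it would need replacing for non-English transcripts.

```typescript
// Hypothetical helpers, for illustration only.

/** Lowercase, strip punctuation, and split into words (ASCII-only simplification). */
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

/** Word error rate: word-level edit distance divided by reference length. */
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // prev[j] holds the edit distance between the reference words seen so far
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
      const deletion = prev[j] + 1;
      const insertion = curr[j - 1] + 1;
      curr[j] = Math.min(substitution, deletion, insertion);
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}

/** Nearest-rank percentile of latency samples, e.g. p = 95 for p95. */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}
```

In CI, a test could fail the build when `wordErrorRate` on a fixed set of recorded calls rises above a chosen limit, or when the p95 of measured response times exceeds your latency budget. Both limits are yours to pick.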
Voice-AI Tool Selection Playbook
Download-ready spreadsheet + benchmark scripts that score every major ASR (Whisper, AssemblyAI, Deepgram), TTS (ElevenLabs, Polly), routing layer (Vapi, Twilio), and realtime LLM (OpenAI, Gemini) on latency, cost, language coverage, and hallucination rate. Run npm run bench.
Private Discord “War Room” for Real-Time Help
Get into a members-only Slack with maintainers (Ivan, Nicolay) and other builders. Dedicated channels for #latency-bugs, #prompt-design, and #show-your-metrics guarantee you can paste logs, share PRs, book 15-min pairing slots, and get answers during the course.
15 interactive live sessions
Lifetime access to course materials
27 in-depth lessons
Direct access to instructor
5 projects to apply learnings
Guided feedback & reflection
Private community of peers
Course certificate upon completion
Maven Satisfaction Guarantee
This course is backed by Maven’s guarantee. You can receive a full refund within 14 days after the course ends, provided you meet the completion criteria in our refund policy.
Build Voice AI Applications That Listen and Act in Real-Time
Sep 2 · Kick-off Code Walkthrough (60 min)
Sep 3 · Expert Talk: Streaming with OpenAI/Gemini APIs (60 min)
Sep 5 · Help Line (60 min)
Sep 8 · Kick-off Code Walkthrough (60 min)
Sep 9 · Expert Talk: Prompt Engineering for Voice AI
Sep 12 · Help Line (60 min)
Sep 15 · Kick-off Code Walkthrough (60 min)
Sep 16 · Expert Talk: Evaluating Voice Assistants
Sep 19 · Help Line
Sep 18 · Extra Panel: Pipecat vs. Daily vs. LiveKit
Sep 22 · Kick-off Code Walkthrough (60 min)
Sep 23 · Expert Talk: How to do telephony with voice AI at scale
Sep 25 · Workshop: Telephony Flow & SIP Quirks
Sep 26 · Help Line
CTO, Managing Partner
Nicolay has been working on LLMs since 2019 and is the founder of Aisbach, where he specializes in generative AI systems.
Research Engineer
Ivan is a full-stack engineer turned research engineer. He brings academic breakthroughs into industry. Ivan maintains open source libraries like Instructor, indomee and Kura.
Join an upcoming cohort
Cohort 1
$1,449
Dates
Payment Deadline
8-10 hours per week
Live Sessions
Tuesdays 1pm/5pm CET
Interactive workshops and implementation guidance
We will determine the best time based on the students and their time zones.
Office Hours
Friday 1pm/pm CET
Get 1:1 help with your prototypes and code before you dive into the weekend.
We expect you to dig into the code after the live session, so you will likely have questions before the weekend.
Weekly projects
5 hours per week
Your weekly implementations. These can be adjustments to existing code or complete implementations of real-time voice applications.
You can bring your own idea or build upon our existing applications. We are happy to help you with both, but recommend the latter.