Voice AI and Voice Agents: a Technical Deep Dive

New · 5 Weeks · Cohort-based Course

Learn about core technologies and best practices for realtime, conversational AI. Explore models and tools with hands-on help from experts.


Hosted by

Kwindla Hultman Kramer and swyx

Brought to you by the architect of Pipecat and the inventor of "AI Engineer."

Course overview

Understand the core technologies driving voice AI today

📢 🛜 🎉 Credits from partner companies for all students in the class 🎉 🛜 📢


➡️ $500 from Modal (https://modal.com/)



Four weeks of live presentations from industry leaders via Zoom, paired with hands-on, interactive office hours sessions.


Hear from people training models, designing APIs, and shipping AI agents at scale in production. Learn about the current state of the art in voice AI for use cases like:


➡️ Customer support agents

➡️ Outbound agents that qualify sales leads

➡️ Agents that answer the phone for restaurants, healthcare providers, and other small businesses

➡️ Games and social experiences


1. Core technology and tooling for voice agents


➡️ Using transcription models, language models, and voice models

➡️ Minimizing latency

➡️ Managing multimodal context

➡️ Turn detection and interruption handling

➡️ Audio codecs, telephony integration, and network transport

➡️ Voice AI evals

➡️ Scripting and workflows

➡️ Content guardrails

➡️ Retrieval augmented generation and conversation memory

➡️ Function calling for realtime AI, including how to handle async, parallel, and composite function calls


2. Deploying agents to production


➡️ Hosting

➡️ Monitoring and observability


3. Plus special sections on


➡️ Realtime video

➡️ Voice-enabled programming environments

Who is this course for

01

You are an AI Engineer working on a conversational voice project.

02

You are a technical manager adding voice AI to your organization's products.

03

You develop models and APIs and are exploring how to expand your realtime capabilities.

What you’ll get out of this course

Understand the landscape of models and tools people are using to deploy voice agents to production today.


Hear from labs developing the next generation of transcription, LLM, voice, and speech-to-speech models.


Build and deploy your own voice AI agent.


Attend hands-on workshops with developer relations engineers from the leading labs and infrastructure providers.


Explore the emerging areas of realtime conversational video and voice-enabled programming tools.

This course includes

Interactive live sessions

Lifetime access to course materials

In-depth lessons

Direct access to instructor

Projects to apply learnings

Guided feedback & reflection

Private community of peers

Course certificate upon completion

Maven Satisfaction Guarantee

This course is backed by Maven’s guarantee. You can receive a full refund within 14 days after the course ends, provided you meet the completion criteria in our refund policy.

Course syllabus

Week 1

Apr 30—May 4

    An overview of the voice AI landscape


Week 2

May 5—May 11

    Building voice agents for production


Week 3

May 12—May 18

    Deep dive into models — transcription, LLM, voice generation


Week 4

May 19—May 25

    Up next — video and speech-to-speech models


Week 5

May 26—May 30

    Voice-enabled programming environments


Join an upcoming cohort

Voice AI and Voice Agents: a Technical Deep Dive

Cohort 1

$500

Dates

Apr 30—May 30, 2025

Payment Deadline

Apr 29, 2025


Course schedule

1-4 hours per week

  • Live Presentations

    12:00pm - 1:00pm EST

    Live presentations from people training models, developing APIs and tools, and shipping voice agents to production.

  • Office Hours

    Hands-on sessions and Q&A opportunities with technical engineers from leading labs, infrastructure companies, and startups making interesting new tools.
