LLM Inference: The Skill Every AI Engineer Is Missing

Hosted by Dr. Raj Dandekar

641 students

In this video

What you'll learn

The most common interview question asked at Apple

"Design a low-latency, high-throughput LLM inference system handling millions of requests"

The beauty of inference engineering

A bird's-eye view of the full LLM inference stack, from tokenization to autoregressive decoding, and why most AI teams
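The core of that stack is the autoregressive decoding loop: each new token requires a forward pass, and the output is fed back as input. A minimal sketch, where `toy_model` is a hypothetical stand-in for a real network:

```python
def toy_model(tokens):
    # Hypothetical stand-in for an LLM forward pass: deterministically
    # "predicts" the next token as (last token + 1) mod vocab size.
    vocab_size = 10
    return (tokens[-1] + 1) % vocab_size

def generate(prompt_tokens, max_new_tokens):
    """Autoregressive decoding: one model call per generated token."""
    tokens = list(prompt_tokens)          # start from the tokenized prompt
    for _ in range(max_new_tokens):
        next_tok = toy_model(tokens)      # forward pass over the sequence so far
        tokens.append(next_tok)           # feed the output back as input
    return tokens

print(generate([3, 4], 4))  # -> [3, 4, 5, 6, 7, 8]
```

This sequential dependence, one forward pass per token, is exactly why serving LLMs at low latency and high throughput is hard, and why techniques like batching and KV caching matter.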

The world's 1st Inference Engineering Bootcamp

14 live lectures, 9 guest speakers from Apple, NVIDIA, and Microsoft, 4 hardware labs, and a lot more!

Why this topic matters

Every AI team is building with LLMs. Almost none know how to serve them efficiently. The gap between "it works in a notebook" and "it serves 10,000 users at low latency" is inference engineering, and it's the most in-demand, least-taught skill in AI today.

You'll learn from

Dr. Raj Dandekar

CTO and Co-founder of Vizuara AI, MIT PhD

Dr. Raj Dandekar holds a PhD from MIT and has taught inference engineering to 500+ engineers from Google, Microsoft, Amazon, NVIDIA, and Anthropic. He has 700+ citations from 20+ research publications and is currently the CTO and Co-founder of Vizuara AI Labs, one of the leading AI companies in the world.

MIT, IIT Madras, Vizuara AI Labs

Massachusetts Institute of Technology
IIT Madras
Vizuara

Go deeper with a course

LLM Inference Engineering: Theory, Practicals and Research
Dr. Raj Dandekar and Yash Dixit
View syllabus