LLM Inference: The Skill Every AI Engineer Is Missing
Hosted by Dr. Raj Dandekar
In this video
What you'll learn
The most common interview question asked at Apple
"Design a low-latency, high-throughput LLM inference system handling millions of requests"
The beauty of inference engineering
A bird's-eye view of the full LLM inference stack, from tokenization to autoregressive decoding, and why most AI teams struggle to serve models efficiently
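The core of that stack can be sketched in a few lines. Below is a toy illustration of the tokenize-then-decode loop, where the "model" is a hypothetical stand-in (a fixed next-token lookup table, not a real LLM) so the autoregressive shape of generation is visible:

```python
def toy_tokenize(text):
    # Tokenization: split text into tokens (real engines use BPE-style tokenizers)
    return text.split()

def toy_model(tokens):
    # Hypothetical "model": predicts the next token from the last one.
    # A real LLM would run a forward pass over the whole context here.
    table = {"the": "cat", "cat": "sat", "sat": "<eos>"}
    return table.get(tokens[-1], "<eos>")

def generate(prompt, max_new_tokens=8):
    # Autoregressive decoding: each step feeds the growing token sequence
    # back into the model, which is why serving cost depends on sequence length.
    tokens = toy_tokenize(prompt)
    for _ in range(max_new_tokens):
        nxt = toy_model(tokens)
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("the"))  # the cat sat
```

Production inference engines (vLLM, TensorRT-LLM, and similar) wrap this same loop with batching, KV caching, and scheduling, which is where the engineering depth lies.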
The world's 1st Inference Engineering Bootcamp
14 live lectures, 9 guest speakers from Apple, NVIDIA, and Microsoft, 4 hardware labs, and a lot more!
Why this topic matters
Every AI team is building with LLMs. Almost none know how to serve them efficiently.
The gap between "it works in a notebook" and "it serves 10,000 users at low latency" is inference engineering, and it's the most in-demand, least-taught skill in AI today.
You'll learn from
Dr. Raj Dandekar
CTO and Co-founder of Vizuara AI, MIT PhD
Dr. Raj Dandekar holds a PhD from MIT and has taught inference engineering to 500+ engineers from Google, Microsoft, Amazon, NVIDIA, and Anthropic. He has 700+ citations from 20+ research publications and is currently the CTO and Co-founder of Vizuara AI Labs, one of the leading AI companies in the world.
MIT, IIT Madras, Vizuara AI Labs
Go deeper with a course
LLM Inference Engineering: Theory, Practicals and Research
Dr. Raj Dandekar and Yash Dixit
MIT PhD | CTO at Vizuara AI Labs | Researcher · Apple AI/ML | MIT | Ex-McKinsey | Research mentor for publication-track students