LLM Inference: The Skill Every AI Engineer Is Missing

Hosted by Dr. Raj Dandekar

649 students

In this video

What you'll learn

The most common interview question asked at Apple

"Design a low-latency, high-throughput LLM inference system handling millions of requests"

The beauty of inference engineering

A bird's-eye view of the full LLM inference stack, from tokenization to autoregressive decoding, and why most AI teams

The world's 1st Inference Engineering Bootcamp

14 live lectures, 9 guest speakers from Apple, NVIDIA, and Microsoft, 4 hardware labs, and a lot more!

Why this topic matters

Every AI team is building with LLMs. Almost none know how to serve them efficiently. The gap between "it works in a notebook" and "it serves 10,000 users at low latency" is inference engineering, and it's the most in-demand, least-taught skill in AI today.

You'll learn from

Dr. Raj Dandekar

CTO and Co-founder of Vizuara AI, MIT PhD

Dr. Raj Dandekar holds a PhD from MIT and has taught inference engineering to 500+ engineers from Google, Microsoft, Amazon, NVIDIA, and Anthropic. He has 700+ citations from 20+ research publications and is currently the CTO and Co-founder of Vizuara AI Labs, one of the leading AI companies in the world.

MIT, IIT Madras, Vizuara AI Labs

Go deeper with a course

LLM Inference Engineering: Theory, Practicals and Research
Dr. Raj Dandekar and Yash Dixit