DeepSeek-R1 on AWS Kubernetes Service EKS with Ray and vLLM

Hosted by Aymen Segni

79 students

What you'll learn

DeepSeek-R1 Deployment on Amazon Elastic Kubernetes Service

Master the deployment of large GenAI models on AWS EKS by leveraging Ray and vLLM

GPU Optimization and Resource Management

Gain deep insight into optimizing GPU resources for the high performance that is critical when serving an 8B-parameter model

End-to-End Integration & Monitoring

Integrate AI endpoints into Kubernetes, monitor real-time performance, and ensure high availability.

Use Standard DevOps and Cloud-Native Tools with GenAI

Acquire hands-on experience with Terraform, Kubernetes, Karpenter, and Grafana to build production-grade GenAI platforms

Understand the Benefits of Self-Hosting Your GenAI Models

Understand how self-hosting GenAI models can be cost-effective while supporting data security, developer freedom, and top performance

Why this topic matters

This course meets the need for scalable, enterprise-grade AI. As generative AI evolves, organizations require models that are powerful, resource-efficient, and resilient. Master AWS EKS, Ray, and vLLM to build applications that deliver real-time insights, drive innovation, and create transformative business impact.

You'll learn from

Aymen Segni

Staff Engineer, SRE, DevOps and Platform Engineering

I’m a Cloud & DevOps leader with deep expertise in platform engineering, cloud-native architectures, MLOps platforms, and DevOps best practices. Over the years, I’ve helped companies, from startups to enterprises, scale their infrastructure, improve reliability, and optimize performance.