DeepSeek-R1 on Amazon Elastic Kubernetes Service (EKS) with Ray and vLLM
Hosted by Aymen Segni
What you'll learn
DeepSeek-R1 Deployment on Amazon Elastic Kubernetes Service
Master the deployment of large GenAI models on AWS EKS by leveraging Ray and vLLM
GPU Optimization and Resource Management
Gain deep insight into optimizing GPU resources for the high performance that is critical when serving an 8B-parameter model.
End-to-End Integration & Monitoring
Integrate AI endpoints into Kubernetes, monitor real-time performance, and ensure high availability.
Use Standard DevOps and Cloud-Native Tools with GenAI
Acquire hands-on experience with Terraform, Kubernetes, Karpenter, and Grafana to build production-grade GenAI platforms.
Understand the Benefits of Self-Hosting Your GenAI Models
Understand how self-hosting GenAI models can reduce cost while giving you control over data security, developer freedom, and performance.
Why this topic matters
This course meets the need for scalable, enterprise-grade AI. As generative AI evolves, organizations require models that are powerful, resource-efficient, and resilient. Master AWS EKS, Ray, and vLLM to build applications that deliver real-time insights, drive innovation, and create transformative business impact.
You'll learn from
Aymen Segni
Staff Engineer, SRE, DevOps and Platform Engineering
I’m a Cloud & DevOps Leader with deep expertise in platform engineering, cloud-native architectures, MLOps platforms, and DevOps best practices. Over the years, I’ve helped companies, from startups to enterprises, scale their infrastructure, improve reliability, and optimize performance.