Learn step by step how to deploy machine learning models on Kubernetes: from building a FastAPI service and packaging it with Docker to deploying it on a Kubernetes cluster and scaling it with Horizontal Pod Autoscaling (HPA).

This workshop is part of the Machine Learning Zoomcamp, a free course on machine learning engineering and MLOps. You’ll learn practical Kubernetes deployment workflows used by ML and DevOps teams in production.

What you’ll learn
✅ How Kubernetes works for ML model deployment
✅ Setting up a local Kubernetes cluster with Kind (Kubernetes in Docker)
✅ Building and serving a FastAPI app for ML inference
✅ Creating and managing Kubernetes deployments and services
✅ Packaging your model in Docker for containerized deployment
✅ Adding health checks and horizontal pod autoscaling (HPA)
✅ Best practices for scalable and reliable ML infrastructure

Whether you're a data scientist, ML engineer, or DevOps learner, this hands-on Kubernetes tutorial will teach you how to move models from notebooks to production-ready environments.
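
To give a feel for the autoscaling step, here is a minimal sketch of the replica calculation that the Kubernetes documentation describes for HPA: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The defaults mirror the HPA settings listed in the chapters (min 2, max 5, target 50% CPU). The function name, the sample CPU readings, and the simplified handling of the default 10% tolerance are assumptions made for illustration, not code from the workshop.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct=50,
                     min_replicas=2, max_replicas=5, tolerance=0.1):
    """Approximate the HPA scaling rule (hypothetical helper, not workshop code)."""
    # Ratio of observed average CPU utilization to the target utilization.
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below the minimum or above the maximum replica count.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical readings: average CPU utilization across pods, in percent.
    for replicas, cpu in [(2, 100), (2, 52), (4, 200), (4, 10)]:
        print(replicas, cpu, "->", desired_replicas(replicas, cpu))
```

With 2 replicas at a hypothetical 100% average CPU utilization, the rule asks for 4 replicas; a reading of 200% on 4 replicas is capped at the maximum of 5.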

🔗 Resources
- 💻 Code for this workshop: https://github.com/alexeygrigorev/workshops/tree/main/mlzoomcamp-k8s
- 📘 Join the free ML Zoomcamp course: https://github.com/DataTalksClub/machine-learning-zoomcamp

🧠 Tools & Technologies
- FastAPI
- Docker
- Kubernetes (K8s)
- Kind (Kubernetes in Docker)
- ONNX Runtime
- PyTorch

⏱️ Chapters
- 0:00 Intro and course context
- 5:07 Start of workshop: environment, GitHub Codespaces
- 6:00 Required tools: Docker, Kind, kubectl
- 7:12 Local cluster setup: Kind (Kubernetes in Docker)
- 7:37 Service goal: FastAPI for clothing classifier model
- 9:13 Why Kubernetes: industry standard for ML deployment
- 10:28 Dockerizing the app and local run
- 46:08 Kubernetes concepts: Pods, Deployments, Services
- 50:34 Deployment YAML: replicas, image, container port
- 54:48 Readiness and liveness probes: /health
- 56:41 Creating Kind cluster
- 58:46 Loading local image into Kind
- 59:17 Applying deployment with kubectl
- 1:01:30 Creating Service and load balancing
- 1:05:48 Port-forward for local access
- 1:08:36 Installing Metrics Server
- 1:09:55 HPA configuration: min 2, max 5, target 50% CPU
- 1:12:03 Load test initiation
- 1:13:48 Autoscaling observed: 2 to 4 replicas
- 1:24:02 Wrap-up

Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub - https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Connect with Alexey
- Twitter - https://twitter.com/Al_Grigor
- LinkedIn - https://www.linkedin.com/in/agrigorev/

Check our free online courses:
- ML Engineering course - http://mlzoomcamp.com
- Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp
- MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp
- LLM course - https://github.com/DataTalksClub/llm-zoomcamp
- Open-source LLM course - https://github.com/DataTalksClub/open-source-llm-zoomcamp
- AI Dev Tools course - https://github.com/DataTalksClub/ai-dev-tools-zoomcamp

👉🏼 Read about all our courses in one place - https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html

👋🏼 Support/inquiries
If you want to support our community, use this link - https://github.com/sponsors/alexeygrigorev
If you’re a company, reach us at alexey@datatalks.club

#MachineLearning #Kubernetes #MLOps #MLZoomcamp #FastAPI #Docker #ONNX #PyTorch #MachineLearningEngineering