Run Apache Spark jobs on Kubernetes with ZERO permanent infrastructure! In this comprehensive tutorial, you'll learn how to deploy a production-ready Spark setup that creates pods ONLY when jobs run and automatically cleans up when they finish. Say goodbye to costly always-on Spark clusters.

🎯 WHAT YOU'LL LEARN
✅ Deploy Spark on Kubernetes WITHOUT permanent master/worker nodes
✅ Build custom Spark images with embedded PySpark jobs
✅ Submit jobs that auto-scale executors and clean up after themselves
✅ Run real-world analytics: customer segmentation, cohort analysis, revenue trends
✅ Set up the Spark History Server for job monitoring
✅ Implement proper RBAC security for production
✅ Debug and monitor jobs using kubectl and the Spark UI

TIMESTAMPS
0:00 Introduction
1:37 System Architecture
5:48 Setting up K8s
8:10 Setting up the Project
10:00 K8s Namespaces
11:45 K8s Service Accounts and RBAC
17:27 Creating Spark Jobs for K8s
26:40 K8s Spark History Server
34:24 Spark Control Dashboard
42:42 K8s API Layer
49:52 Spark Dashboard, Job Submissions and Review
56:52 Outro

📚 RESOURCES & LINKS
FULL SOURCE CODE - https://buymeacoffee.com/yusuf.ganiyu/source-code-spark-k8s
• Apache Spark K8s Docs: https://spark.apache.org/docs/latest/running-on-kubernetes.html
• Kubernetes Documentation: https://kubernetes.io/docs/
• PySpark API Reference: https://spark.apache.org/docs/latest/api/python/

Like this video? Support us: https://www.youtube.com/@CodeWithYu/join

#ApacheSpark #Kubernetes #DataEngineering #BigData #PySpark #DevOps #CloudNative #K8s #DataPipelines #ETL #Tutorial
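As a taste of the workflow covered in the video, here is a minimal sketch of the RBAC setup and an ephemeral job submission. This is a hedged outline, not the exact commands from the video: the namespace (`spark-jobs`), service account name (`spark`), container image tag, API server address, and job path are placeholders you would replace with your own values.

```shell
# Create a dedicated namespace and a service account the Spark driver will use
# (names are illustrative placeholders)
kubectl create namespace spark-jobs
kubectl create serviceaccount spark -n spark-jobs

# Grant the driver permission to create/delete executor pods in its namespace
kubectl create rolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=spark-jobs:spark \
  -n spark-jobs

# Submit a PySpark job in cluster mode; Spark spins up driver and executor
# pods on demand and they are cleaned up when the job finishes
spark-submit \
  --master k8s://https://<your-api-server>:6443 \
  --deploy-mode cluster \
  --name customer-segmentation \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.namespace=spark-jobs \
  --conf spark.kubernetes.container.image=<your-registry>/spark-job:latest \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/jobs/segmentation.py
```

These commands target a live cluster, so treat them as a deployment/config sketch; the official Spark-on-Kubernetes docs linked above cover the full set of `spark.kubernetes.*` options.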