In this video, we build an AI-powered Kubernetes agent that detects and fixes high latency automatically using Prometheus, Grafana, and Llama3. Instead of relying on CPU-based scaling, this agent focuses on real user experience by analyzing p95 latency metrics.

💡 What you’ll learn:
• How to monitor latency using Prometheus
• How to visualize metrics in Grafana
• How to build an AI agent inside Kubernetes
• How to use Llama3 (via Ollama) for intelligent decision making
• How to automatically scale deployments based on latency

🚀 Demo includes:
• Simulating real-world latency using load generation
• Observing latency spike in Grafana
• AI agent detecting high latency
• Automatic scaling of Kubernetes pods
• Real-time recovery and stabilization

🛠 Tech Stack:
• Kubernetes (Minikube)
• Prometheus + Grafana
• Python (Kubernetes client)
• Ollama (Local LLM)
• Llama3

🎯 Key Insight: CPU does not always reflect user experience, but latency does. AI can make smarter decisions by focusing on real-world signals.

👉 GitHub Repo: https://github.com/happy-jitesh/k8s-latency-agent

👍 Like, Share & Subscribe for more Agentic AI for DevOps content!

#Kubernetes #DevOps #Prometheus #Grafana #AgenticAI #Llama3 #AIOps
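🧪 Example sketch: the detect-and-scale loop described above could look roughly like the Python below. This is a minimal illustration, not the code from the repo. The service URLs, the metric name `http_request_duration_seconds_bucket`, the 0.5 s threshold, the replica bounds, and every function name are assumptions made for the example. It assumes Prometheus is reachable over its HTTP query API, Ollama is serving `llama3` on its `/api/generate` endpoint, and the agent runs in-cluster with permission to patch deployment scale.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical in-cluster service addresses (assumptions, not from the repo).
PROMETHEUS_URL = "http://prometheus:9090"
OLLAMA_URL = "http://ollama:11434"

# p95 latency over the last minute; the metric name is an assumption and
# depends on how the application exposes its request-duration histogram.
P95_QUERY = (
    "histogram_quantile(0.95, "
    "sum(rate(http_request_duration_seconds_bucket[1m])) by (le))"
)


def parse_p95(payload):
    """Return the first sample of a Prometheus instant-query response, or None."""
    results = payload.get("data", {}).get("result", [])
    if payload.get("status") != "success" or not results:
        return None
    value = float(results[0]["value"][1])  # samples arrive as [timestamp, "value"]
    return None if value != value else value  # NaN (no traffic) becomes None


def fetch_p95_seconds():
    """Query Prometheus for the current p95 latency in seconds."""
    query = urllib.parse.urlencode({"query": P95_QUERY})
    with urllib.request.urlopen(f"{PROMETHEUS_URL}/api/v1/query?{query}", timeout=5) as resp:
        return parse_p95(json.load(resp))


def ask_llm(prompt):
    """Send a prompt to Llama3 through Ollama and return the raw text reply."""
    body = json.dumps({"model": "llama3", "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["response"]


def choose_replicas(current, p95, llm_reply=None, threshold=0.5, max_replicas=10):
    """Keep the current count while latency is healthy; otherwise scale up.

    The LLM reply is used only if it parses as an integer. Anything else
    falls back to adding one replica. The result never exceeds max_replicas
    and never drops below the current count.
    """
    if p95 is None or p95 <= threshold:
        return current
    try:
        wanted = int(llm_reply.strip())
    except (AttributeError, ValueError):
        wanted = current + 1
    return max(current, min(max_replicas, wanted))


def scale_deployment(name, namespace, replicas):
    """Patch the deployment's scale subresource (needs `pip install kubernetes`)."""
    from kubernetes import client, config  # imported lazily; third-party package

    config.load_incluster_config()
    client.AppsV1Api().patch_namespaced_deployment_scale(
        name, namespace, {"spec": {"replicas": replicas}}
    )
```

A polling loop would call `fetch_p95_seconds()` every few seconds, pass the value (and optionally the reply from `ask_llm`) to `choose_replicas`, and call `scale_deployment` only when the result differs from the current count. Clamping the model's suggestion in plain code keeps a malformed or extreme reply from scaling the deployment outside safe bounds.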