In this video, I build a complete local AI observability pipeline using Ollama, MCP-style tool calling, Docker, and Prometheus.

What's covered:
- Run a lightweight LLM locally (Qwen 2.5 3B)
- Enable tool/function calling
- Connect the model to real application metrics
- Let the LLM fetch P95 latency dynamically
- Generate actionable production insights
- Visualize metrics via Prometheus
- Keep everything fully offline

This is a real-world pattern for:
- AI-assisted SRE workflows
- Autonomous incident triage
- Intelligent production monitoring
- Tool-augmented local agents

Stack used:
- Docker Compose
- Local LLM via Ollama
- MCP-style function calling
- Prometheus metrics endpoint
- Python orchestration loop (see the sketches below)

If you're building AI systems for DevOps, observability, or production environments, this architecture matters.

#ai #architecture #softwaredevelopment #aidevelopment
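For reference, here's a minimal sketch of the Python orchestration loop: a local Qwen 2.5 3B model (via the ollama client) is given one tool that queries Prometheus for P95 latency, calls it, and turns the result into an insight. The Prometheus address and the http_request_duration_seconds_bucket metric name are placeholder assumptions; adapt them to your setup.

```python
# Minimal sketch: local Ollama model + one tool that fetches P95 latency
# from Prometheus. Assumes Prometheus at localhost:9090 and a histogram
# metric named http_request_duration_seconds_bucket (both placeholders).
import json

import ollama
import requests

PROM_URL = "http://localhost:9090/api/v1/query"  # assumed Prometheus endpoint


def get_p95_latency() -> str:
    """Query Prometheus for P95 request latency over the last 5 minutes."""
    promql = (
        "histogram_quantile(0.95, "
        "sum(rate(http_request_duration_seconds_bucket[5m])) by (le))"
    )
    resp = requests.get(PROM_URL, params={"query": promql}, timeout=10)
    resp.raise_for_status()
    return json.dumps(resp.json()["data"]["result"])


# OpenAI-style function schema, which Ollama's tool calling accepts.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_p95_latency",
        "description": "Fetch the current P95 HTTP request latency from Prometheus.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{
    "role": "user",
    "content": "Check the current P95 latency and flag anything actionable.",
}]

# First pass: the model decides whether to call the tool.
response = ollama.chat(model="qwen2.5:3b", messages=messages, tools=TOOLS)
messages.append(response["message"])

# Execute any requested tool calls and feed the results back as tool messages.
for call in response["message"].get("tool_calls") or []:
    if call["function"]["name"] == "get_p95_latency":
        messages.append({"role": "tool", "content": get_p95_latency()})

# Second pass: the model turns the raw metric into a production insight.
final = ollama.chat(model="qwen2.5:3b", messages=messages)
print(final["message"]["content"])
```

Everything runs locally: the only network calls are to Ollama on your machine and to your own Prometheus instance.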
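And the application side, sketched here with prometheus_client: expose a latency histogram for Prometheus to scrape. The metric name matches the PromQL above, and the port and simulated workload are illustrative assumptions.

```python
# Minimal sketch of the instrumented app: expose a latency histogram on
# /metrics for Prometheus to scrape. Port 8000 and the simulated request
# loop are placeholders.
import random
import time

from prometheus_client import Histogram, start_http_server

REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request latency in seconds",
)

if __name__ == "__main__":
    start_http_server(8000)  # serves /metrics on port 8000
    while True:
        with REQUEST_LATENCY.time():  # observe each simulated request
            time.sleep(random.uniform(0.05, 0.5))
```

Point a Prometheus scrape job at this endpoint (and wire both containers together with Docker Compose) and the loop above has real data to reason over.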