Submission for the AI Partner Catalyst Hackathon (Datadog Track)

The Problem: The fear of GenAI isn't that it won't work... it's that it will work, and you won't know when it fails.

The Solution: Autonomic AI. This project turns the "Black Box" of GenAI into a transparent "Glass Box" by creating an event-driven, self-healing agent mesh.

In this video, we demonstrate how we use a swarm of backend agents (Auditor, Refiner, Evaluator) to automatically detect business logic failures in a user-facing chatbot and, crucially, deploy a fix without human intervention, all while maintaining complete observability through Datadog.

🛠️ How It Works (The Self-Healing Loop)
Gateway Agent: Interacts with users (e.g., Car Salesman Persona).
Auditor (The Judge): Asynchronous scoring of conversations via Pub/Sub.
Refiner (The Fixer): Rewrites system prompts using Gemini 2.5 Flash if mistakes are found.
Evaluator (The Tester): Sandboxes the new prompt to ensure the fix works.
Datadog Ops Center: Visualizes the entire "thought process," costs, and optimization rates.

📊 Datadog Integration
We built a custom "Autonomic AI Ops Center" Dashboard to track:
Optimization Rate: How often the system self-heals.
Budget Breach: Alerts for cost spikes (more than $0.10/msg).
Log Streams: Real-time debugging filtered by service:auditor & service:refiner.
Vital Signs: User-facing vs. Back-end latency tracking.

🏗️ Tech Stack
Observability: Datadog (Logs, Dashboards, APM)
LLM: Google Vertex AI (Gemini 2.5 Flash)
Infrastructure: Google Cloud Platform (Pub/Sub, Cloud Run, Firestore)
Backend: Python FastAPI

#Datadog #VertexAI #GoogleCloud #Hackathon #GenAI #LLMOps #AIAgents #SelfHealingAI #DevOps #Observability
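
📎 Sketch: The Self-Healing Loop in Code
For readers who want to see the control flow, here is a minimal, synchronous Python sketch of the Auditor → Refiner → Evaluator loop described above. It is not the project's actual code. The real system is event-driven over Pub/Sub and calls Gemini 2.5 Flash; here the three agents are plain callables, and every name (`Conversation`, `LoopResult`, `self_heal`, the `stub_*` functions, `PASS_THRESHOLD`) is a hypothetical stand-in. Only the $0.10/msg budget figure comes from the description.

```python
from dataclasses import dataclass
from typing import Callable

PASS_THRESHOLD = 0.7        # assumed audit cutoff, not from the project
BUDGET_PER_MSG_USD = 0.10   # Budget Breach limit stated in the description


@dataclass
class Conversation:
    system_prompt: str   # prompt the Gateway Agent is currently running
    transcript: str      # what was said to the user
    cost_usd: float      # LLM cost attributed to this message


@dataclass
class LoopResult:
    prompt: str          # prompt to run going forward
    healed: bool         # True if a refined prompt was deployed
    budget_breach: bool  # True if cost exceeded the per-message budget


def self_heal(
    conv: Conversation,
    audit: Callable[[Conversation], float],
    refine: Callable[[Conversation], str],
    evaluate: Callable[[str], bool],
) -> LoopResult:
    """Run one Auditor -> Refiner -> Evaluator pass over a conversation."""
    breach = conv.cost_usd > BUDGET_PER_MSG_USD
    if audit(conv) >= PASS_THRESHOLD:
        # Auditor is satisfied: keep the current prompt.
        return LoopResult(conv.system_prompt, False, breach)
    candidate = refine(conv)
    if evaluate(candidate):
        # Evaluator approved the fix: deploy the new prompt.
        return LoopResult(candidate, True, breach)
    # Fix did not pass the sandbox: fall back to the old prompt.
    return LoopResult(conv.system_prompt, False, breach)


# Hypothetical stubs standing in for the LLM-backed agents.
RULE = "Never offer vehicles for free."

def stub_audit(conv: Conversation) -> float:
    return 0.0 if "free car" in conv.transcript.lower() else 1.0

def stub_refine(conv: Conversation) -> str:
    return f"{conv.system_prompt} {RULE}"

def stub_evaluate(candidate_prompt: str) -> bool:
    return RULE in candidate_prompt


bad = Conversation("You are a car salesman.", "Bot: Sure, take a free car!", 0.12)
result = self_heal(bad, stub_audit, stub_refine, stub_evaluate)
print(result)
```

In a deployment like the one described, each step would presumably run as its own service reacting to Pub/Sub messages, and the `healed` and `budget_breach` flags would be emitted as logs or metrics for the Datadog dashboard rather than returned from a function.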