In this series, Jules Damji provides a technical walkthrough of the AI agent development lifecycle using MLflow. We move beyond basic prompting to cover the rigorous engineering required for production-grade agents, including instrumentation for observability, automated evaluation patterns such as LLM-as-a-Judge, and systematic prompt optimization. These tutorials provide a roadmap from initial environment configuration to the deployment of a fully instrumented Retrieval-Augmented Generation (RAG) system. Follow the series chronologically, or jump to specific modules to address architectural and knowledge gaps.

Tutorial Roadmap

- Modules 1 & 2: Infrastructure & Tracking – environment setup, credential management, and hierarchical run comparison (parent/child runs).
- Modules 3 & 4: AI Observability & Tracing – implementing auto-tracing and manual decorators to monitor tool calls and latencies, and using MLflow Assistant for debugging and root cause analysis.
- Module 5: Prompt Engineering – versioning prompts via the Prompt Registry and optimizing them with the GEPA algorithm.
- Modules 6 & 7: Agent Evaluation & Integrated Frameworks – scaling evaluation with the "LLM-as-a-Judge" pattern and integrating with LangChain/LlamaIndex.
- Final Project: building and deploying an end-to-end instrumented RAG application.