Join Marc, Co-Founder and CEO of Langfuse, and Aris, Specialist Solutions Architect at AWS, as they cover the continuous evaluation, monitoring, and operation of AI agents. The session focuses on Amazon Bedrock AgentCore and its integration with Langfuse, with example code and technical detail throughout. They trace the evolution from DevOps to AgentOps, walk through practical implementation guidance, and demonstrate infrastructure setup, tooling, and running evaluations.

Links
Example repository used throughout this video: https://github.com/aristsakpinis93/agentcore-langfuse-continous-eval-loop
Langfuse AgentCore integration doc: https://langfuse.com/integrations/frameworks/amazon-agentcore
Slides shown in video: https://static.langfuse.com/events/2025_10_continuous_agent_evaluation_with_amazon_bedrock_agentcore_and_langfuse.pdf

Chapters
00:00 Introduction and Session Overview
01:33 Evolution from DevOps to AgentOps
02:29 Amazon's Journey to Microservices
03:40 Challenges with Mono-Agents
06:36 Implementing AgentOps
20:24 Introduction to Langfuse
23:29 Langfuse Architecture and Integration
28:39 Connecting AgentCore and Langfuse
33:29 Common Issues in Agent Systems
35:40 Evaluation Methods for Agents
45:31 Setting Up Test Environments
47:44 Sequential Process Overview
48:02 Development and Experimentation Phase
48:29 QA and Testing Phase
48:39 Production Operations
50:11 Multi-Environment Setup
52:58 Cloud-Based Implementation
53:06 Experimentation and HPO Phase
54:12 QA Testing with CI/CD
55:18 Production Operations and Monitoring
55:38 Langfuse Integration
56:53 Experimentation and Evaluation
58:07 Using Langfuse UI
58:11 Configuring Evaluators
58:30 Running Evaluations
01:01:23 Hyperparameter Optimization
01:04:43 CI/CD Pipeline Implementation
01:06:38 Local Evaluations in CI/CD
01:10:45 Deploying Agents to Production
01:12:55 Production Operations and Monitoring
01:18:47 Human Annotation and Feedback Loop
01:20:38 Conclusion and Resources