Building reliable LLM apps is hard. You fix a prompt for one case and break it for another.

Today we're launching a completely redesigned evaluation workflow to help you iterate faster and catch regressions.

What's new:
→ Redesigned evaluation dashboard with clear metrics overview
→ Detailed test case view with full traces for debugging
→ Side-by-side comparison to spot regressions
→ Flexible LLM-as-a-judge with custom schemas

Teams in beta are running 2x more evaluations and shipping faster.

🔗 Try it now: https://cloud.agenta.ai

This is Day 1 of Agenta Launch Week. Subscribe to see what's coming next.

--

About Agenta:
Agenta is an open-source LLMOps platform for building production-ready LLM applications. We help teams evaluate, version, and deploy prompts and workflows with confidence.

⭐ Star us on GitHub: https://github.com/agenta-ai/agenta

#LLMOps #AI #MachineLearning #Evaluations