In this step-by-step tutorial, you'll discover how to scale your AI agent evaluation workflows with NVIDIA NeMo Evaluator LLM-as-a-Judge. This video walks you through how to:

✅ Install the NVIDIA NIM Operator and set up Prometheus
✅ Set up LLM-as-a-Judge with NeMo Evaluator using Docker Compose
✅ Configure an evaluation config with the Llama Nemotron Nano 1.1 4B model
✅ Scale the evaluation to multiple GPUs

The NVIDIA NeMo Evaluator microservice simplifies the end-to-end evaluation of generative AI applications, including LLM evaluation, retrieval-augmented generation (RAG) evaluation, and AI agent evaluation, with an easy-to-use API. It provides LLM-as-a-judge capabilities, along with a comprehensive suite of LLM benchmarks and metrics for a wide range of custom tasks and domains, including reasoning, coding, and instruction following.

Get started:

✅ Follow the official documentation to scale using the NIM Operator: https://docs.nvidia.com/nim-operator/latest/service.html#configuring-horizontal-pod-autoscaling
✅ Download the included Jupyter notebook to replicate the evaluation workflows in your own environment: https://github.com/NVIDIA/GenerativeAIExamples/blob/main/nemo/Evaluator/LLMAsAJudge/LLM%20As%20a%20Judge.ipynb
✅ Learn more about NeMo Evaluator: https://nvda.ws/414aHXJ

00:00 - Introduction
01:00 - Building a Manifest for a NIM
01:52 - Run Command Get Pods
03:00 - Standard NeMo Evaluator Notebook
04:03 - Creating a Job
04:22 - Changing Configurations
05:00 - Apply YAML File to NIM Operator Instance
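The chapters on building a manifest for a NIM and scaling to multiple GPUs can be sketched as a single NIM Operator custom resource. The following is a minimal, illustrative example only: the image repository and tag, namespace, GPU counts, and the autoscaling metric are placeholder assumptions, not the exact values from the video; see the linked horizontal-pod-autoscaling documentation for the authoritative schema.

```yaml
# Hypothetical NIMService manifest with horizontal pod autoscaling enabled.
# All concrete values (image, tag, metric, replica bounds) are placeholders.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: llm-judge-nim
  namespace: nim-service
spec:
  image:
    repository: nvcr.io/nim/meta/llama-3.1-8b-instruct  # placeholder NIM image
    tag: "1.3"
    pullPolicy: IfNotPresent
  replicas: 1
  resources:
    limits:
      nvidia.com/gpu: 1        # one GPU per pod; scaling adds pods, not GPUs per pod
  expose:
    service:
      type: ClusterIP
      port: 8000
  scale:
    enabled: true              # hand replica management to an HPA
    hpa:
      minReplicas: 1
      maxReplicas: 4           # upper bound on GPU pods
      metrics:
        - type: Pods
          pods:
            metric:
              name: gpu_cache_usage_perc   # assumed Prometheus-backed metric
            target:
              type: AverageValue
              averageValue: "750m"
```

With the manifest saved as `nimservice.yaml`, the "Apply YAML File" and "Run Command Get Pods" chapters correspond to `kubectl apply -f nimservice.yaml` followed by `kubectl get pods -n nim-service` to watch replicas come up.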
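For the "Creating a Job" and "Changing Configurations" chapters, the shape of an LLM-as-a-judge evaluation job can be sketched in Python. This is a hypothetical payload: the field names, task structure, and model identifiers below are assumptions about the general shape of such a config, not the NeMo Evaluator microservice's exact schema; the linked Jupyter notebook shows the real request format.

```python
import json

# Assumed local address of the Evaluator service started via Docker Compose.
EVALUATOR_URL = "http://localhost:7331"


def build_judge_job(judge_model: str, target_model: str) -> dict:
    """Assemble an illustrative evaluation-job payload in which one LLM
    (the judge) scores the outputs of another (the target). Field names
    are placeholders, not the official NeMo Evaluator schema."""
    return {
        "config": {
            "type": "custom",
            "tasks": {
                "qa-judged": {
                    "metrics": {
                        "accuracy": {
                            "type": "llm-judge",            # judge-based metric (name assumed)
                            "params": {"model": judge_model},
                        }
                    }
                }
            },
        },
        "target": {"type": "model", "model": target_model},
    }


payload = build_judge_job(
    judge_model="nvidia/llama-3.1-nemotron-nano-4b-v1.1",  # judge model name assumed
    target_model="meta/llama-3.1-8b-instruct",             # placeholder model under test
)
print(json.dumps(payload, indent=2))
# A real run would POST this payload to the Evaluator's jobs endpoint and
# poll the returned job ID until the evaluation completes.
```

Swapping the judge (the "Changing Configurations" step) then amounts to editing one string rather than rebuilding the whole job definition.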