Struggling to move your RAG (Retrieval-Augmented Generation) demo into production? You're not alone. While building a basic RAG is easy, achieving high accuracy, low latency, and manageable costs at scale is the real challenge. In this deep dive, we move beyond the basics to focus on the most critical component: Retrieval. We'll provide a practical framework for thinking about RAG as a system, scoping your use case, and choosing the right retrieval architecture for your needs.

0:00 - Introduction: Why RAG Fails in Production
3:33 - Framework: How to Scope Your RAG Project
8:52 - Retrieval Method 1: BM25 (Lexical Search)
12:24 - Retrieval Method 2: Embedding Models (Semantic Search)
22:19 - Key Technique: Using Rerankers to Boost Accuracy
25:16 - Best Practice: Building a Hybrid Search Baseline
29:20 - The Next Frontier: Agentic RAG (Iterative Search)
37:10 - Key Insight: The Surprising Power of BM25 in Agentic Systems
41:18 - Conclusion & Final Recommendations

Get the:
References: https://github.com/rajshah4/LLM-Evaluation/blob/main/presentation_slides/links_RAG_Oct2025.md
Slides: https://github.com/rajshah4/LLM-Evaluation/blob/main/presentation_slides/RAG_Oct2025.pdf

━━━━━━━━━━━━━━━━━━━━━━━━━
★ Rajistics Social Media »
● Home Page: http://www.rajivshah.com
● LinkedIn: https://www.linkedin.com/in/rajistics/
● Reddit: https://www.reddit.com/r/rajistics/
━━━━━━━━━━━━━━━━━━━━━━━━━