RAG isn’t a one-and-done trick; it’s an iterative value-extraction loop. In this episode, Rohan and Nisaar break down how Retrieval-Augmented Generation really works in production: retrieval + generation, context injection, meta filtering by titles and embeddings, insertion-time filters, and semantic re-ranking. We also touch on local models (JMA), on using real-world financial slices (cash flow swings, TTM views) as a thinking tool for evaluating pipelines, and on how to design RAG systems that stay robust as your corpus scales.

What you’ll learn
- Why RAG is an iterative process (not just a single query)
- How meta filters (titles, doc types) and embedding filters work together
- Context injection patterns that boost answer quality
- Insertion-time filtering + edge numbers to keep indexes clean
- When to re-rank and how to handle “no hits” gracefully

Timestamps
00:00 Intro: What is RAG and why it matters (retrieval ➜ generation)
00:06 The iterative value-extraction loop (accuracy through repetition)
04:50 Reading results like an engineer: quarterly deltas & cash-flow swings
05:37 Local models that work (JMA) + TTM metrics as a lens for performance
07:45 From extraction to vectors: storing text & visuals for fast Q&A
09:49 Meta filtering by titles/doc types for real-time retrieval
11:10 Context injection + pipeline backup & clean response formats
13:22 Embedding-level meta filters & handling knowledge cutoff/no-hits
14:02 Insertion-time filters (titles/edge numbers) + semantic re-ranking
15:30 Wrap-up & key takeaways

Tags
RAG, Retrieval Augmented Generation, retrieval augmented generation, vector database, embeddings, semantic search, re-ranking, reranking, meta filtering, context injection, value extraction, information retrieval, knowledge base, document retrieval, prompt engineering, local LLMs, JMA model, vector store, vector indexing, chunking strategies, query understanding, pipeline design, knowledge cutoff, production AI, AI engineering, GenAI, Large Language Models, LLMs, NLP, AI podcast, AIBROS, Rohan, Nisaar

Hashtags
#RAG #GenerativeAI #LLM #AIEngineering #VectorDatabases #SemanticSearch #OpenSource #MachineLearning #AIBROS #TechPodcast