
RAG Pipeline: 7 Iterations Explained!
Cyril Imhof
In this video, we continue the RAG Foundations series by going hands-on with Vector Search in Milvus. We’ll create a collection, insert embeddings using a free Hugging Face model, explore the Milvus dashboard, and perform both scalar and vector similarity queries in real time.

This builds directly on Part 1, where we set up Milvus locally. If you haven’t watched that, start here:
▶ RAG Foundations #1 – Install & Run Milvus (Vector Database for LLMs, Free & Local)
https://youtu.be/EUnR4JrS0gc

What You’ll Learn
* What vector search is and why LLMs rely on it
* How to generate embeddings using a free, open model
* How to store and index vectors inside Milvus
* How similarity search works under the hood
* How vector search forms the foundation of Retrieval-Augmented Generation (RAG)

Why This Matters
RAG is how we enable LLMs to use external knowledge without retraining. Vector databases like Milvus make retrieval fast, scalable, and semantically meaningful, which is essential for real-world applications.

Code & Resources
- Course: https://www.youtube.com/playlist?list=PLCiTDJays9rXM9EDyIZAQw9bqn5NUGyom
- Code: https://github.com/Aditya-Singh-SSJ2/RAG-Foundations
- [Explore] Applied AI: Hugging Face Transformers Tutorial | Sentiment Analysis with BERT & Pipelines: https://youtu.be/KGUmhQMUBgo

Chapters:
0:00 Intro + What We’re Building
0:45 Conda Environment Setup
2:55 Notebook & Dataset Overview
4:10 Create Database + Collection in Milvus
11:55 Insert Embeddings & Query Data (Scalar + Vector Search)
18:55 Cleanup + Reset
21:30 How This Powers RAG + Next Episode

Next in the Series
In Part 3, we’ll connect Milvus to Azure OpenAI and build a working RAG pipeline end-to-end.

Call to Action
If you found this helpful, drop a comment and tell me which RAG component you want explained next. Your feedback drives what comes next in this series.
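
Quick Sketch: Similarity Search
For the "how similarity search works under the hood" topic, here is a minimal pure-Python sketch of the core idea: score every stored vector against the query and keep the best matches. This is not the notebook code from the video and it does not call Milvus. The cosine metric, the function names `cosine_similarity` and `search`, and the toy 3-dimensional vectors are all assumptions made for illustration. A real embedding model produces vectors with hundreds of dimensions, and a vector database such as Milvus would normally use an index instead of scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, vectors, top_k=2):
    """Brute-force search: score every stored vector, return the top_k best."""
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in vectors.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy "embeddings" (hypothetical values, not produced by a real model).
vectors = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_cars": [0.0, 0.2, 0.9],
}

# A query vector that points in roughly the same direction as the pet documents.
results = search([1.0, 0.2, 0.0], vectors, top_k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The two pet documents rank above the car document because their vectors point in nearly the same direction as the query. In a RAG pipeline, the text attached to those top matches is what gets passed to the LLM as context.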