
RAG Pipeline: 7 Iterations Explained!
Cyril Imhof
Discord: https://discord.gg/Svxajdh9

In this video, we explore how to build an advanced AI agent that can query structured data, such as spreadsheets and databases, by combining the strengths of relational databases and vector stores. Traditional vector search is great for semantic matching, but it often struggles with structured queries like summing orders or filtering by specific fields. To solve this, we introduce a hybrid approach that uses natural language queries (NLQ) on structured data stored in Supabase.

You’ll learn how to ingest data from multiple sources, including CSVs, Excel files, and Google Sheets, into a unified relational model using a JSONB column. This flexible design removes the need for a separate table for every data schema and enables fast, precise SQL queries that your AI agent can generate dynamically. The agent decides when to query vector embeddings for unstructured insights and when to execute SQL for structured analytics.

Through a live demo, we show the agent in action: analyzing orders by country, calculating totals, and identifying top-performing ad platforms. We also cover the full data ingestion pipeline, demonstrating how to handle new, updated, and deleted files from Google Drive, extract and synchronize schema and data, and keep Supabase up to date.

Finally, we dive into how to configure the AI agent with the right tools, prompts, and schema-based constraints to ensure reliable, context-aware responses. You’ll also learn how to define views for complex data setups and when to use vector search vs. relational queries for the best results.
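
To give a rough idea of the single-table JSONB design mentioned above, here is a minimal sketch in Python. The table name `dataset_rows`, its columns, the helper `csv_to_records`, and the sample order data are all hypothetical and used only for illustration; the video's actual schema and workflow may differ. The SQL strings use standard PostgreSQL JSONB operators and are shown as text only; nothing here connects to Supabase.

```python
import csv
import io
import json

# Hypothetical table layout (assumption): one table for every dataset,
# with each spreadsheet row stored as a JSONB document.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS dataset_rows (
    id         bigserial PRIMARY KEY,
    dataset_id text  NOT NULL,  -- which source file the row came from
    row_data   jsonb NOT NULL   -- the row itself, keyed by column header
);
"""

# The kind of query an agent might generate for
# "total order amount by country" (PostgreSQL syntax, %s is a bind parameter).
TOTAL_BY_COUNTRY_SQL = """
SELECT row_data->>'country'                AS country,
       SUM((row_data->>'amount')::numeric) AS total
FROM dataset_rows
WHERE dataset_id = %s
GROUP BY 1
ORDER BY total DESC;
"""


def csv_to_records(dataset_id, csv_text):
    """Turn CSV text into (dataset_id, JSON string) pairs ready for insertion."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(dataset_id, json.dumps(row)) for row in reader]


# Made-up sample data, standing in for an uploaded orders file.
sample = "order_id,country,amount\n1,CH,120.50\n2,DE,80.00\n3,CH,30.00\n"
records = csv_to_records("orders.csv", sample)
```

Because every row is stored as a JSON document, a new file with different columns would need no new table, only new `dataset_id` values. The trade-off is that values arrive as text and must be cast inside the query, as the `::numeric` cast above does.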