From WhatsApp Chat to AI Search Engine -- with stunning vector visualizations!

In this hands-on tutorial, I transform a real WhatsApp group chat with 56,000+ messages into a fully functional AI-powered search engine using a complete RAG (Retrieval-Augmented Generation) pipeline. You'll see how vector embeddings look in 2D and 3D space with interactive t-SNE visualizations -- watch semantic clusters form right before your eyes!

VECTOR VISUALIZATION HIGHLIGHTS:
-- See 8,665 text chunks plotted as points in 2D and 3D vector space
-- Compare how 3 different embedding models (MPNet, MiniLM, BGE) cluster the same data differently
-- Interactive Plotly plots -- rotate, zoom, hover to inspect individual conversations
-- Color-coded by time (Viridis scale) -- watch topic evolution over 3 years
-- Discover natural semantic clusters where related conversations group together

What You'll Learn:
- Parse and clean WhatsApp chat exports with Python
- Two chunking strategies compared: conversation-aware vs sliding window
- Generate embeddings with 3 Sentence Transformer models:
    all-mpnet-base-v2 (768-dim, highest quality)
    all-MiniLM-L6-v2 (384-dim, fastest)
    bge-small-en-v1.5 (384-dim, retrieval-optimized)
- Store and query vectors in ChromaDB -- a persistent vector database
- VISUALIZE vector embeddings with t-SNE in interactive 2D and 3D plots
- Understand WHY different models create different cluster patterns
- Run local LLM summarization with Ollama (Llama 3.1) -- 100% private
- Build a complete RAG pipeline: Retrieve, Augment, Generate

Tech Stack:
-- Python 3 + Jupyter Notebook
-- Sentence Transformers (Hugging Face)
-- ChromaDB (open-source vector database)
-- Ollama (local LLM -- Llama 3.1)
-- Plotly (interactive 2D and 3D visualizations)
-- scikit-learn (t-SNE dimensionality reduction)
-- Matplotlib
-- OpenAI Python client (Ollama-compatible)

Useful Resources:
ChromaDB: https://docs.trychroma.com/
Sentence Transformers: https://www.sbert.net/
Ollama: https://ollama.ai/
Plotly Python: https://plotly.com/python/
t-SNE Explained: https://jmlr.org/papers/v9/vandermaaten08a.html

Key Takeaways:
1. Conversation-aware chunking beats naive sliding windows for chat data
2. Different embedding models cluster data differently -- the visualizations prove it
3. t-SNE lets you SEE your vector space -- clusters = semantically related topics
4. 768-dim embeddings (MPNet) form more distinct clusters than 384-dim models
5. Local LLMs via Ollama = private RAG with zero data leaving your machine

By the Numbers:
-- 56,251 raw messages -> 49,168 filtered -> 8,665 semantic chunks
-- 741 unique chat participants
-- 3 embedding models compared side-by-side
-- 768-dim vs 384-dim vector spaces visualized
-- 2D + 3D interactive t-SNE plots
-- 100% local execution -- no cloud APIs, no data leaks

Who is this for?
-- ML/AI engineers learning about RAG architectures
-- Developers building search systems with vector databases
-- Anyone curious about how embeddings and vector spaces actually look
-- Data scientists exploring NLP and text embeddings
-- Privacy-conscious developers who want on-device AI

If you found this useful, please LIKE, SUBSCRIBE, and SHARE! Drop a comment with what data YOU'd like to build a RAG pipeline on!

#RAG #VectorEmbeddings #ChromaDB #Visualization #SentenceTransformers #Ollama #Python #AI #Tutorial