## Summary

In this video, I cover capacity planning for vector embeddings in SQL Server databases. I start by comparing the size of the post table from the 2010 Stack Overflow database to its corresponding embeddings table, which shows just how much space embeddings can consume: eight and a half gigs for the embeddings, compared to six and a quarter gigs for the original data. That comparison sets the stage for a look at vector sizes, which depend on the number of dimensions in your chosen model. I also discuss different models and their embedding sizes, touching on both theoretical maximums and practical considerations. Additionally, I explain how vector indexes work differently from traditional B-tree indexes, and why columnstore indexes matter for optimizing space usage. Finally, I share some practical tips for managing database space when working with large embeddings, including the use of clustered columnstore indexes to achieve significant savings.
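
To make the sizing argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes only what the summary states: embeddings are stored as float32 values, so the payload grows with the number of dimensions. The function names, the dimension counts, the row count, and the optional `header_bytes` parameter are all hypothetical and chosen for illustration; they are not taken from the video, and the estimate ignores row overhead, page overhead, and compression.

```python
import struct

# Size of one float32 value, as the summary describes the storage format.
FLOAT32_BYTES = struct.calcsize("<f")


def vector_bytes(dimensions: int, header_bytes: int = 0) -> int:
    """Approximate bytes for one embedding stored as float32 values.

    header_bytes is a hypothetical knob for any per-vector overhead;
    it defaults to 0 because the source does not state one.
    """
    if dimensions <= 0:
        raise ValueError("dimensions must be positive")
    return dimensions * FLOAT32_BYTES + header_bytes


def table_gib(row_count: int, dimensions: int, header_bytes: int = 0) -> float:
    """Approximate raw embedding payload for a whole table, in GiB."""
    return row_count * vector_bytes(dimensions, header_bytes) / 1024**3


if __name__ == "__main__":
    # Hypothetical dimension counts and row count, for illustration only.
    for dims in (384, 1024, 1536):
        per_vector = vector_bytes(dims)
        per_million = table_gib(1_000_000, dims)
        print(f"{dims:>5} dims: {per_vector:>6} bytes/vector, "
              f"{per_million:.2f} GiB per million rows")
```

Plugging in your own row count and your model's dimension count gives a floor for the embeddings column before any index is built on top of it.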

## Topics

`SQL Server`, `Capacity Planning`, `Vector Embeddings`, `Post Table`, `Stack Overflow`, `Database Size`, `Storage Vendors`, `Model Selection`, `Float32 Data`, `Columnstore Indexes`, `Vector Indexes`, `Graph Structure`, `Compression Techniques`, `Row and Page Compression`, `AI Ready Course`, `Coupon Code AI Ready`

## Chapters

- **00:00:00** - Introduction
- **00:00:32** - Embedding Size Comparison
- **00:01:04** - Post Table vs Post Embeddings Table
- **00:01:53** - Projection and Estimation
- **00:02:17** - Embeddings Column Details
- **00:02:34** - Vector Storage Format
- **00:03:02** - Vector Size Dependence
- **00:03:31** - Model-Specific Sizes
- **00:04:29** - Vector Index Overview
- **00:05:04** - Graph Structure of Vector Index
- **00:05:10** - Edge Table Size
- **00:05:29** - Controlling Embedding Sizes
- **00:06:00** - Columnstore Indexing
- **00:06:37** - Interoperability Issues
- **00:06:42** - Currently Available Features
- **00:07:03** - Capacity Planning Tips
- **00:07:14** - Clustered Columnstore Index
- **00:07:41** - Course Promotion

## 📚 Training & Courses

- Get AI-Ready With Erik: https://training.erikdarling.com/get-ai-ready-with-erik?coupon=AIREADY
- SQL Server Performance Engineering Course: https://training.erikdarling.com/sql-server-performance-engineering?coupon=ENGINEERING
- Learn T-SQL with Erik: https://training.erikdarling.com/learn-t-sql-with-erik?coupon=ADVANCEDTSQL
- Everything Bundle: https://training.erikdarling.com/?coupon=SPRINGCLEANING

## 🛠️ Consulting & Services

- Need SQL Server performance help? https://training.erikdarling.com/sqlconsulting

## 💬 Connect

- Ask questions at Office Hours: https://erikdarling.com/officehours/
- Become a channel member: https://www.youtube.com/@ErikDarlingData/join