Implementing Vector Search in Your Application with SQL Server 2025 | Data Exposed: MVP Edition | DailyDevLists

Loading video player...

Full Transcript

2,692 words • EN

Learn about some real use cases for

vector search in SQL Server 2025 and see

a cool demo this week on Data Exposed

MVP edition.

Hi, I'm Anna Hoffman and welcome to this

episode of Data Exposed. We're going to

get right into it to this MVP edition.

Today I have Joey uh joining me. Uh

Joey, can you tell us a little bit about

what you do? Yeah, thanks Anna. I'm Joey

Denton. I'm a principal consultant and

principal cloud architect and I work

with all sorts of cloud and data

solutions. I've been working a lot with

SQL Server 2025. Uh you may have read

some of the stuff I've written at

redmondmag.com

uh talking about some of these. I

actually wrote a column a couple months

ago on VEC research which we're going to

be talking about today.

>> Awesome. Cool. Well, excited to learn

more and we'll definitely put a link to

that uh in the description if people

want to go read it. uh for SQL Server

2025. Obviously there's uh support for

vector but before we get into all that

like I just want to get your take like

what are people actually using vector

for and you know what are you seeing

with your customers or in the community?

>> Yeah, there's so much buzz good and bad

in a in IT world about AI and developers

too. uh there are a lot of use cases

that like don't necessarily make the

most sense or might might be weird or

impractical, but man, vector search is

something that's really useful. too. One

of the things I was struggling with on

the current project I'm on with a client

like I've actually implemented this in

the last couple weeks is they're they

have kind of product catalog data and

one of the the descriptions we have are

really pretty short and they're not as

you know high quality as we'd like and

one of the suggestions was maybe going

out using

AI to enrich that data and that's that's

really dicey because you're not

inherently getting data from good

sources.

But when you have data already that's in

your database and you've trusted it,

what vector search can really enable you

is to to do like different searches that

go beyond full text. And I'll just give

you a basic example of what we ran into.

We had customers uh we had a a trial a

trial customer searching electric wire

versus electrical wire and they were

getting vastly different results just

based on those the two character

difference or even in some cases with

portal things because we were using you

know full text search the kind of normal

thing but full text search is imperfect

no matter what database you're in it's

it's not really ideal and while it was a

little bit of a pain point to go through

the vectorization process and adapt our

code to that It has made our search way

more accurate, uh, way more rich, and we

we just have a lot more power to doing

it. And frankly, there's not a whole lot

of changes needed to your application.

You can still have a stored procedure

that's going to do that search. You have

to make some changes to the stored

procedure so that does vectorization,

but your app doesn't need to change at

all. And you can plug in a whole new

kind of search algorithm that's way more

powerful.

>> Awesome. Cool. So it's basically like

improving your existing applications

without having to do like a big

rearchitecture or redesign or anything

like that.

>> Always the best way to do it, right?

>> Absolutely. Cool. Well, I'd love to uh

you know learn more about it.

>> Just one math question for you. What do

you know about cosine distance?

>> I mean I took geometry and you know know

a little bit about it.

>> Yeah. So I would like to thank Bob Ward

who sent me some math advice on this. Uh

but the basic idea is that a smaller

cosine distance is going to represent

higher similarity to the search terms.

If you want to learn way more about the

math, you know, you can put it in

copilot, you can you can Google cosine

distance and and lots of geometry and

trigonometry, but but that's about it.

So yeah, let's kind of get started with

this demo. Uh you can share my screen.

>> Let's take a look.

>> Yeah. So the first thing I want to tell

you is and and this I I think I did

reference this this repository in my

Redmond column. Uh this is from a fellow

MVP. I've contributed code to this as

well. Uh Anthony Nosentino has written

this and it's a full-on Docker solution

that not only deploys a SQL 2025

container, but it's running Adventure

Works LT, but it also deploys most

importantly uh an LLM embedding model

called Lama, which is one of Meta's open

source models. The cool thing about the

really cool thing about this is you

don't have to have anything else. A lot

of the demos you'll see in in in books

online and if you're using like Azure

SQL database, you have to have an Azure

res Azure AI foundry resource that that

does your converts that data into uh

into embedding. So let's let's kind of

take a walk through that code and and

we'll look at at kind of two different

things. So if you download Anony's code,

the first thing you'll do is do a

restore. uh we've already done that. But

if you look here on the code I'm sharing

on on lines 20 uh 20 through 31, what

we're doing, we're using some new syntax

that's in SQL 25 and it's it's also in

Azure SQL DB and we're creating an

external model. Uh you'll note the

location is model-we. Let me pull up a

terminal real quick.

And if I do a docker ps, you can see I

have a lama fastart sql one. And I have

a llama fast start model web. Uh you

need a couple of these containers and

Anony's code handles all of this for

you. But the the nice thing is your

model's running locally and you don't

have to pay anything most importantly.

Uh so so that is really nice. But um so

we'll do initially just a test to make

sure the model's working. Anna and I

were talking about the fact that I have

two execution errors here. That's

because I have both the Postgres and SQL

Server VS Code extensions installed.

Both nice extensions. They could play

better together.

So here I can see an embedding and this

is what an embedding looks like. Uh if

you look at this code, what we did was

embed the term test text. Ulama,

this this version of Lama generates 768

dimensions. So if you were to copy this,

you would see that there are 768 floats

uh that represents that text as as a

vector.

We also have a new data type in in SQL

Server called vector. And so here we're

creating a table. I've I've already

created this table, but uh you can see

embeddings is a vector type. And that

768 is the dimension. It's a little bit

weird um because it's not actually the

data size. It's much larger than 768

bytes. That number represents the number

of dimensions in that vector data.

>> And so is there something you have to do

specifically so that it works with with

the number of embeddings or when you

generate the embeddings so that it fits

>> when you generate the embeddings you

have to know and I forget how you do

this off the top of my head. I don't

know if you know uh the the model will

tell you how many dimensions there are.

>> Yeah. I don't remember like because I I

ran into this when I was first playing

with it. I was using uh one of the open

AAI models on Azure and I I was I was

like I had a 768 column and I was the

model was coming out with 153 uh 1536 I

think is the number is the is double 768

>> but yeah yeah it's it's based on the

output of the the model. So whatever

you're getting back

>> did you know?

>> Yeah. So here like we've created the

embeddings for this. So if we look at

this uh if we go ahead and run this

query here uh we can see our embeddings

and you may be like what's the use for

this and you can see here we've built a

chunk by con uh above here in in this

code we've built a chunk by

concatenating a bunch of columns. So, if

we look on line 66 and and highlight the

code I have here, we're taking the

product name, uh the category name, u

another name and the description and

making that a chunk. In the case of of

my customer data that we're going to

take a look at later, we're just taking

the company name and the and the uh the

description and and putting that into a

chunk. Generally speaking, you're going

to want to concatenate a few columns.

We're not using it here, but another fun

fact about SQL 2025 is

you can do this and it can concatenate

two columns. You no longer have to use a

plus sign. The VS Code extension I don't

think has been updated

or I'm on an older version, but you can

use ANC standard concatenation syntax.

So here we can say I'm looking for a red

bike and I don't want to spend a lot

and we can see that we get a list of red

bikes that that are backed. Uh I don't

have the price here but presumably

they're cheaper ones. We see this

distance as I mentioned that lower

number is better. We've also defined

that here on line 89 by saying vector

distance uh specifying cosine. We have a

couple of other options that we can use.

Most everything I've seen and done has

used cosine distance. We're ordering by

distance. That's ascending. Uh we want

that to be a little bit uh the lower

number is better. So closer value. And

of course we have a top there. Uh if you

don't have a top with your order by,

it's going to return everything.

>> Yeah.

>> Also, we have one feature I'm going to

show. We're not going to super demo

because we'd have to go deep in

execution plans. Uh there's also a kind

of a cool index type in SQL Server

called a vector index. It uses something

called disk aan which stands for near

approximate nearest neighbor. Uh and

it's just a little bit more efficient

way of searching this data. I will say

I'm not using those indexes in my

current workload. I am using a something

else and I'm kind of impressed. What I

want to show you though is a little bit

different uh similar code that we're

using. So here uh you can see we're

passing in a a parameter that's just the

search text. Then we're then we're

generating embeddings for that search

text. So remember we have to compare the

embeddings in the database to the

embeddings from our from our search

text. And so here we're doing the same

thing where we're doing vector distance

cosine but instead of doing a top I'm

actually doing a vector distance in my

wear clause. And this is the data we've

been working with this week with my

client. And one of the things we've been

balancing out uh the lowest number we've

seen in the data is is roughly about 0.2

and we've been balancing out quality of

results with that cosign distance

number. So we started out at about 045

and we we've decided to kind of adjust

down to to 37. So like here I'm

searching titanium screws and at 37 I

think this is going to return about 50

results.

Maybe I do need one of those fancy

indexes.

There we go. So, we can see 0.99 is or 0

29 is our lowest. Uh, and everything

here is fasteners, uh, titanium

fasteners. So, all the various screw

types. And if we go down to the bottom,

we still see titanium dental implant

components. So, it's close enough to

titanium screws that I I feel pretty

comfortable with that. Uh beyond that,

you start to kind of get into some

weirdness with with how text text

reacts. So,

>> and is this like 37? Like it's probably

subjective depending on what it is that

you're actually searching for, what your

base is, and how different types of

results you're willing to accept. Is

that fair? How did you come up with the

37?

>> Uh in this case, like in SQL Server does

normalize those values, but in this

case, it was really just trial and

error. Mhm.

>> Uh like it was it was we were kind of

going uh we started that at at I can't

remember where I started that. I I think

I think I started that and I returned

all the rows because I was being stupid

and didn't think about what I was doing.

And then uh then I put a wear clause at

like 0.5 and we looked at the the

bottom, you know, say 10% of the

results. We could see they were really

irrelevant. We got we we and we just

kind of gradually tuned that down in a

few steps and and we got down to 38 37

it seemed like a happy spot.

>> Awesome. Cool. Anything else you want to

show us or tips and tricks or anything

like that?

>> Not so far. We're we're we're learning a

lot as we go along. I One of the things

that's interesting about Lama is it's

not the most accurate model. So if

you're looking for really tight

precision, I think there are better,

bigger, newer models to use. But at the

same time, I think in our use case, it's

probably going to be enough for to

generate high enough quality data and

results that we don't feel the need to

switch to anything else. And in like in

in our production environment, we're

running it in our Kubernetes cluster and

just talking to it that way instead of

not running it on a local machine or

anything.

>> Nice. Awesome. That's super interesting

to hear that like you've made you've

gotten like good enough without having

to go to some of the newer stuff and I

think uh you know we hear a lot of

discussion about that but hopefully

that's good because it means people

aren't going to have to be like updating

once they deploy this out uh

>> yeah I mean it's it's easy enough to

update to a to a newer model because you

can just switch in that in that database

setting that we we looked at but it's

not it it really depends on your needs

if you you really need a bleeding edge

model uh you know I I think for a lot of

use cases where you're you're looking at

things like product catalog data, it's

it's going to be fine if you use an

older model.

>> Awesome. Cool. Well, Joey, thanks so

much for coming on the show. I learned a

lot. I think our viewers probably did,

too. Uh viewers, if you like this

episode, go ahead, give it a like, leave

us a comment. Let us know what you're

going to do with vector search and if

you've tried any of these new

capabilities in SQL Server 2025. And we

hope to see you next time on Data

Exposed.

Implementing Vector Search in Your Application with SQL Server 2025 | Data Exposed: MVP Edition

Microsoft Developer

20 days ago

13:59

RAG & Vector Search

Rank #2

Description

Going a level deeper than a basic semantic search talk, we’ll explore semantic search, how to implement it in your app, and how to ingest data at scale to support your application’s AI needs. Chapters: 0:00 Introduction 3:50 Demo ✅ Resources: VECTOR_SEARCH (Transact-SQL) (Preview): https://learn.microsoft.com/en-us/sql/t-sql/functions/vector-search-transact-sql?view=sql-server-ver17 Vector search and vector indexes in the SQL Database Engine: https://learn.microsoft.com/en-us/sql/sql-server/ai/vectors?view=sql-server-ver17 Github: https://github.com/Azure-Samples/azure-sql-db-vector-search About Joseph D'Antoni: Joseph D'Antoni is a Principal Consultant at Denny Cherry and Associates Consulting. He is recognized as a VMWare vExpert and a Microsoft Data Platform MVP, and has over 20 years of experience working in both Fortune 500 and smaller firms. He has worked extensively on database platforms and cloud technologies and has specific expertise in performance tuning, infrastructure, and disaster recovery. About MVPs: Microsoft Most Valuable Professionals, or MVPs, are technology experts who passionately share their knowledge with the community. They are always on the "bleeding edge" and have an unstoppable urge to get their hands on new, exciting technologies. They have very deep knowledge of Microsoft products and services, while also being able to bring together diverse platforms, products and solutions, to solve real world problems. MVPs make up a global community of over 4,000 technical experts and community leaders across 90 countries/regions and are driven by their passion, community spirit, and quest for knowledge. Above all and in addition to their amazing technical abilities, MVPs are always willing to help others - that's what sets them apart. Learn more: https://aka.ms/mvpprogram 📌 Let's connect: Twitter: Anna Hoffman, https://twitter.com/AnalyticAnna Twitter: AzureSQL, https://aka.ms/azuresqltw 🔴 To watch other MVP Edition episodes, see our playlist: https://aka.ms/dataexposedmvps To check out even more Data Exposed episodes, see our playlist: https://aka.ms/dataexposedyt 🔔 Subscribe to our channels for even more SQL tips: Microsoft Azure SQL: https://aka.ms/msazuresqlyt Microsoft SQL Server: https://aka.ms/mssqlserveryt Microsoft Developer: https://aka.ms/microsoftdeveloperyt #AzureSQL #SQLServer

Watch on YouTube