Loading video player...
Learn about some real use cases for
vector search in SQL Server 2025 and see
a cool demo this week on Data Exposed
MVP edition.
Hi, I'm Anna Hoffman and welcome to this
episode of Data Exposed. We're going to
get right into it to this MVP edition.
Today I have Joey uh joining me. Uh
Joey, can you tell us a little bit about
what you do? Yeah, thanks Anna. I'm Joey
Denton. I'm a principal consultant and
principal cloud architect and I work
with all sorts of cloud and data
solutions. I've been working a lot with
SQL Server 2025. Uh you may have read
some of the stuff I've written at
redmondmag.com
uh talking about some of these. I
actually wrote a column a couple months
ago on VEC research which we're going to
be talking about today.
>> Awesome. Cool. Well, excited to learn
more and we'll definitely put a link to
that uh in the description if people
want to go read it. uh for SQL Server
2025. Obviously there's uh support for
vector but before we get into all that
like I just want to get your take like
what are people actually using vector
for and you know what are you seeing
with your customers or in the community?
>> Yeah, there's so much buzz good and bad
in a in IT world about AI and developers
too. uh there are a lot of use cases
that like don't necessarily make the
most sense or might might be weird or
impractical, but man, vector search is
something that's really useful. too. One
of the things I was struggling with on
the current project I'm on with a client
like I've actually implemented this in
the last couple weeks is they're they
have kind of product catalog data and
one of the the descriptions we have are
really pretty short and they're not as
you know high quality as we'd like and
one of the suggestions was maybe going
out using
AI to enrich that data and that's that's
really dicey because you're not
inherently getting data from good
sources.
But when you have data already that's in
your database and you've trusted it,
what vector search can really enable you
is to to do like different searches that
go beyond full text. And I'll just give
you a basic example of what we ran into.
We had customers uh we had a a trial a
trial customer searching electric wire
versus electrical wire and they were
getting vastly different results just
based on those the two character
difference or even in some cases with
portal things because we were using you
know full text search the kind of normal
thing but full text search is imperfect
no matter what database you're in it's
it's not really ideal and while it was a
little bit of a pain point to go through
the vectorization process and adapt our
code to that It has made our search way
more accurate, uh, way more rich, and we
we just have a lot more power to doing
it. And frankly, there's not a whole lot
of changes needed to your application.
You can still have a stored procedure
that's going to do that search. You have
to make some changes to the stored
procedure so that does vectorization,
but your app doesn't need to change at
all. And you can plug in a whole new
kind of search algorithm that's way more
powerful.
>> Awesome. Cool. So it's basically like
improving your existing applications
without having to do like a big
rearchitecture or redesign or anything
like that.
>> Always the best way to do it, right?
>> Absolutely. Cool. Well, I'd love to uh
you know learn more about it.
>> Just one math question for you. What do
you know about cosine distance?
>> I mean I took geometry and you know know
a little bit about it.
>> Yeah. So I would like to thank Bob Ward
who sent me some math advice on this. Uh
but the basic idea is that a smaller
cosine distance is going to represent
higher similarity to the search terms.
If you want to learn way more about the
math, you know, you can put it in
copilot, you can you can Google cosine
distance and and lots of geometry and
trigonometry, but but that's about it.
So yeah, let's kind of get started with
this demo. Uh you can share my screen.
>> Let's take a look.
>> Yeah. So the first thing I want to tell
you is and and this I I think I did
reference this this repository in my
Redmond column. Uh this is from a fellow
MVP. I've contributed code to this as
well. Uh Anthony Nosentino has written
this and it's a full-on Docker solution
that not only deploys a SQL 2025
container, but it's running Adventure
Works LT, but it also deploys most
importantly uh an LLM embedding model
called Lama, which is one of Meta's open
source models. The cool thing about the
really cool thing about this is you
don't have to have anything else. A lot
of the demos you'll see in in in books
online and if you're using like Azure
SQL database, you have to have an Azure
res Azure AI foundry resource that that
does your converts that data into uh
into embedding. So let's let's kind of
take a walk through that code and and
we'll look at at kind of two different
things. So if you download Anony's code,
the first thing you'll do is do a
restore. uh we've already done that. But
if you look here on the code I'm sharing
on on lines 20 uh 20 through 31, what
we're doing, we're using some new syntax
that's in SQL 25 and it's it's also in
Azure SQL DB and we're creating an
external model. Uh you'll note the
location is model-we. Let me pull up a
terminal real quick.
And if I do a docker ps, you can see I
have a lama fastart sql one. And I have
a llama fast start model web. Uh you
need a couple of these containers and
Anony's code handles all of this for
you. But the the nice thing is your
model's running locally and you don't
have to pay anything most importantly.
Uh so so that is really nice. But um so
we'll do initially just a test to make
sure the model's working. Anna and I
were talking about the fact that I have
two execution errors here. That's
because I have both the Postgres and SQL
Server VS Code extensions installed.
Both nice extensions. They could play
better together.
So here I can see an embedding and this
is what an embedding looks like. Uh if
you look at this code, what we did was
embed the term test text. Ulama,
this this version of Lama generates 768
dimensions. So if you were to copy this,
you would see that there are 768 floats
uh that represents that text as as a
vector.
We also have a new data type in in SQL
Server called vector. And so here we're
creating a table. I've I've already
created this table, but uh you can see
embeddings is a vector type. And that
768 is the dimension. It's a little bit
weird um because it's not actually the
data size. It's much larger than 768
bytes. That number represents the number
of dimensions in that vector data.
>> And so is there something you have to do
specifically so that it works with with
the number of embeddings or when you
generate the embeddings so that it fits
>> when you generate the embeddings you
have to know and I forget how you do
this off the top of my head. I don't
know if you know uh the the model will
tell you how many dimensions there are.
>> Yeah. I don't remember like because I I
ran into this when I was first playing
with it. I was using uh one of the open
AAI models on Azure and I I was I was
like I had a 768 column and I was the
model was coming out with 153 uh 1536 I
think is the number is the is double 768
>> but yeah yeah it's it's based on the
output of the the model. So whatever
you're getting back
>> did you know?
>> Yeah. So here like we've created the
embeddings for this. So if we look at
this uh if we go ahead and run this
query here uh we can see our embeddings
and you may be like what's the use for
this and you can see here we've built a
chunk by con uh above here in in this
code we've built a chunk by
concatenating a bunch of columns. So, if
we look on line 66 and and highlight the
code I have here, we're taking the
product name, uh the category name, u
another name and the description and
making that a chunk. In the case of of
my customer data that we're going to
take a look at later, we're just taking
the company name and the and the uh the
description and and putting that into a
chunk. Generally speaking, you're going
to want to concatenate a few columns.
We're not using it here, but another fun
fact about SQL 2025 is
you can do this and it can concatenate
two columns. You no longer have to use a
plus sign. The VS Code extension I don't
think has been updated
or I'm on an older version, but you can
use ANC standard concatenation syntax.
So here we can say I'm looking for a red
bike and I don't want to spend a lot
and we can see that we get a list of red
bikes that that are backed. Uh I don't
have the price here but presumably
they're cheaper ones. We see this
distance as I mentioned that lower
number is better. We've also defined
that here on line 89 by saying vector
distance uh specifying cosine. We have a
couple of other options that we can use.
Most everything I've seen and done has
used cosine distance. We're ordering by
distance. That's ascending. Uh we want
that to be a little bit uh the lower
number is better. So closer value. And
of course we have a top there. Uh if you
don't have a top with your order by,
it's going to return everything.
>> Yeah.
>> Also, we have one feature I'm going to
show. We're not going to super demo
because we'd have to go deep in
execution plans. Uh there's also a kind
of a cool index type in SQL Server
called a vector index. It uses something
called disk aan which stands for near
approximate nearest neighbor. Uh and
it's just a little bit more efficient
way of searching this data. I will say
I'm not using those indexes in my
current workload. I am using a something
else and I'm kind of impressed. What I
want to show you though is a little bit
different uh similar code that we're
using. So here uh you can see we're
passing in a a parameter that's just the
search text. Then we're then we're
generating embeddings for that search
text. So remember we have to compare the
embeddings in the database to the
embeddings from our from our search
text. And so here we're doing the same
thing where we're doing vector distance
cosine but instead of doing a top I'm
actually doing a vector distance in my
wear clause. And this is the data we've
been working with this week with my
client. And one of the things we've been
balancing out uh the lowest number we've
seen in the data is is roughly about 0.2
and we've been balancing out quality of
results with that cosign distance
number. So we started out at about 045
and we we've decided to kind of adjust
down to to 37. So like here I'm
searching titanium screws and at 37 I
think this is going to return about 50
results.
Maybe I do need one of those fancy
indexes.
There we go. So, we can see 0.99 is or 0
29 is our lowest. Uh, and everything
here is fasteners, uh, titanium
fasteners. So, all the various screw
types. And if we go down to the bottom,
we still see titanium dental implant
components. So, it's close enough to
titanium screws that I I feel pretty
comfortable with that. Uh beyond that,
you start to kind of get into some
weirdness with with how text text
reacts. So,
>> and is this like 37? Like it's probably
subjective depending on what it is that
you're actually searching for, what your
base is, and how different types of
results you're willing to accept. Is
that fair? How did you come up with the
37?
>> Uh in this case, like in SQL Server does
normalize those values, but in this
case, it was really just trial and
error. Mhm.
>> Uh like it was it was we were kind of
going uh we started that at at I can't
remember where I started that. I I think
I think I started that and I returned
all the rows because I was being stupid
and didn't think about what I was doing.
And then uh then I put a wear clause at
like 0.5 and we looked at the the
bottom, you know, say 10% of the
results. We could see they were really
irrelevant. We got we we and we just
kind of gradually tuned that down in a
few steps and and we got down to 38 37
it seemed like a happy spot.
>> Awesome. Cool. Anything else you want to
show us or tips and tricks or anything
like that?
>> Not so far. We're we're we're learning a
lot as we go along. I One of the things
that's interesting about Lama is it's
not the most accurate model. So if
you're looking for really tight
precision, I think there are better,
bigger, newer models to use. But at the
same time, I think in our use case, it's
probably going to be enough for to
generate high enough quality data and
results that we don't feel the need to
switch to anything else. And in like in
in our production environment, we're
running it in our Kubernetes cluster and
just talking to it that way instead of
not running it on a local machine or
anything.
>> Nice. Awesome. That's super interesting
to hear that like you've made you've
gotten like good enough without having
to go to some of the newer stuff and I
think uh you know we hear a lot of
discussion about that but hopefully
that's good because it means people
aren't going to have to be like updating
once they deploy this out uh
>> yeah I mean it's it's easy enough to
update to a to a newer model because you
can just switch in that in that database
setting that we we looked at but it's
not it it really depends on your needs
if you you really need a bleeding edge
model uh you know I I think for a lot of
use cases where you're you're looking at
things like product catalog data, it's
it's going to be fine if you use an
older model.
>> Awesome. Cool. Well, Joey, thanks so
much for coming on the show. I learned a
lot. I think our viewers probably did,
too. Uh viewers, if you like this
episode, go ahead, give it a like, leave
us a comment. Let us know what you're
going to do with vector search and if
you've tried any of these new
capabilities in SQL Server 2025. And we
hope to see you next time on Data
Exposed.
Going a level deeper than a basic semantic search talk, we’ll explore semantic search, how to implement it in your app, and how to ingest data at scale to support your application’s AI needs. Chapters: 0:00 Introduction 3:50 Demo ✅ Resources: VECTOR_SEARCH (Transact-SQL) (Preview): https://learn.microsoft.com/en-us/sql/t-sql/functions/vector-search-transact-sql?view=sql-server-ver17 Vector search and vector indexes in the SQL Database Engine: https://learn.microsoft.com/en-us/sql/sql-server/ai/vectors?view=sql-server-ver17 Github: https://github.com/Azure-Samples/azure-sql-db-vector-search About Joseph D'Antoni: Joseph D'Antoni is a Principal Consultant at Denny Cherry and Associates Consulting. He is recognized as a VMWare vExpert and a Microsoft Data Platform MVP, and has over 20 years of experience working in both Fortune 500 and smaller firms. He has worked extensively on database platforms and cloud technologies and has specific expertise in performance tuning, infrastructure, and disaster recovery. About MVPs: Microsoft Most Valuable Professionals, or MVPs, are technology experts who passionately share their knowledge with the community. They are always on the "bleeding edge" and have an unstoppable urge to get their hands on new, exciting technologies. They have very deep knowledge of Microsoft products and services, while also being able to bring together diverse platforms, products and solutions, to solve real world problems. MVPs make up a global community of over 4,000 technical experts and community leaders across 90 countries/regions and are driven by their passion, community spirit, and quest for knowledge. Above all and in addition to their amazing technical abilities, MVPs are always willing to help others - that's what sets them apart. Learn more: https://aka.ms/mvpprogram 📌 Let's connect: Twitter: Anna Hoffman, https://twitter.com/AnalyticAnna Twitter: AzureSQL, https://aka.ms/azuresqltw 🔴 To watch other MVP Edition episodes, see our playlist: https://aka.ms/dataexposedmvps To check out even more Data Exposed episodes, see our playlist: https://aka.ms/dataexposedyt 🔔 Subscribe to our channels for even more SQL tips: Microsoft Azure SQL: https://aka.ms/msazuresqlyt Microsoft SQL Server: https://aka.ms/mssqlserveryt Microsoft Developer: https://aka.ms/microsoftdeveloperyt #AzureSQL #SQLServer