Loading video player...
You've been hearing a lot about Azure
SQL database, Vector, and Azure OpenAI.
Learn how you can use them all together
in a demo and a sample that's available
now this week on Data Exposed MVP
edition.
[Music]
Hi, I'm Anna Hoffman and welcome to this
episode of Data Exposed MVP edition.
Today we have a special episode because
we're talking about AI, everybody's
favorite topic. Um, and today on MVP
edition, we have Indira joining us uh
today. And as we get started, can you
tell us a little bit about yourself and
what you do? Thanks Anna for this
invite. uh what I do is uh at the moment
I'm doing data architecture and I have
uh spent about 20 more than 20 years in
the data space and uh I'm really
interested in AI and we have been
working on AI um models and different uh
projects on AI for the last 3 years I've
been really uh exploring these and then
enjoying working with AI uh and I'm
excited today about this um uh data
exposed event on Azure SQL DB and
OpenAI.
>> Awesome. Great. Well, we're happy to
have you on the show and we know you
have your own YouTube channel as well.
So, we'll put a link to that in our
description. But, you know, for now, I
say like let's get right into it. I'd
love to understand kind of your
scenario, your the way you're thinking
about Azure SQL database and OpenAI.
>> Yes, let's start.
>> Okay. So uh today what I'm going to show
you is my session is about implementing
rag using Azure SQL DB and Azure OpenAI.
So let's get into the agenda. I'll
quickly explain what uh RAG is in terms
of uh Azure SQL DB and Azure OpenAI uh
and then I'll get straight into the
demo. So uh in the in terms of uh ret
retrieval augmented generation uh with
respect to uh Azure SQL database and um
Azure OpenAI what happens is we have a
lot of data uh in um in Azure SQL and
when we do the queries we just write
normal queries but uh with the new uh
capability uh the we can plug in large
language models which are uh deployed in
um Azure OpenAI. Uh call them and then
we have a new uh vector data type that
that we can uh store in uh uh MSSQL
database um uh Azure SQL database and
then uh for example if I have a products
uh products data in my SQL then we can
extract all products that are similar to
the uh provided question. So provide
question in the sense when you ask when
you when you uh ask in uh in a natural
language uh um the um the what what the
uh question is let's say for example can
you give me similar models to Xbox or
something like that then what it does is
you can uh uh you can create a similar
similarity search and then that that
prompt will go to the open AI uh and
then um then of all the extracted uh
products determines the what are the
most appropriate um and then gives the
pro uh gives the answer to the provided
question in native language. So um that
is the whole process. So here you are uh
storing the vector uh embeddings in the
Azure SQL. So that's the specialtity of
this um uh rag using Azure SQL database.
So uh I'll get into the demo um demo.
But before we get into the demo, there
are some requirements. Obviously when
you are using Azure SQL database you
need to have Azure SQL database um uh
defined and uh also Azure OpenAI
resource um and all and you can uh
deploy any GPT model. So I have I think
GPT4 and GP4 mini um uh and also you
need to have a text model so that you
can uh use the text uh model to uh to
create the embeddings. So the text model
that uh I have used is the uh ADA model
uh that is available in uh Azure uh
OpenAI and then you also uh I create I
uh updated my database using a blob
storage where I stored the CSV files of
the uh of the data and then created a
credential and then uh I uh uh bulk
inserted the data into uh SQL database
which I'm going to show you. So that
those are the demo requirements and then
let's great get straight into the demo.
Um
so I have uh SQL ser uh server here
uh which I have connected using uh SQL
server management studio and um
uh and I have created a database called
Walmart. Um and then in the Walmart
database I have created a table using
the uh using this code here that you can
see um and the database and the table
name is Walmart e-commerce product
details. So I have got this um data from
the internet which uh I have um uh I
have stored the I have uh given the link
of the uh data uh it's it's on Kaggle
database and then after after creating
the database then here the the code that
I I have uh highlighted here is uh uh
creating the external source external
data source so that we can link the
external data source which is in on blob
storage um uh with a shared access
signature that you can see. um we we
create a credential and then we uh do
the bulk uh insert with um here if you
can see the uh code of the bulk insert
uh I am inserting from that particular
uh folder called Walmart and then from
that particular CSV and then the batch
size is uh thousand so it uh includes
the um it it inserts thousand rows per
um per uh each insert
And then after um loading so you can see
the um uh the table uh after loading the
uh so yeah we can see the um top 100 uh
from the Walmart uh data Walmart table
Walmart uh e-commerce product details
table and then here if you uh look at
the all the uh
all the columns You can see at the end
there is a embedding column which is the
uh which is the embedding so that uh
when you ask a question that embedding
column will be used uh by the Azure open
AAI to give you the similarity answers.
So after that the next step is creating
the uh credentials. Uh so um so that we
we can um uh create what what the uh
credential is for the um uh Azure OpenAI
instance so that the uh the instance is
used from the uh credentials from from
within the database. So um and then the
next step is creating the uh
transforming the search text into a
vector. So here uh you can see that the
the text asking being asked is anything
for a teenager boy passionate about
racing cars. So here uh you are uh
creating the uh you are running the um
uh while loop if there are any uh this
thing and then you create the HTTP
response and insert into the HTTP
response. Um so that is the response
that that it gives the search vector
gives uh which is in a JSON format. Um
and when you uh look at the
um table so you can see the HTTP
response table and uh uh the Walmart uh
uh e-commerce uh product details table
as well in the table list of tables. So
if if you look at the uh top one JSON
query so you can see the embeddings that
you can you are uh creating you have
created in the uh htt http response. uh
so the the the JSON query has passed
what and then how uh the number of uh
records in that table is 30 30,000 that
I have inserted.
So now this this code is uh uh getting
similar items. So here what uh this
query does is it creates a um uh uh
dbo.imaril
items table uh where you pass the um uh
and then that will be used and here this
one this query is you is chatting with
the data. So you declare a JSON array
and then you pass two prompts. One is
the system prompt and the and the other
prompt is the anything for a teenager
boy and then it gives you the um uh it
gives you the uh result in a again in a
JSON format.
So uh name and yeah you are a system uh
you are the system assistant who helps
yeah you can give you that. And then uh
the next one is uh uh uh you uh if you
run the entire uh query you get the
difference. So that the top result is
the JSON and the bottom result is the uh
actual u uh uh actual result and this
query is uh uh chat with the data
structured data. So here we are g
parsing the JSON query and then uh
running the um running the uh
details. So now we we get the uh similar
products uh which are which are there in
um as per the question. So, teenage boy
uh Sony PlayStation and uh you can uh if
I go back again so and pause this so you
can uh see that the Sony PlayStation and
Lego Racers uh scratch building toy set
uh building build your vehicle racing
car. So these are the similar things
that uh the um uh the open AI has
identified and given you the results and
uh based on the um the based on the uh
results that were stored in the sim in
the uh similar objects uh uh data table
in the uh in the similar objects table.
So that's what I wanted to show you. Uh
there is also uh if I go into the SQL
database, you can also create a uh store
procedure um called uh get embedding or
something. Um and then you can pass the
uh URL of the embedding of your
embedding model and also your credential
and then uh where where the if you
insert some new records call this get
get embedding um procedure then uh you
can uh create the you can create the
embedding based on uh based on your um
which is which is of a vector type which
uh of 1 36 and then uh you can store
that in the uh database in the into in
the embedding column that I have showed
you. So if I show you the uh
if I show you the
columns again so here
you can call this embedding uh get
embedding store procedure and then
create the embeddings. So that's what I
wanted to show you uh as uh this thing
and if you if I um this is the data set
that I have uh no this is not the data
set but the the there is a GitHub link
that I have given here. Um
see here so this is the uh GitHub link
that I have used where I got the uh
Walmart data and then there are clear
steps as well. Um, so maybe if I go in
this.
>> Awesome. Thanks, Andira. I feel like I
learned a lot by walking through your
demo. So, just to kind of summarize what
we saw is you imported an existing data
set which already had embedding. So, in
that case, you're using Azure SQL
database as a vector store. Now, then
what you were able to do is create store
procedures that call out to Azure OpenAI
and take in new requests. So like if
someone says find me something similar
to you know this Xbox thing for a gift
for racing cars then it converts that so
it's able to call out get the embeddings
for that and then you also showed how it
does a similarity search so it used I
think you s used cosign distance there
to go between
>> cosign yes
>> um the search and the items available in
the products list. Um, and then you also
showed us kind of full circle how you
could if you wanted uh create additional
embeddings maybe as new products come in
or some or something like that.
>> Um, so I think it's an awesome sample. I
would say you know as someone who's been
working in the AI space for some time
like uh do you have any final tips or
tricks for folks who are just getting
started with rag or using AI and SQL
databases?
So uh I would say that you uh you can
use different um uh rag techniques that
uh you that that are uh there. I usually
follow Pamela from Microsoft and she has
she really uh explains the rag concepts
very clearly. So maybe you also uh
follow Pamela Fox. uh she does uh really
good um videos and also I think the as
an as next steps uh if you are a data
person maybe you should uh look at this
public preview of uh um the native
vector support in Azure SQL database
where you have some uh uh examples as
well to go through uh and get started.
>> Awesome. Cool. Well, Indier, thanks so
much for coming on the show. Uh for our
users, if you like this episode, go
ahead, give it a like, give us a
comment, let us know what you think.
We'll put links to the GitHub sample uh
and some of the other things that we
talked about in the description. Note
that this is available in Azure SQL
database, SQL database and fabric and
SQL server uh 2025 which at the time of
recording uh all of these things are in
preview. Um so if you yeah if you like
the episode go ahead and give it a like.
Indira again thanks so much for coming
on this episode of Data Exposed and we
hope to see you next time on Data
Exposed.
>> Thank you. Thank you Anna.
[Music]
Learn how to implement Retrieval Augmented Generation (RAG) directly within Azure SQL Database using your existing data and OpenAI's GPT models. This demonstration shows how to easily build new or enhance existing applications with powerful AI capabilities. 0:00 Introduction 2:00 What is RAG 3:49 Demo + Requirements 13:50 Getting started ā Resources: http://github.com/Azure-Samples/azure-sql-dbchatbot About Indira Bandari: Indira has a Masters degree in Statistics and have over 20 years of experience in designing and developing datawarehouses and analytical solutions. She is a co-organiser of the NZ Business Intelligence - Power BI User Group and Auckland Auckland Artificial Intelligence Meetups. She is also an international Speaker at various events and conferences. Volunteering and sharing knowledge is her passion. In her pass time she teaches game development for primary school kids. She also started teaching about databases and Power BI to 10 -15 year old kids to give them a taste of technology at a young age About MVPs: Microsoft Most Valuable Professionals, or MVPs, are technology experts who passionately share their knowledge with the community. They are always on the "bleeding edge" and have an unstoppable urge to get their hands on new, exciting technologies. They have very deep knowledge of Microsoft products and services, while also being able to bring together diverse platforms, products and solutions, to solve real world problems. MVPs make up a global community of over 4,000 technical experts and community leaders across 90 countries/regions and are driven by their passion, community spirit, and quest for knowledge. Above all and in addition to their amazing technical abilities, MVPs are always willing to help others - that's what sets them apart. Learn more: https://aka.ms/mvpprogram š Let's connect: Twitter: Anna Hoffman, https://twitter.com/AnalyticAnna Twitter: AzureSQL, https://aka.ms/azuresqltw š“ To watch other MVP Edition episodes, see our playlist: https://aka.ms/dataexposedmvps To check out even more Data Exposed episodes, see our playlist: https://aka.ms/dataexposedyt š Subscribe to our channels for even more SQL tips: Microsoft Azure SQL: https://aka.ms/msazuresqlyt Microsoft SQL Server: https://aka.ms/mssqlserveryt Microsoft Developer: https://aka.ms/microsoftdeveloperyt #AzureSQL #SQLServer