The Vercel AI SDK is a TypeScript-first toolkit for building AI features.
It streamlines text generation,
embeddings, and structured outputs. In
this course, you will learn how to use
the Vercel AI SDK to create and ship a
customer support agent that makes
autonomous decisions to either answer
questions based on your support docs or
search the web in real time. Mayo from Scrimba developed this course.
Welcome to the course on building a
customer support AI agent with the Vercel AI SDK and OpenAI. Now, over the past
couple of years, we've seen how AI has
transformed how we work. And one of the
most exciting recent developments is the
rise of AI agents. These allow the large language model to not just be static, but to trigger different actions and make more autonomous decisions, thereby giving us more productive
results. In this course, you're going to
dive deep into the various strategies to
help you build a production ready AI
agent. We will be focusing on the use
case of customer support to make this
learning process easier for you. Now,
we've seen the use case of AI as a
chatbot where you ask a question and you
get a quick response. But as I said, AI
can go much further now with these
agentic capabilities because of the
ability to take action and call APIs and
get into databases and do things like
booking hotels online, interacting with
websites, debugging code, performing
deep research. And so now businesses are
able to use these AI agents to automate
complex tasks, synthesize information,
and deliver long-form insights. Now,
one of the most popular and powerful use
cases is customer support. Why? Because
customer support is at the heart of
every company. This is where complaints,
bugs, and issues surface, and solving
them well is essential to keeping
customers loyal and happy. AI agents can
make a huge difference here because they
can first of all help to escalate and
resolve issues quickly. Second of all,
they automate responses to basic
queries. So the human experts only focus
their valuable time on the complex cases
that the model cannot confidently
resolve. Thirdly, they help to
personalize support by retrieving
details from past conversations or user
profiles. Again, this is very useful for
customer retention. And finally, they
can help the company analyze customer
interactions, uncover trends, problems,
and even new product ideas. So in this
course you will learn how to build
exactly this type of AI agent that
doesn't just generate text but
intelligently routes queries, retrieves real-time information, and reduces hallucination. So first, you will learn
how to create logic so that the large
language model will intelligently
classify and route the user's query
based on their intent. Secondly, you
will learn how to build this AI agent
that doesn't just know when, but
also how to retrieve relevant real-time
information. You will learn the basics
of using the popular Vercel AI SDK, which
will help you to simplify your workflow
and speed up your development process.
You'll also learn the integration of
retrieval augmented generation for more
accurate answers. And building this customer support AI agent will, in the end, leave you with an application that doesn't just do Q&A over documents but can also perform web search and retrieve real-time information from the web. And
so by the end of this course, you will
have a working customer support AI agent
that is a powerful real world
application with actual real world use
cases to dramatically improve the
support experience for the users of
whatever applications you're working on.
Now this is what it's going to look
like. Here we are asking a basic question: how do I join the Scrimba Discord? And it's going to be trained on Scrimba's help documentation.
The results are shown below where the
question is asked and the user is able
to view the sources, so they can see which documents or specific pages the answer was retrieved from. And
likewise, web search showing the other
side of the autonomous decision the
agent can make where the agent is able
to go to the web to retrieve information
because it identifies that the answer is not available in the customer support docs. Now, before we dive in, here's what you should already know. First of all,
you should know how to use the OpenAI
API, how to create an account, retrieve
the API key, and just work with APIs in
general. Secondly, you should be
comfortable with just crafting basic
prompts. If you've done it in the ChatGPT interface, that is fine. Third, you
should be comfortable with the concept
of retrieval augmented generation. Now,
we are going to go through a brief
refresher in this course, but I highly
recommend the Scrimba course if you are a complete beginner at what retrieval
augmented generation is. Fourth, the
implementation of basic function calls.
You should be familiar with the idea of
function calling. And fifth, creating
basic Express servers for API routes. Again, Scrimba provides a course that
you can go through as a refresher. But
don't worry if you're not an expert.
I will walk you through the essentials
as we go along in this course. Now, this
is just a brief warning. This course is
quite advanced. Okay? So the code is
fairly complex and the challenges I've
set are really going to make you think
and work. So at the end of each scrim,
make sure you take time to play with the
code I have presented and really make
sure you understand it before moving on
to the subsequent challenge. With that
being said, my name is Mayo Oshin. I'm going to be your teacher in this course. I'm an AI engineering educator, and I'm also the co-author of Learning LangChain, published by O'Reilly. If you want to get my thoughts on technology, AI, and entrepreneurship, you can follow me, Mayo Oshin. With that being said, if
you're ready to start building powerful
AI agents that can make a real impact,
let's get started.
As discussed earlier, you should already
have a basic idea of how retrieval
augmented generation works and how it's
used to create this question answer
dynamic between a large language model
and a user. Let's quickly recap. So what is RAG? RAG is simply a mechanism where
we store information somewhere. This
could be a database or the web. And the
user asks a question. We retrieve the
most relevant pieces of information,
pass it to the large language model's context, and the large language model is able to give a final answer that's
contextually relevant. Now, why is this
important? This is important because
large language models are limited to
data they're trained on. So, if you ask
a question about something outside its training data, it may hallucinate or say it
doesn't know. But with retrieval augmented generation, or RAG, we can provide fresh
contextual information so the model can
give accurate answers it otherwise
wouldn't know. So for example, if I ask
the question, who won the most recent FIFA Club World Cup? The large language model may not know, but with retrieval augmented generation, you can retrieve the fact that Chelsea Football Club beat Paris Saint-Germain, and then this is passed into the context of the model so it can generate
an accurate response. So where does this
external data come from? Well, this is
pulled from places like the web, databases, or local text files. And in this
course specifically, we'll be relying on
a vector database. This is a special
kind of database designed to work with
embeddings. Now, embeddings are simply
numerical representations of words,
phrases, or even images. So if you plot them in this multi-dimensional space, as you can see in the diagram, semantically similar items will end up closer
together. And when you ask a question, what's actually happening under the hood is that we convert the text into these embeddings, then we compare them against one another in the database, and the closest ones are retrieved as context and passed along so the model can use them. Now, for the sake of this
course, we are going to be using
OpenAI's latest embedding model, text-embedding-3-small. It's fast, it's
accurate, and it's perfect for tasks
like search and recommendations, and in our
case, retrieving support documents
relevant to answer the user's question.
So, here's what to do. First of all,
visit openai.com.
Create an account if you haven't done
so. Get an API key and then go back to Scrimba. Now, when you go back to Scrimba, in the bottom right corner, click on the small gear icon and add a new entry for your OpenAI API key. This
is where you're going to paste the API
key and save. So, at this point, you
should understand what RAG is, why it's important, have a basic grasp of embeddings, and have your OpenAI key
ready to go. So, with that being said,
let's move on to the next lesson where
we'll run our first function to generate
embeddings and see what they look like
in action.
So, now we are in the code editor. I've
already pre-installed the relevant
libraries. At this point, you should
have already saved your API keys in the
environment settings at the bottom right.
And so if you go to the config, we've
essentially imported this OpenAI client
from the OpenAI library. Everything's
going to be pre-installed when you press
run. So it would install everything in
package.json, which is just essentially
this OpenAI library. And then it will
attempt to run start as well, which will
trigger this file. So in the config, it
checks if you've saved the environment
variable, the API keys, and if you
haven't, it's going to throw an error.
We create the client and then we import
into the main file. Now, what we have here is OpenAI's embeddings function, which allows you to create an embedding based on input text. All we're doing is passing the text we want converted to an embedding as the input, and then we console.log the results we get back. So if I click run and open the terminal, we get the results. So this is the programmatic
version of what embeddings look like.
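For reference, the call being described here looks roughly like this. It's a sketch, and the way the client is created and imported from the config file is an assumption; the scrim's layout may differ.

```js
// Sketch of the embeddings call described in this lesson.
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const embedding = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "The quick brown fox jumps over the lazy dog.",
});

// Log the full response object returned by the API.
console.log(embedding);
```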
Essentially, embedding.data gives you an
object that includes these different
values. And it's important to see how all these
unique numbers have been assigned to
this block of text. And in the case of
this particular embeddings model, there
are 1,536 dimensions. So here you can see the rows continue on and on, all the way down until you hit 1,536
different items in the array. And so
this is essentially what the embeddings
look like for this block of text. Now
obviously, if I change the text (and whenever you make a change, just press Ctrl+S or Cmd+S to save the changes). Click run
and you'll see a different group of
numbers are generated because the text
is different. And of course, if you eventually want to access just the embedding itself, which we will later on so we can pass it somewhere else, you essentially need to grab the first item of the data array (there is only one object inside) and then its embedding property: embedding.data[0].embedding. So if I save with Ctrl+S, that should give us the value. There we go. So now we have the full array of embedding values, and we can assign this to a variable and pass it somewhere else. So you can play
around here changing the different text
just to have a feel for it. This is
essentially the function that creates
the embeddings. See you in the next
lesson.
Okay, so as discussed in previous
lessons, a big part of retrieval augmented generation is the ability to store data
that you need somewhere else. Data that
will be relevant to answering the user's
question. And at runtime, we convert the
question into embeddings. We retrieve by
comparing the embeddings to the
embeddings of stored data in a database.
and then we pull the relevant documents
back. In this course, we are going to
use Supabase as the vector store, or vector database, that will store the embeddings of the documents you eventually want to retrieve based on the user's question. So, Supabase is essentially an open-source Postgres database, but most importantly, it has the pgvector extension. This is just an extension that allows us to work with embeddings at scale. Go to supabase.com. Once you're on supabase.com, follow the instructions
to sign up and create an account. And
then it's going to prompt you to create
a new organization. Just create a free
plan, personal, and just give your
organization a name. Once you give your
organization a name, you want to go in
the middle of the web page, click on
create new project, and then you'll be
given an interface similar to this.
Choose a region, such as East US or wherever is closest to you. Give a
database password you can remember and a
project name. Maybe just call it
Scrimba customer support AI agent. Once
you've created the project and you've
created the organization, you're going
to see an interface like this. And what
you want to do is scroll all the way
down to the bottom left. Click on the
gear icon so you can go to the settings.
When you're in the settings under
project settings, move your mouse all
the way down to API keys. Click on API
keys. And then you're going to see this
interface with what's known as your public key, which is safe to use in the browser, and your service role key, which essentially allows you to bypass row-level security and access any data in your database. For the sake of this course, we're going to take both of them. So, make sure to copy both the public key and the service role key. So,
you click reveal and also copy that as
well. Once you have both of these
values, you want to come back, click on
the gear icon in the bottom right of
your scrim, and you're going to get the
interface here. You've already saved your OpenAI API key. Now save your Supabase project URL as the value for the Supabase URL entry, and then for the Supabase service key entry, place the value of the service role key. So at this point you've set up the Supabase URL and the Supabase service role key. Now let's jump into the actual
code. So now at this point you should
have your OpenAI API key and your Supabase service role key. And so this will allow you to create the Supabase client so you can access the database.
As you can see, we've stored some dummy
text and this text can be replaced with
your own documents. But for the sake of
demonstration, I'm just showing how we
can take this, embed it, and then store
it somewhere that we can retrieve later
on. So, different topics, Magna Carta,
Apollo, the history of computing, and so
on and so forth. Here we have the logic
for upserting the information from this text into the vector store. And here we
have the main function. Now before we
proceed, you need to go to
created_tables.sql
and copy this SQL command which will
help you to enable PG vector extension
and also create a documents table. You
will pass in the content, any metadata
and the embeddings associated with the
content. Remember the model we're using
is 1536 dimensions. So that is what we
are going to pass in here. So copy the command and go back to your Supabase UI. Now, previously you were in the
settings section. Click on this icon and
this will open the SQL editor. When the
editor is open it's going to have a
button for you to create a new command
essentially and just paste what you just
copied in here and then click run. So
once you've run it inside the Supabase editor, it's going to create a table and
it's going to create this extension. And
so you will be able to see a table like
this. Now, you will initially have an
empty table until we finish this
exercise here. But you should see a
documents table here. Once you've completed that task, and before you run the command here, I'll just walk through what's going on. Essentially, your
source documents are in this directory.
So we're going to fetch it. We're using text-embedding-3-small as the model.
Your table name is documents as per what
we stored here. Here is an option. It's
just a toggle I created that allows you
each time you run this script to delete
all the contents of your database. It's just good for the sake of
demonstration because then each time we
rerun this or you rerun this, you don't
have to think about merging content from
a previous run. If you turn this to
false, then every new run will just add
new content to the documents table
without removing previous content. So
all we're doing here is going into the
path, fetching the text, and what we're
going to do is embed the file content.
So you remember in the previous lessons,
we discussed using the embeddings
function to create embeddings based on
content. So now we're just simply
repeating that exercise except we just
fetch the file content from the text
file. We pass it here as input and then
we create objects that have the content alongside the embeddings and metadata, which includes the source file name. And so this is
very useful because when you later on
retrieve the relevant documents, you're
not just retrieving the content, you're
also retrieving associated metadata
which you can then use to basically see
where the information came from. So once we're done creating the embeddings, all we're doing down here is using the Supabase client to point to the table and run an insert command, inserting all the collected documents, and that's it essentially. So if you go to index and you've set everything up with your API keys, all you need to do is click run. As you can see in the terminal, we ingest with the source as text.txt, and you can see the embedding attached. We've uploaded one document to the Supabase documents table.
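For reference, the core of this ingestion step looks roughly like the sketch below. The environment variable names and table columns follow the lesson's description, but the file path and helper structure are assumptions.

```js
import { createClient } from "@supabase/supabase-js";
import OpenAI from "openai";
import { readFile } from "node:fs/promises";

const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_KEY);
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Read the source document and embed its full content.
const content = await readFile("documents/text.txt", "utf8");
const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: content,
});

// Insert one row with the content, its metadata, and the embedding vector.
const { error } = await supabase.from("documents").insert([
  {
    content,
    metadata: { source: "text.txt" },
    embedding: response.data[0].embedding,
  },
]);
if (error) console.error(error);
```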
So then check your Supabase documents table and make sure this text is there. You should see a row that contains this text as the content, metadata that should be source: text.txt, and then the
embeddings. See you in the next lesson.
All right. So based on the previous
lessons, you should have already been
able to create embeddings, understand
what the role of the vector store is. We will visit splitting shortly in another lesson, but essentially we've taken the documents, embedded them, and stored them in Supabase. So you should
have already done that in the previous
lesson and the document should be in
your documents table in Supabase. Now
in this lesson, we're going to go over
this step, retrieval. The user asks a
question, we convert the user's question
into embeddings. We go into the vector store, which is going to perform its similarity calculations to see which embeddings are most similar to the user's query. We
retrieve the relevant portions and then
we pass the prompt, the query the user
provided plus the context of relevant
docs in a final prompt that's sent to
the model to generate an answer. So this
is essentially what we're going to work
on in this lesson. So jumping into the
code, I will just walk through the
changes we've made. We still have the
documents folder in config. We still
have superbase client and OpenAI. We've
added a few things in the utils which
I'll come to shortly. But one of them is
the prompt that's going to be sent to
the model which is going to pass in the
user's question alongside the context.
This context is what we are essentially
trying to retrieve from the vector store
based on the user's question. So
essentially, to take away the magic, this is what we're going to send to the model to generate an answer. And so our job is to solve for this equation, because we already know what the query is and we already know what the rest of the prompt is. We just need to fetch the context.
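A rough sketch of what a prompt helper like the one described here might look like; the exact wording in the course's utils file may differ.

```js
// Hypothetical RAG prompt template: the query is known, and the context is
// whatever we retrieve from the vector store.
function getRagPrompt(query, context) {
  return `Answer the user's question using only the context below.
If the answer is not in the context, say you don't know.

Context:
${context}

Question: ${query}`;
}
```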
And so we have this new logic called retrieveSimilarDocs that will essentially help us retrieve the relevant documents from the vector store.
But before we do that, please go to
created_tables.sql.
And now I have new logic. Now in the
previous lesson, you went to the Supabase editor to create the documents table. This time, just copy this whole thing, go back to the editor, and run it again. What is this doing? Essentially, this is a database function that is going to help us find the relevant embeddings that are most similar to the
user's query. So you have other
variables here which we'll be able to
pass in later on. We'll be able to say
okay how many results should we return?
What is the threshold? Right? So these embedding similarities fall between 0 and 1, where 1 means the two embeddings are very similar to each other and 0 means they're dissimilar. So maybe you don't want to pass
in too much polluted content into the
models context. So you can adjust your
match threshold. I've set 0.3 as
default. And then here is just an
optional filter that allows you to maybe
filter based on metadata. And this is
what you're going to get back from this
match documents function. Cosine similarity is essentially the calculation I discussed that is performed to see which embeddings are as close as possible to the user's query. Remember, the user's query is going to be converted to an embedding.
We're going to the vector store to also
see which embeddings are similar to the
user's query. So essentially this is a
database function that once you've
created it, we can then create this
function here. So Superbase allows for
the ability to make remote procedure
calls to database functions. Remember
you created this match documents
database function. So we can make a
remote procedure call here and then we
can pass in the query embedding as we
defined here the match count, match
threshold and filter. In this case, I'm
only going to pass the query embedding
and the match count. And to get the query embedding, it's just what we've covered previously: you're creating an embedding based on the query passed in as a parameter. So the query comes in, we generate the embedding response, and then we pass the embedding into the remote procedure call to trigger the database function that was created. And in the constants file, we've got a fixed variable to fetch the five most similar documents.
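Put together, the retrieval helper being described looks roughly like this sketch. The RPC parameter names follow the match_documents function described above, while the client setup and constant names are assumptions.

```js
import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_KEY);

const EMBEDDING_MODEL = "text-embedding-3-small";
const MATCH_COUNT = 5; // fetch the top five similar documents

async function retrieveSimilarDocs(query) {
  // Embed the user's query with the same model used at ingestion time.
  const embeddingResponse = await openai.embeddings.create({
    model: EMBEDDING_MODEL,
    input: query,
  });
  const queryEmbedding = embeddingResponse.data[0].embedding;

  // Remote procedure call to the match_documents database function.
  const { data: documents, error } = await supabase.rpc("match_documents", {
    query_embedding: queryEmbedding,
    match_count: MATCH_COUNT,
  });
  if (error) throw error;

  return documents; // [{ id, content, metadata, similarity }, ...]
}
```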
Now, we have also defined a new variable called the answer model. Why? Because in the previous lessons, all we were doing was retrieving the relevant documents; we didn't do anything with them. This time, we're going to take an extra step. So if
you come back here, if you just focus on
retrieving the documents which is
calling the retrieve similar docs based
on this question which is covered in
this document here, this text.txt. If I
click run, what we're going to see in
the terminal is we should see that we
fetch the content and then this is what
we expect. We expect an ID. We expect
the content which will be the entire
document because remember we embedded
the entire text. We did not split the
text into chunks. So there's only one
block of text. And then we've got the
metadata. And this is the cosine similarity I discussed previously. As you can see, it's roughly 0.5, or 50/50, in terms of how similar it is to the user's
question. And that's because this text
contains both the relevant information
to answer the question which is right
here in 1843 but it also contains a
bunch of other irrelevant information
that has nothing to do with the
question. So that's why the similarity
score is relatively low. So now we've
retrieved the docs. The next step is to
essentially create the prompt. And as I
discussed previously, we are going to
take these relevant docs, access the
content property. That's what this
function is doing. So we get the full
string and then we pass in the context
and the query to construct this final
prompt that the model is going to use.
Remember, every time you make changes,
just control S or command S. Once we've
created the prompt, we are now going to
send the prompt to the model and then
console.log the response from the model.
So now we are going to press run just to
generate the output. And so we're going
to send the entire prompt to the model.
Here we go: in 1843, Ada Lovelace published notes describing algorithms; she's often called the first computer programmer. So that is essentially coming from this block here. It's providing an answer based on what it saw in the text.
And this is very different to just
simply retrieving the relevant docs.
You're now able to go into the vector store, retrieve relevant information,
pass it as context in the final prompt,
send the prompt to the model, and
generate a final answer. So that is it
in a nutshell. You can play around with
this and when you're ready, I'll see you
in the next lesson where we're going to
discuss how to split this into chunks.
So you can have more relevant chunks of
text, embed those chunks and retrieve
the relevant chunks as opposed to in
this case we just retrieve one big block
of text. But before we do that, let's
have a challenge.
So at this point, you should have
embeddings of documents stored in your
Superbase. And in this lesson, we'll be
doing a short exercise where you're
going to recreate building the retrieval
mechanism we've discussed in previous
lessons. You take a user's query as we
have here. You embed that query using
OpenAI's embeddings function down here.
Then you retrieve the relevant docs from
Superbase calling the RPC match
documents function. And then you return
the relevant docs. When you have the
relevant retrieved documents, we already automatically combine them into a single string. You create the prompt using the
get rag prompt function from here. And
then you pass in that prompt into
OpenAI's model to generate a final
response. And then you can see the
relevant constants here, including the model, the similarity match count, and the embeddings model, all ready to be used in the functions provided. So in
the end when you run this function you
should be able to see the final
generated answer based on the user's
question. Now two things for
housekeeping here. Number one, make sure
when you're making changes, you press
command S if you are on Mac or control S
if you're on Windows to save your
changes. And the second thing is make
sure the question can be answered based
on the context in the documents that
you've stored in your Superbase. So for
example, if in Supabase you've stored content about the Great Fire of London from the previous lesson, then just keep this. If, however, you don't have that and you have something else from a prior lesson, just make sure you change the query so that you can fetch relevant documents from the vector store. Okay, with that being said, go
ahead with the exercise and I will come
back to provide a solution shortly. Good
luck.
Hopefully you didn't find that too
challenging. Let us start down here by
first of all getting the embeddings
response. Remember OpenAI provides
embeddings function and all we're doing
here is passing the embeddings model
name and we've got the input which is
the user's query. So done and dusted. We
just now need to access the embedding from the response; that should be data[0].embedding. So now we should have the embedding corresponding to the user's query. Down here, you just want to get the documents and the match error from calling supabase.rpc, and remember the database function is called match_documents. And then we want to pass in the parameters: first, the embedding of the user's query, and then the match count, how many relevant docs to return, which is limited to five as was set previously. Okay. So now we have the relevant docs, and we want to return the relevant documents similar to the user's
query. Now further up here we want to
take the retrieved docs; this is the context. Now we want to pass in the query and the retrieved docs to construct the final prompt that will be used by the model, and then finally construct the response. So the AI response is going to come from calling OpenAI's responses.create, and we're going to pass in the model, which is gpt-4o in this case (you can change that later on).
The input is the prompt. Further down
here, we're just going to console.log response.output_text, and that should be it in a nutshell. And now, let's click run and see what we generate
in the terminal. Voila. The Great Fire
of London destroyed some 13,200 homes.
That's exactly what we expect based on
the relevant documents. We can see that
that information is contained here in
this chunk. Okay, fantastic. Well, great job so far. You've covered embeddings, a recap on using the vector store, retrieval, and generating answers using OpenAI. And now, in the next lesson, we're going to move on to exploring the Vercel AI SDK, a library that will help to greatly simplify and abstract key logic for building agentic apps. See you soon.
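For reference, the end-to-end flow from this solution looks roughly like the sketch below. The query text is just an example, and retrieveSimilarDocs and getRagPrompt stand in for the helpers described in these lessons.

```js
import OpenAI from "openai";
// retrieveSimilarDocs and getRagPrompt are the helpers sketched earlier
// in these lessons (names assumed).
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const query = "How many homes did the Great Fire of London destroy?";

// 1. Retrieve the most similar documents from Supabase.
const docs = await retrieveSimilarDocs(query);

// 2. Join their content into a single context string.
const context = docs.map((doc) => doc.content).join("\n\n");

// 3. Build the final prompt and send it to the model.
const prompt = getRagPrompt(query, context);
const aiResponse = await openai.responses.create({
  model: "gpt-4o",
  input: prompt,
});

console.log(aiResponse.output_text);
```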
Okay, so in the previous lesson you have
learned how to essentially construct a
retrieval mechanism that will take a
user's question, create some embeddings
of the question, go into the vector store and compare the user's embedding against what's in the vector store, retrieve relevant documents, create the prompt with that context, and send it to the model to generate an answer. Now we want to
explore briefly an intermediate step as
shown previously of text splitting. And
essentially what is this? Now imagine we
have this long document. This is just a
passage of historical information on the Great Fire of London in 1666.
Now, if you simply pass everything into
the vector store and embed all of this as
one chunk, what's going to happen is
when you ask a question, as we've seen
in previous situations, the vector store
is literally just going to return this
entire document. And this can be
problematic for various reasons. First
of all, maybe the user's question could only be answered by a particular paragraph. And so by giving the
model the entire context of the entire
passage, you're essentially increasing
the odds that it misses out on crucial
details that may be required to answer
the question. It could be as simple as
this sentence right here, but the model
might get confused or conflate it with all
the other information around that text.
The second challenge with simply just
grabbing the entire block of text is
that your model might not be able to
handle the full context window. As we know, models are limited by a context window, the number of tokens or characters they can take in for any generation. Now, obviously, in this case this is a relatively small passage or chunk of text, but you could have, for example, a PDF of hundreds of pages and hundreds of thousands of characters, and you would not be able to pass everything into the same context window. So in
order to overcome this issue, we need to
find a way to essentially split our
document into various chunks, embed each
chunk and store each of them in separate
rows in the table in Supabase. So the
only adjustment or the main adjustment
I've made here is to introduce this new function, which is a very basic text splitter. It's going to take the text, a chunk size you provide (how many characters to split the text into), and an overlap (how many characters should overlap between each chunk). This is very basic and only for demonstration purposes. As a matter of fact, for most of this course we are not going to use text splitting, but it is important for you to understand how it works and why it matters.
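A very basic character splitter in the spirit of the one described above might look like this; the exact implementation in the scrim may differ.

```js
// Split text into chunks of `chunkSize` characters, with `overlap`
// characters shared between consecutive chunks.
function splitText(text, chunkSize = 2000, overlap = 100) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```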
So once we've created this text-splitting utility function, the upsert documents logic has been updated. So if
you scroll further down, you'll see that
instead of just simply fetching the entire text and embedding that, we first split the document into chunks, and we have stored in the constants a chunk size of 2,000 characters, an overlap of 100, a similarity count of five, and a threshold of 0.5. And so when we upsert this
document, we are essentially going to
chunk. Then after we create the chunks,
we are then going to embed each chunk.
So the only difference between this
example and the previous time was
previously we just took the entire
document and we embedded it. And this
time we are taking each chunk and we're
embedding each chunk and we are adding a
row for each chunk. So essentially, you will end up with a table with multiple rows, each row representing up to 2,000 characters of the block of text and the associated embeddings. So I'm going to
run this function now. Go to your
index.js
and you can see the question how many
houses were damaged, and it's a very specific question about a corpus of text whose answer can only be found in one sentence.
So, we need precision. And so, this is a
good use case for splitting. So, I'm
going to click run in the top right.
Let's see what we find. And you can see
we've now embedded, we read 5,988
characters. We split into four chunks.
And the different chunks were embedded
into the vector store. Okay. So, now you
have four rows. You can verify on your end that, instead of that one row with one block of text, you now have different rows, one for each chunk. So I'm going to uncomment this whole block here. This is just a
repeat of everything we learned in the
retrieval lesson: we're going to go into the vector store and pass in the user's query. But you're going to notice something
different this time. Actually before I
generate this, let me just demonstrate
what I mean. So now we're going to run
again. And notice that we have fetched
more than one chunk of text.
Right? In the previous lessons, you saw
how it was just one block. But now you
can see it's an array of multiple
objects and each object represents a
chunk of text that was embedded. And
each chunk has a similarity score
assigned to it saying how relevant it is
to the question. And you can see that
based on the query provided, we said how
many houses were damaged during the
great fire of London. And so you can see
that this chunk already has the answer
to the question because the answer to
the question is 13,200 houses. And so
this chunk has already succeeded in
fetching the relevant portion. So
imagine you had thousands and thousands
and thousands and thousands of
characters or words and what the
embeddings approach has done is allowed
you to just quickly retrieve this small
chunk. So we can precisely provide this
context to the model. You can see the similarity scores are all above the match threshold that was set in the constants file. So you can see we have an array of different retrieved docs. Okay. Now we
want to just complete the remaining
steps as we did before. So just
uncomment this, and essentially we're creating the prompt with the chunks of text retrieved, which is all of this text combined together as the context string. We then essentially pass
in this prompt to the model and generate
response. So let's click on run and see
what happens. So now we're running and
waiting for the response. And according
to the context provided, approximately
13,200 houses were destroyed. And that's
exactly what we expected based on the
information here in this sentence right
here or paragraph rather. So now you've
seen how we've been able to basically take a relatively large corpus of text, split it into chunks, embed those
chunks, and now we're able to have more
precise answers to the question. So I
hope this gives a good overview on
dealing with inserting data into your
database as embeddings, retrieving them,
and answering questions. In the next
lesson, we're going to explore the Vercel AI SDK and discuss how it can streamline and make it much easier to build agentic apps.
So, as discussed earlier in this course,
we would later on interact with the Vercel AI SDK. Now, what is the Vercel AI SDK? It is essentially an open-source library by Vercel, the same team behind Next.js, that provides a unified interface to interact with LLM providers. You can plug and play
different providers. With the same
interface, you can access very useful
functions that are used for building AI
apps and agents for generating text,
generating embeddings, generating
structured output, generating different
steps that an agent can take, and it
just simplifies the entire process. So
it's a very useful library for us to use
especially in this course. For example,
you have a very simple generate text
interface that you can import from the
SDK. You provide your model and a
prompt. That's it. If I want to swap
OpenAI with Claude or any other model,
it's very easy to do so as opposed to
going through each of their
documentations and finding their own
different styles of generating an AI
response. The same thing for embeddings.
You can simply import embed from the Vercel AI SDK and use that unified interface.
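Here is a minimal sketch of those two unified interfaces, using the AI SDK with its OpenAI provider; the prompt and input text are just examples.

```js
import { generateText, embed } from "ai";
import { openai } from "@ai-sdk/openai";

// Text generation through the unified interface.
const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "Write a brief poem about the sea.",
});
console.log(text);

// Embeddings through the unified interface.
const { embedding } = await embed({
  model: openai.textEmbeddingModel("text-embedding-3-small"),
  value: "Hello world",
});
console.log(embedding.length); // 1536 dimensions for this model
```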
It doesn't matter what the model is, you
can still get the same outcome. So now
if we jump into the code, the config is
still the same package.json, we've
installed the AI SDK both here and here.
And then finally, we have imported the
OpenAI model that we're going to use as
our AI model. So the key difference now
between this and previous lessons is you
are not using OpenAI's interface
anymore. You're using the generic
interfaces provided by the AI SDK. For
example, here we want to generate text.
We provided the OpenAI model and the
prompt is to write a brief poem. So if I run
this, we should see a brief poem that's
been generated. The same applies to
embedding. So let me comment this out
and uncomment this. To generate
an embedding, you import the embed function from the ai package. You call openai.textEmbeddingModel and pass in the embedding model name. The value is just
whatever you want to embed. So this text
will be converted to embeddings. I'll
click run again and we see the text has
been converted to embeddings. You will
see we have a full 1536 items
representing the embedding for the text.
So all we've done essentially is taken
the same concepts but instead of hard
coding it to OpenAI's way of doing it,
we have these generic interfaces. And
this is going to be very useful for us
moving forward as we begin to build AI
agents that may potentially want to use
one or more models depending on the
strengths and capabilities each of them
have. So, with the basics covered, I'll see you in the next lesson, where we dive deeper into the Vercel AI SDK.
All right. In this lesson, we're just
going to do a brief exercise to recap
the basics of using the Vercel AI SDK. The challenge is that you are going to essentially replicate the logic from before, but with a twist. So the goal
is to generate text and embeddings using
the Vercel AI SDK. You are going to
implement the generate text interface in
here. Pass in a prompt asking the model to create a recipe for your
favorite meal. And then you're going to
return the generated text from the model
that is the recipe. Afterwards, you're going to have that value assigned to textToEmbed. And textToEmbed is
going to be passed into the generate
embeddings function which is ready to
take that text. Then we're going to
embed that text using the embed
interface. So that's the objective of
this challenge. I'm going to give you a
couple of minutes to do this. Go ahead, and once you're done, I will provide a solution.
All right, hopefully you didn't find that
too challenging. So let's kick this off.
So the first thing here if you remember
from the previous lesson is we want to
create an interface using the Vercel AI SDK's generateText. So we're going to have the text extracted from await generateText, and inside here we are going to pass in the model, which is the AI model already defined up here (gpt-4o), and then the prompt: create a recipe for making pizza. Whatever your favorite dish is, the idea is to use that; so, say, pepperoni pizza. Okay, so now we're going to send
this off to the model. It's going to
give us back the text and I'm going to
console.log and say generated text which
we'll pass in here and then finally
return the text. Okay, so that way you
can uncomment this, and now textToEmbed should be the value of the text. So
that's phase one and then phase two is
use the embed interface to generate the embedding. We will already have textToEmbed passed in as a parameter into this function. So all we have to do is extract the embedding from await embed, where the model is openai.textEmbeddingModel with the embedding model name passed in, and the value is going to be textToEmbed. So this is what you want to embed, which we've passed in,
which is the generated recipe of your
favorite dish. And then here we're just
going to console.log the embedding
generated. Okay. So now we're going to
run and see what happens. We can see the generated text: here's a simple recipe for making delicious homemade pizza, with pizza dough, sugar, olive oil, salt, pizza toppings, and so on, and it's got instructions. We
take all of this text and we generate
the embeddings from the text and that's
it in a nutshell. So now you've got the
basics of how to use the Vercel AI SDK to
generate text and generate embeddings.
In the previous lesson we discussed
briefly the basics of using the Vercel AI SDK to
generate text and generate embeddings.
But one of the more useful use cases
especially when building agentic AI apps, is the ability to
generate structured output. You give the model instructions to give you back structured output, which you can then use for something else. Maybe pass it into
another function that will call an API to retrieve data, or pass it into another model call in a chain so it can do something
else. Now, one of the useful interfaces
that the Vercel AI SDK provides to do this is
the generate object interface. This is
an interface that allows us to instruct
the model to generate structured output
in JSON format based on a predefined
schema. By schema, we simply mean you're defining the properties you want and their data types: do you want a string, an array, an object, and so on. And so
for example down here we have a function
called basic structured output using
this generate object interface to
essentially instruct the model to
provide a structure JSON output of a
recipe for pizza with the schema of
object that contains name ingredients
and the steps for creating the recipe
and each property is defined by a data type: name is a string, ingredients is an array of objects with a name and an amount, and so on and so forth. And we're using this z, which is imported from a library called Zod. Zod is a popular library used to define schemas and structured interfaces. So we're simply using Zod to help us define the schema. z.object() means you're creating a schema for an object; you're basically saying, I want an object. z.array() means you want an array, and z.array() with z.object() inside means an array of objects, with name and amount as strings. So when we tell the model to generate a pizza recipe, we expect back a structured JSON object that contains the name, ingredients, and steps.
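The recipe example being described looks roughly like this sketch; the prompt wording is an approximation of the one in the scrim.

```js
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { object } = await generateObject({
  model: openai("gpt-4o"),
  // The schema the model's JSON output must conform to.
  schema: z.object({
    name: z.string(),
    ingredients: z.array(z.object({ name: z.string(), amount: z.string() })),
    steps: z.array(z.string()),
  }),
  prompt: "Generate a pizza recipe.",
});

console.log(object);
```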
So let's run this and see what happens. Okay, so
we're now running this. We've sent the
instruction to the model. Please
generate pizza recipe and give us back
structured output. And here we go: the name, which is a string, as we expected; the ingredients, which is an array of objects with name and amount as strings, exactly what we expected; and the steps, which is an array of strings. And
that's literally how you can instruct
the model to give you back a JSON in a
structured output. You can see how this
can be very useful to then instruct or
pass on to another step in your program
to invoke something else and create a
chain of prompts or a chain of actions
that's more agentic by nature. Another
use case for structured outputs is
classification. Now in this example, we
gave a plain instruction and left the
model to exercise some form of
creativity. But classification is
usually binary. So in this case, we want
the model to classify the customer
review using either positive or
negative. So if you pay close attention,
we're using an enum type. An enum is just a special data type that constrains a variable to a set of predefined constants. You can have two, three, or more enum values. But in this
case, we want the model to follow a schema and provide a JSON that contains the reason for why it's deciding positive or
negative and whether the type is
positive or negative. We also provide
describe. Describe is just a useful property added to the Zod schema to help guide the model to know what you're looking for. So here I'm
just further emphasizing that this is
the sentiment of the customer review.
And so again, we use the generate object interface and provide the model, the schema name, the description of your schema, the shape of your schema (what properties you expect back and the types of those properties), and finally the prompt.
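That classification call looks roughly like this sketch, assuming positive/negative labels as in the lesson; the review text here is just a stand-in.

```js
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { object: sentiment } = await generateObject({
  model: openai("gpt-4o"),
  schemaName: "sentiment",
  schemaDescription: "Sentiment of a customer review",
  schema: z.object({
    reasoning: z.string().describe("Why this sentiment was chosen"),
    type: z.enum(["positive", "negative"]).describe("The sentiment of the customer review"),
  }),
  prompt: "Classify this customer review: 'The app worked exactly as I expected.'",
});

console.log(sentiment);
```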
Let's run this and see what happens. Okay, so here we go: the reasoning is that the statement indicates satisfaction as the app performed as anticipated, and the type is positive. And so we were able to
classify the text provided by giving the
model instructions using the generate
object interface to give us a structured
output and providing the enums of what
we expect. And so you can start to see
how this can be very useful for
classifying customer reviews,
classifying customer requests, customer
service. Structured outputs give us the ability to provide more structure so
that we can take and manipulate data
types to do something down the line. So,
in the next lesson, we're going to go
over a brief exercise so that you can
put this into action and begin to put
this in memory as well.
So, in the previous lesson, you learned
how to create basic structured outputs
using the Vercel AI SDK. In this lesson,
we're going to do a brief exercise just
to get you used to how the interface of
generate object works and how Zod works.
So, the first goal is to build a Zod
schema. You're going to place a simple
sandwich order. All the instructions are
down here, including what each property should be and what the data type of each of them should be, with the steps provided as well. Once you've
completed that for this function, you
want to uncomment here and run. After
you've run, the next exercise is to
build a Zod schema for classifying a
short message. Again, all the
instructions have been provided here.
how to classify a short user's message
based on what's being sent, what the
message is. So this is the message that
we are using and you will complete this
as well. So I'm going to give you a
couple minutes to work on this and once
you're done I will provide a solution.
Good luck.
So hopefully that wasn't too difficult
but we will go through everything slowly
right now. So let's start off with the
sandwich order. As you can see per the
instructions, you can see the different
data types as expected for the required
fields to fill into Z.Object which you
will then pass in as the sandwich
schema. So let us begin. The first thing
is the size. So we want size and we want
an enum. Remember, an enum is a constrained data type. So we can just
pass in small, medium and large. And you
can also provide a description just to
help the model as well. Overall size of
the sandwich. The next is bread. So as
we said up here, bread is a string. So
that's pretty straightforward: z.string() and describe. This is the type of bread.
Example, wheat, white, etc. The next
data type is toasted. That's a boolean
type. So toasted is z.boolean(), describe, and
this is whether the sandwich will be
toasted. We have toppings array of
strings. So the data type has been
provided to you here already. So you just pass in toppings as z.array of strings, and then .min(1), meaning at least one is provided. I think this is complaining because we didn't add a comma there. And describe: one or more toppings like tomato, potato, lettuce, pickles. And finally the notes. So it's
just an optional string. So you're going
to do Zstring. Then you're going to pass
in optional. So you're just letting know
it's optional. and optional free text
notes like cut in half. And so the
prompt is here as we have provided
previously for you. We have the sandwich
order, a simple sandwich order, the
schema has been provided and the
console.log.
So we're going to scroll up here,
uncomment this and run and see what
happens. Okay, so here's the structured
output returned by the model. We can see
the size is small. The bread is
sourdough as we have the different
choices here. Okay, so small sourdough bread; toasted is a boolean that says true; toppings, an array of strings, exactly what we wanted; and notes, which is cut in half. So that is the basic structured
output. Moving on to the next exercise
about classifying the short user
message and completing the message classification schema that has been provided to you here. So let's start off with
reasoning. So reasoning is a string and
let's say describe and this is a brief
explanation for why this label fits or
however you want to describe that. So, to classify the user's message, we're providing a reasoning, and then we're going to provide the label. The label is also an enum. Let's put the comma here so it stops complaining. And this enum is going to take values like compliment and complaint.
So we've got the reasoning and then
we've got the label and let's also
provide describe so that the model has a
better description of what we want to
achieve here. So: high-level category of the message. So this is the message schema
and then you got the message here. You
could also have a group of messages. So
you could have a loop for basically an
array of messages and then for each
message you can loop over and generate
the object for that particular message.
So now we have the generate object. Now
classify the user's message below. That's the prompt sent to the model. And now we
uncomment classification structured
output exercise and comment out the basic structured output exercise. And now
we're going to go up and run and we
should see the label is complaint. The
reason is the message expresses
dissatisfaction with the app due to
technical issue. So there we go. So
you've been able to implement two types
of use cases for structured outputs. one
where you're able to essentially provide
a prompt to the model to construct a
JSON with the properties and data types you want, and a second type which is more focused on constrained classification. So I hope this was useful; keep practicing,
and I'll see you in the next lesson.
So in the previous lessons we've covered
the basics of the Vercel AI SDK: generating text, generating embeddings, and structured outputs, and how you can effectively provide an interface that
allows us to instruct the model to
provide a JSON in the structure that we
want. But where is this all pointing to?
Where does this all go to? Well, it goes
to eventually building agents that are
able to execute what's known as tool
calls. Now what's a tool and what's tool
calling? If you've never used function
calling before, essentially tools are
just actions that the model can invoke.
So we take an instruction, we give it to
the model, the model provides some sort
of structured output and then we take
that structured output to invoke a
function. We pass that in as a parameter
of some sorts and then we're able to
invoke another function. And so the
results of these actions can then be
reported back to the model and then the
model can then generate a final
response. So, how does this work in a
nutshell? Let's use this basic weather
example to illustrate what's going on.
Remember, we've already covered using the generate object interface in the previous lesson for structured outputs. And so, what we want to do now is use the generate text interface provided by the Vercel AI SDK. What this expects is a model,
tools, and a prompt. Now, we've already
covered defining and constructing schemas in the previous
lesson. And so essentially we provide
the model which is already defined here.
Tools are the functions that you want to
invoke based on the structured outputs
the model generates. Remember in the
previous lesson we learned how to
provide a schema to the model and how
that schema would be generated as JSON.
Well, all of this logic is just
encapsulating defining the schema in
this tool. So this tool will contain
a description (you want to get the weather for a location) and the schema that you want the model to enforce: you want the model to provide the location as a string in an object. And we have this execute
function. This execute function is then
called passing in the location into the
parameter to invoke this function. The
results of this function will then be
displayed. Okay, I'm going to run this
because I know it might not be too clear
right now, but hopefully this example
will make it make sense. So remember,
all a tool is is a function or an action
that the model can invoke. It is
essentially a way of taking structured
outputs generated by the model and then
extracting that to then pass in as a
parameter as we've done here. So we can
invoke a function. So we have the tool
call here and the tool call has an ID
and the tool name is weather. Okay, we
are trying to get the weather of a
location. The user provided a prompt: what is the weather in New York? The model sees this prompt and creates a structured output with New York as the location. The location is then passed into the execute function, and what's returned is the location and a temperature that's just a random number, for demonstration. And so we can see here this is
the tool call which has the input as the
location New York because we provided a
schema as we learned in the previous
lesson and now we have the tool result.
This is the result of calling this
function. So all of this is happening
under the hood. First the model
generates the schema, then it's passed
in as a parameter. The functions invoked
and that's why we have the output as
location and temperature is this random
value. Okay, so that is the difference between a tool call, which is more of just the structured input, and the tool result, which includes the result of invoking the function. So you can imagine that this could be an API call to your database or any other place. The key point here is that the parameter used to invoke the function was a structured output generated by the model from the prompt that you provided.
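The single weather tool being described looks roughly like this sketch. The schema property is named inputSchema in current AI SDK releases (older ones call it parameters), and the temperature is a made-up random value, as in the demo.

```js
import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { toolCalls, toolResults } = await generateText({
  model: openai("gpt-4o"),
  tools: {
    weather: tool({
      description: "Get the weather for a location",
      inputSchema: z.object({ location: z.string() }),
      // Invoked with the structured output the model generated.
      execute: async ({ location }) => ({
        location,
        temperature: 50 + Math.floor(Math.random() * 30),
      }),
    }),
  },
  prompt: "What is the weather in New York?",
});

console.log(toolCalls);
console.log(toolResults);
```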
Okay, now what if you want to provide multiple tools? Sometimes, to get the results we
want it's not sufficient to just provide
one tool. So let's say in this case we
want a situation where we have two
tools. We want to first get the weather
in a location and then get the tourist
attractions in the location. Now all of
this should be familiar to you because
the schema is defined in a way familiar
from the structured outputs lesson. We
want to tell the model that we want an
object that contains city as a string.
We also want to execute this function with the city extracted from the input schema passed in as the parameter, and then we want to return these attractions. These are
hard-coded values. And so if I go
further up here, let's run. And when I
ask the question, what is the weather in
New York? And what are the best
attractions to visit? The model is going
to see this. First of all, the first
tool, which is the weather tool, is going to use its schema to extract New York as the location, pass that in here, and
return this value. For the second part, which is the best attractions to visit, the
model's going to see that and realize it
needs to invoke the city attractions
tool. Looking at the descriptions, it's
going to extract New York as the city
and it's going to return New York and
the attractions. Let's see what happens.
Okay, this time instead of one tool call
and one tool result, we see tool call
one which has the location New York and
tool call 2 which has city New York and
we also see the tool results for both of them. First we have the output location New York with temperature 43, and the second tool result is the city attractions tool result, which is New York and the values here. So you
see how we've gone from generating
structured outputs to now being able to
take those generated structured outputs
and be able to use them to invoke
functions so we can get back data. Now
of course all of this would not be
complete without a way to generate the
final response from the model. Remember
the whole point behind retrieval
augmented generation is the ability to
fetch context outside of the model's
training data so we can pass it back to
the model in the prompt and generate a
more context relevant answer to the
user's query. So the whole point behind
this is not to be left alone with these tool calls and tool results. Perhaps, like I said, you've called an API and retrieved the relevant information, but now we need to
pass that relevant information to the
model so we can generate a final answer
and so this is where this last step
comes in and versel AI SDK provides
these useful interfaces as we've seen
previously the generate text the tool
interface and so on and now we are able
to use what's called the stop when so As
it says here, by default, tool calls
will just return the results of
executing the function, which is nice,
but we need to take that to the model to
summarize the tool results. What the
stop when property does is it basically
tells the model to loop over itself
again and take x number of steps beyond
just generating the tool results. So
after the tool results are returned, we
can then tell the stop when to basically
take another step to provide the tool
results to the model and that would tell
it to generate the results. So the way
to think about stop when is just how
many times to invoke the model in the
process that we're currently running. So
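As a rough sketch, reusing the two tools from the earlier snippets (stepCountIs is the AI SDK 5 helper for this kind of stopping condition):

```js
import { generateText, stepCountIs } from "ai";

const result = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "What is the weather in New York? And what are the best attractions to visit?",
  tools: { getWeather, getCityAttractions },
  // Without this, the run ends once the tool results exist. Allowing up to three
  // steps lets the SDK feed the tool results back so the model writes a summary.
  stopWhen: stepCountIs(3),
});

console.log(result.text);  // the final, summarized answer
console.log(result.steps); // per-step tool calls and tool results
```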
Here we have three steps: in the first steps, the model creates the structured outputs and the execute functions are called, which we've seen; and the third step we want is for the model to summarize the results from the data returned by invoking these two tools. That's why the number three has been passed here, and then we have some extra logic to extract that. So let me run
this. And you can see we've essentially
extracted the tool results and then the
final result generated by the model is
what we can see here. So let's scroll back: we've logged the tool calls and, as expected, location New York and city New York. The model has seen the tool results and summarized them in a more human way: the current temperature is 72 degrees, and it lists the tourist attractions. So this is not too different from what we've learned already: storing embeddings, retrieving relevant documents, passing them as context, and having the model generate an answer. The difference this time is that by using tools we can provide custom functions. Those functions could call the vector store to retrieve embeddings of relevant documents, call an API, or call your database. It gives you a lot more flexibility in where you retrieve the context you pass to the model, and you can see how this lets you build more powerful, complex, multi-step agentic applications. So in the next
lesson, we're going to go over some
exercises to help you have a better
understanding of how tool calling works.
In the previous lesson, you learned the basics of using the Vercel AI SDK for tool calling. As I mentioned before, tool calling is essentially taking structured output from the model and invoking a function that you define based on the inputs constructed from that structured output. In this exercise, you're going to practice this. The goal is to implement a single tool that fetches a grocery item from some in-memory data, mimicking what would happen if you fetched the price from a database; then a situation where you have more than one tool, adding a delivery ETA tool as well; and then using stopWhen in the third case. For each of these, you
have all the comments. So you would
start with number one, uncomment, run
it, number two, read the instructions,
follow the todos, and number three as
well. So you should be able to go
through all three of them. It should
take you a couple of minutes to get
through them. Make sure to command S or
control S to save your changes and then
click run to run and see what happens.
So best of luck. Take a couple minutes,
do this, and I'll be back with the
solution shortly.
Now, let's begin with the first one. The first challenge is to create a price lookup tool that takes an item string and returns the item's price using the price table. The first step is to define the tool with a Zod schema, and we already have the tool here, so now we define the schema. z.object is already open, so you just pass the string type into this input schema. Remember, we want the model to return an object with an item property, the item being whatever we want to search for, in this case milk, so we want milk to be passed in here. The next thing to do is implement the remaining portion, the execute function. All you have to do is uncomment this section: we take the item in, look up its price in the price table (lowercasing the item just to be safe, and falling back to null if it isn't found), and return the item together with its price.
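A condensed sketch of that solution (the table values mirror the prices mentioned in this exercise; the tool name is illustrative):

```js
import { tool } from "ai";
import { z } from "zod";

// In-memory price table standing in for a database lookup.
const PRICE_TABLE = { milk: 1.59, bread: 2.49, eggs: 3.29, banana: 0.59 };

const priceLookup = tool({
  description: "Look up the price of a grocery item",
  inputSchema: z.object({ item: z.string() }),
  execute: async ({ item }) => ({
    item,
    // Lowercase to be safe, fall back to null if the item is not in the table.
    price: PRICE_TABLE[item.toLowerCase()] ?? null,
  }),
});
```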
And so now we can pass in price lookup
into tools. So we just need to uncomment
this. And price lookup is being passed
in as a tool. So when we ask how much milk costs, it's going to extract milk as the item, invoke the function, and return the tool result, which is whatever this function returns. With that said, let's uncomment the first one, save, and run. Okay, as expected, we've got the tool call with milk as the item, which becomes the input to the function. As we discussed, the model extracted milk as the item per the schema, and the tool result from the price lookup now includes the milk item and its price, 1.59, which corresponds to what we have here. Okay,
so that's the case where we have one
tool call. Let's move over to the case
where we have two tool calls and we want
to introduce a delivery ETA that takes
address string and returns a pretend
estimated time of arrival. So here are
all the instructions again and we're
just going to follow the sequence: the tool interface, the description, the input schema as we've seen, and here's the solution (hopefully you didn't peek at it before attempting the previous one). Now we've got this deliveryEta tool, which estimates delivery time, so we need to pass in address as z.string(): the address is going to be a string, and that's the JSON shape we want the model to enforce. Inside execute we pass in the address, and the ETA is a random number assigned by this expression; then here we just remove the placeholder and pass in the address. So we
should get back from this function an
address and estimated ETA minutes which
is the random number generated here. Then, based on the question, we should see milk extracted, and we should also see an address extracted and returned by this deliveryEta function. If we run this (making sure we first uncomment it), we should see the tool calls and tool results. So
let's see what the model called. The first item it extracted was milk; we expect that, because we defined item in the schema, so it saw milk and made a tool call for it. It also made a tool call for bread, because we also put bread in the prompt. And finally there's a tool call for the address, which was extracted from the prompt. Now we have the tool results: milk with the price 1.59, bread with the price 2.49, and finally the address with the estimated minutes calculated by the tool function further down. So I hope you're getting the hang of this. Essentially, all we're doing is sending a prompt to the model, giving the model a schema so it can construct a structured output, taking the inputs from that structured output, passing them as parameters into a function, invoking the function with those parameters, and then returning the results back to the model to generate a final answer, which is the third and final exercise we're going to do right now. Because as I said
before, having the tool results is
extremely useful. But if you want to
complete the process, we want the model
to essentially summarize the results
like we would do with RAG. You might not just want to retrieve the relevant documents from the vector store; you also want to synthesize them with the prompt so you generate a final answer. And so now we move to the last challenge. We want the model to call the tools, receive the results, and then summarize them in a final turn using stopWhen. So we've got the priceLookup tool and the deliveryEta tool, which, as we covered, take the item and the address, and now we want to use stepCountIs to set the number of steps. Remember, the model is invoked to generate the tool calls once, then twice, and now we want a third step where the model generates the final result from the tool results. The model is invoked once (step one), twice (step two), and then a third time, which is why we define stopWhen with a step count of three. The prompt is: I want to buy eggs and bananas, use tools to check prices and tell me the total cost, and estimate delivery time to 221 Baker Street. So we have the tools defined, the delivery ETA, and the results provided as well. So
we are going to call this and see what
the results are. And there we go. Final
summary. The total cost for eggs is
$3.29.
As we can see in the dummy database,
banana is 0.59. So the total cost is
3.88.
The estimated delivery time to the
address is 26 minutes. And this was
exactly what the user had asked for
based on the prompt. I hope you found
this useful. You can see how we've gone from a basic prompt, to generating embeddings, to retrieval from the vector store, to exploring the ability to use tools to extract information by invoking functions. And you can see how this leads to multi-step runs, where a model can be involved in multiple steps to reach a final outcome, sometimes involving one or more functions based on inputs provided by the model, with the model at the end summarizing the information. So I'd recommend you just
keep practicing so you get a good feel
for this and once you're more
comfortable you can go to the next
lesson where we're going to start
piecing this together in building the
final customer support AI agent.
So in previous lessons you've learned how to use the Vercel AI SDK to create structured outputs, generate embeddings, generate text, and we've also introduced tool calling. Now we're going to move closer towards the final project of building a customer support AI agent. The first thing we're going to do is create this basic agent routing architecture. As you can see on the screen, unlike the usual retrieval flow we covered earlier in this course, where you embed the documents, the user asks a question, we retrieve the relevant docs and answer the question, always following that one path, we are now going to introduce two branches. It's no longer one direction: we give the model the opportunity to classify whether the user's question requires retrieval. If it does require retrieval, we continue doing what you've learned so far in the earlier parts of the course. If it doesn't, we just answer the question directly using whatever the model has been trained on. So let's go into the code.
Now, what I've done here is introduce a couple of things. The first is replacing all the primitive OpenAI library functions with the Vercel AI SDK. So when we ingest, or upsert, the documents, we use embed and the structure the Vercel AI SDK expects. The same goes for retrieval: we embed using embed from the Vercel AI SDK instead of the OpenAI primitive, because in the config we export a client created with createOpenAI from the SDK rather than from the OpenAI library. These are the key changes, which of course lead us to using generateText, the interface provided by the Vercel AI SDK. We've also introduced
some new variables in the constants
file. For example, the classification
model is what model we're going to use
for classification to decide whether to
perform retrieval or not to perform
retrieval. So in this case, I'm using
the same model for both. But you're
going to find a lot of use cases where
you might want to use different models
for each of them depending on the
strength of the model. In here we've also got the knowledge base description, which I'll come to in a second once I finish the quick tour. We've got all the different
functions I'm going to showcase here
which I'm going to uncomment one by one.
And then we've got a prompts file. Now
this prompts file contains all the
prompts used in this example from the
prompt used to retrieve to the prompt
used to classify and so on and so forth.
I'll come back to this in a second. The other introduction here is the docs folder. In the docs folder I have extracted, as markdown, the various pages that Scrimba has in its help desk: it has multiple web pages providing FAQs and answers to common support questions. So I've scraped all those web pages and converted them into markdown, which is friendly for models and embeddings. When we run this ingest documents function, we're really just looping through each of these, extracting the text and embedding it, and then each embedding and its text gets inserted into the database. So I'm going to embed and insert into my database snippets of some of Scrimba's help documents. Once that's done, we'll perform a basic retrieval, and then a classify-and-retrieve, which is more agentic as per the diagram I showed
you earlier. Okay, so let's start with ingest documents. In the previous retrieval lesson, you learned how to do this without the Vercel AI SDK; as I said, the only difference is that I've swapped in the embed interface. There is no splitting and chunking here, because we're just looping through each file and none of them contains anything particularly long. So we clear all the contents of the database, and then for each file we loop over it, embed it, insert it, and attach a metadata property with the source set to the file name. This is very useful down the line when you want to inspect which file a chunk came from, or display it in a UI. And then we insert all of them.
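A minimal sketch of that ingest loop; the folder, table name, and embedding model are assumptions based on the course's Supabase setup rather than the exact file, and `openai` and `supabase` are the clients configured elsewhere:

```js
import fs from "node:fs/promises";
import path from "node:path";
import { embed } from "ai";

const docsDir = "./docs"; // assumed folder of markdown help articles
for (const file of await fs.readdir(docsDir)) {
  const content = await fs.readFile(path.join(docsDir, file), "utf8");

  // Embed the whole file (no chunking needed here since the files are short).
  const { embedding } = await embed({
    model: openai.embedding("text-embedding-3-small"), // illustrative embedding model
    value: content,
  });

  await supabase.from("documents").insert({
    content,
    embedding,
    metadata: { source: file }, // lets us trace a chunk back to its file later
  });
}
```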
So let's run this and go back to index.js.
So here we go: ingesting documents from the directory, we loop over each of them, read in their contents, and we've successfully uploaded nine documents to the Supabase documents table, as per what we did in the retrieval lesson. And this is what one of the objects looks like: the metadata points to the actual document, and then of course we have an embedding property, the text, and the metadata for each of these files, all of which is put into the database.
So we're familiar with this. Comment this out, and let's jump over to basic retrieval. I've got a question here: how do I access the Scrimba Discord? What we expect to happen is for this function to kick off: retrieve similar docs takes the query, embeds it, and sends it to Supabase via a remote procedure call to the match documents database function, which we created earlier in this course, to retrieve the relevant docs for that query. When we get the relevant documents back, we pass them to the combine documents function, which just extracts the text and joins it all together. We create the RAG prompt, which you saw in the previous retrieval lesson, essentially saying "below is the context, answer the question based on the context provided", and then we generate the text again using the Vercel AI SDK interface and console.log the generated text.
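Sketched out, that basic retrieval path looks roughly like this; the match_documents parameter names are assumed to match the database function created earlier in the course, and `openai` and `supabase` are the configured clients:

```js
import { embed, generateText } from "ai";

async function retrieveSimilarDocs(query) {
  // Embed the query with the same model used for the documents.
  const { embedding } = await embed({
    model: openai.embedding("text-embedding-3-small"),
    value: query,
  });

  // Remote procedure call to the match_documents function in Supabase.
  const { data, error } = await supabase.rpc("match_documents", {
    query_embedding: embedding,
    match_count: 3,
  });
  if (error) throw error;
  return data; // [{ id, content, metadata, similarity }, ...]
}

const question = "How do I access the Scrimba Discord?";
const docs = await retrieveSimilarDocs(question);
const context = docs.map((d) => d.content).join("\n\n"); // combine documents

const { text } = await generateText({
  model: openai("gpt-4o-mini"), // illustrative model id
  prompt: `Below is some context. Answer the question based only on it.\n\nContext:\n${context}\n\nQuestion: ${question}`,
});
console.log(text);
```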
Now, in the docs we've got one support document about linking your Scrimba account to Discord and another on how to join Discord. So the expected behavior is that we retrieve the relevant docs from the vector store and the model provides a relevant answer to the user. So let's run again. There you go: to access Discord, use this link, which is in alignment with what we saw in the doc.
And as you can see, it looked at the relevant docs. We've got a similarity score of 0.48 for the chunk "can I download the code in a scrim file", so it picked that up. It also picked up the chunk on linking your Scrimba account to Discord, which I showed you as well, and all of this content was passed as context. And finally, "how do I join the Scrimba Discord", which had the highest similarity score. If you remember the first retrieval lesson where we covered match thresholds, you can actually set a threshold at the point of retrieval: there's another property that lets you set a minimum, so you could retrieve only documents with a similarity score greater than or equal to a certain number. That way we might cut off the less relevant chunk and only keep the top two. But I left everything in so you can see that what we expect to happen is exactly what happens. So
if we go back now we want to go to the
last section. This is the classify and
retrieve function. So what's going on
here? This is a bit more involved than
what you saw previously. So I'm going to
take this slowly. So essentially the
main step is that rather than just send
the query directly to do the embeddings
in the retrieval, we want the model to
exercise some form of agency. We want to
classify the prompt first. So we want to
give the model the prompt and then let
it classify whether or not it's a
general question. If it's a general
question, we let the model just answer
the question directly using its own
training. Else if it's a retrieval, then
let it continue with the process. Embed
the query and call the remote procedure
call to get the relevant docs and so on
and so forth. I've also introduced a fallback, for the situation where we don't actually retrieve any relevant documents; in that case we just send the question to the model and generate an answer directly. Provided we do get relevant documents back, we continue through the process: combine the documents, build the RAG prompt, and so on. Something else that's pretty interesting here is that we also map over the retrieved sources and give each one a type as well as where it came from, which puts more structure into the retrieved docs, so you get both the answer and the sources it was generated from. And if we run into some sort of error, we again just default to generating a general answer. That is the classification flow in a nutshell.
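Put together, the routing reads roughly like the sketch below; the helper names (classifyQuestion, combineDocuments, the prompt builders, `model`) are assumptions describing the course's files rather than their exact code:

```js
async function classifyAndRetrieve(question) {
  const route = await classifyQuestion(question); // "retrieval" or "general"

  if (route === "general") {
    // Off-topic: answer directly from the model's training data.
    const { text } = await generateText({ model, prompt: getGeneralPrompt(question) });
    return { answer: text, sources: null };
  }

  const docs = await retrieveSimilarDocs(question);
  if (!docs || docs.length === 0) {
    // Fallback: nothing relevant came back, so answer generally instead.
    const { text } = await generateText({ model, prompt: getFallbackPrompt(question) });
    return { answer: text, sources: [] };
  }

  const context = combineDocuments(docs);
  const { text } = await generateText({ model, prompt: getRagPrompt(context, question) });
  return {
    answer: text,
    sources: docs.map((d) => ({
      type: "knowledge_base",
      source: d.metadata?.source,
      similarity: d.similarity,
    })),
  };
}
```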
Now let's go a bit deeper into the prompts that drive this behavior.
First of all, the classification prompt. This generates a prompt for classifying the question as retrieval or general. We pass in the question and the knowledge base description, which lives in the constants: Scrimba, an online platform for learning to code. And this is the prompt: classify the user's question; the goal is to use a customer support knowledge base about Scrimba; if the question is about Scrimba itself, or it's coding-related and technical, and so on, and if the question is clearly off topic, respond "general", otherwise respond "retrieval". Then we append the question, leave "Classification:" open for the model to complete, and send that off to the model. So this is the prompt used to classify. As I said previously, you could instead express this as a structured output instruction to the model, an enum that returns either "general" or "retrieval", but I'm keeping it simple here for the sake of your understanding.
Then we've got the general prompt, which is just a direct prompt: answer the following question concisely, with the question passed in. We've got a fallback prompt, and we've got the RAG prompt you're already familiar with: you're a helpful assistant, answer the user's question based on the provided context, and so on. So it's pretty straightforward: all we're doing is classifying the question, generating either a retrieval prompt or a general prompt, and sending that to the model to provide a final answer. A few
other things to note here are the use of maxOutputTokens and temperature. Here we're limiting the amount of text returned: we really just want the model to say "retrieval" or "general", and 20 tokens is enough for that. Temperature controls the randomness of the output, how much creativity you want the model to have; in this case I want zero, because I just want "retrieval" or "general" and don't want the model to express any creativity whatsoever. That is the reason these two properties are set to these values.
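The classification call itself is tiny; a sketch, assuming AI SDK 5's maxOutputTokens option, with the constant and prompt helper names assumed rather than taken from the course code:

```js
const { text } = await generateText({
  model: openai(CLASSIFICATION_MODEL), // assumed constant from the constants file
  prompt: getClassificationPrompt(question, KNOWLEDGE_BASE_DESCRIPTION),
  maxOutputTokens: 20, // just enough for a one-word label
  temperature: 0,      // deterministic: we want "retrieval" or "general", nothing creative
});

const route = text.trim().toLowerCase().includes("retrieval") ? "retrieval" : "general";
```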
With that being said, let's jump back to the main section. I'm asking the same question, and the behavior I expect to see is a classification of retrieval, and that's exactly what happened. How do I access the Scrimba Discord? We classify the question, it's classified as retrieval, we perform retrieval, generate the embeddings, retrieve the chunks, and generate the RAG answer, and that's the answer we get. Now you can see
the retrieved docs here; these are the source documents, so you can see the ID, content, metadata, similarity score, and type, which were all defined in the agentic retrieval file. Now let's say I
completely change topic. What is the
capital
of France? Okay. Now what we expect is
different behavior where the
classification should be general. So
let's run this. And now we're answering
a general question and the generated
answer is Paris. The retrieved docs are null because we never went to the vector store. So you can see how we've gone from asking a query and always performing retrieval, a one-dimensional approach, to introducing more agency, where the model classifies queries and can route in different directions depending on what the query is. So in the next lesson, we're going to go deeper into this so you can see how this routing builds up to the final customer support AI agent.
In the previous lesson you learned how to route the agent's outcomes in different directions, either towards retrieval or towards answering the question generally. In this lesson, we're going to do a brief challenge where you refactor the agentic retrieval logic to introduce structured outputs, which you learned a couple of lessons ago, using the Vercel AI SDK. So we're going to go to the file where the agentic retrieval logic is and follow the instructions to use the generateObject interface to create structured outputs, so that we can classify whether the query is retrieval or general. So what you're going to do is hop over to the agentic retrieval file. We have the classification prompt there already, and the decision is already assigned to the classification extracted from the object. All you have to do is fill in the gaps: construct the schema property using Zod and everything you've learned, including the enum types from the structured outputs lesson, and then also add a property for the prompt sent to the model. Once you've done that, make sure to Control S or Command S to save your changes and then run. I'll give you a couple of minutes to do that and then I'll come back with the solution. Good luck.
So, let's work through this slowly. We know we've got the model, the schema name, and the schema description already, so the next thing we need is the schema itself. Remember, we're using Zod, so we need to construct the object we expect to get back. The first thing I usually recommend is a reasoning property, so you can see what reason the model gives; let's describe it as a brief reasoning for the classification choice, why it chooses retrieval or general. Next we define type as z.enum and pass in "retrieval" and "general" as the enum values, and we can add a describe to help the model: is the question general, or does it require knowledge base retrieval? And finally, number two, add a property for the prompt sent to the model: we go down here, we already have the classification prompt constructed, add a comma, and voilà, that should be that.
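The finished classification, sketched with generateObject (the constant and the prompt helper are assumed names):

```js
import { generateObject } from "ai";
import { z } from "zod";

const { object } = await generateObject({
  model: openai(CLASSIFICATION_MODEL),
  schemaName: "classification",
  schemaDescription: "Classify whether a question needs knowledge base retrieval",
  schema: z.object({
    reasoning: z
      .string()
      .describe("Brief reasoning for the classification choice"),
    type: z
      .enum(["retrieval", "general"])
      .describe("Is the question general, or does it require knowledge base retrieval?"),
  }),
  prompt: getClassificationPrompt(question, KNOWLEDGE_BASE_DESCRIPTION),
});

const classification = object.type; // "retrieval" or "general"
```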
If we go back to index.js, everything has been imported and is ready to go. Click run, and the question should be classified as retrieval. As you can see, the reasoning here is that the question is about Scrimba's Discord, which is directly related to the Scrimba platform and user support. We've got the type classified as retrieval, and we generate the answer as expected. So practice this a couple of times and you're ready. I'll see you in the next lesson.
In previous lessons, you learned how to use the OpenAI SDK to generate responses based on a prompt. That was using OpenAI's built-in Responses API, and the Vercel AI SDK wraps around it to provide a simple abstraction over that functionality. But aside from just generating text, the Responses API also provides built-in tools like file search, web search, and computer use for interacting with web pages. You can invoke such a tool directly through OpenAI's built-in functionality, get the latest results from a web search, and pass those results on to the model; you can see an example of the data returned by the web search tool. Let's dive into the
code. As you can see, we've imported openai from the Vercel AI SDK's OpenAI provider, and we have the generateText interface for generating text, as covered previously. Then we've got the model and the question: what is the latest OpenAI large language model? As of today, GPT-5 is the latest large language model by OpenAI. First of all, we're using openai.responses and passing in the model; secondly, and most importantly, we're passing in tools with a web search preview entry set to openai.tools.webSearchPreview() with empty parentheses, so nothing is passed inside; that's just the expected syntax. What happens under the hood is that the Vercel AI SDK interacts directly with the OpenAI API to invoke the tool, and the model then generates an answer based on the tool results. So essentially this is an abstraction over a lot of the functionality we've discussed previously with tool calling, returning the tool results, and having the model generate the answer.
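A sketch of that call; the model id is illustrative, and the web_search_preview key follows how the AI SDK's OpenAI provider names this built-in tool:

```js
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text, sources } = await generateText({
  model: openai.responses("gpt-4o-mini"), // illustrative Responses API model id
  prompt: "What is the latest OpenAI large language model?",
  tools: {
    // Built-in OpenAI web search tool; no options passed, as in the lesson.
    web_search_preview: openai.tools.webSearchPreview({}),
  },
});

console.log(text);    // answer grounded in fresh web results
console.log(sources); // URLs, titles, and ids of the pages that were consulted
```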
It's just been abstracted and simplified. So if we click run, we can see the text: as of September 2025, OpenAI's latest large language model is GPT-5, and so on. So this is the
text generated as a result of context
from the web search passed onto the
model. If we scroll further down and look at sources, you'll see what was fetched from the web: the source type, URL, id, and title. The contents of those URLs were scraped and passed as context; the model saw the scraped contents of these web pages along with the question, and generated a final answer from them. So this is essentially the same mechanism as we saw in retrieval: we're still performing similar behavior, going somewhere, retrieving context, and passing that context back into the final prompt that's sent to the model. And so this is
essentially how we can incorporate a
tool for web search. And you can see how
we can build on this and augment
pre-existing
agents with multiple tools and add one
of those tools as web search. And so
we're going to see how this works in the
next lesson.
So in the previous lesson you learned about using the web search tool provided by the Vercel AI SDK with OpenAI under the hood: the OpenAI Responses API being used as a tool to retrieve information from the web in real time. Now, in this lesson, we're going to look at how to combine the entire retrieval mechanism, where you go to the vector store and retrieve relevant information to answer the question, with the web search tool, so the agent can decide which one to use based on the question. Now, what is new?
based on the question. Now, what is new?
First of all, we have taken the entire
logic for retrieving and instead of
having a standalone file where we run
the retrieval, we've converted it into a
tool. Remember, a tool is essentially an
interface provided by VEL AI SDK that
allows us to define what exactly we want
to invoke based on structured output
returned by the model. Here we will
provide a tool that is going to retrieve
relevant information about Scriber based
on the user's query. And so that query
is then going to be passed in to do
everything else you've learned in this
course. We embed the query, then we
query the Superbase database, and then
we return the retrieve docs. So
essentially everything you learned about
retrieval is now wrapped inside this
async function in execute in this tool
interface. Now if we go to web search
retrieval agent, we have two tools. The first is the knowledge base tool: the entire vector store retrieval mechanism converted into a tool. We also have the web search tool, which we covered in the previous lesson. We pass in both tools, and then we've got stopWhen, also covered in previous lessons, with a maximum step count of three, so the model can iterate up to three times: generate the tool results, pass the tool results into context, and then generate a final response.
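Pulled together, the agent call looks roughly like this; knowledgeBaseTool, the constants, the tool key names, and the prompt helper are assumed names standing in for the course's files:

```js
import { generateText, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await generateText({
  model: openai.responses(TOOL_CALLING_MODEL), // assumed constant
  system: getRetrievalWebSearchPrompt(KNOWLEDGE_BASE_DESCRIPTION), // assumed prompt helper
  prompt: question,
  tools: {
    knowledgeBaseSearch: knowledgeBaseTool, // vector-store retrieval wrapped as a tool
    web_search_preview: openai.tools.webSearchPreview({}), // real-time web search
  },
  stopWhen: stepCountIs(3), // tool step(s) plus a final summarizing step
});

console.log(result.text);  // the final answer
console.log(result.steps); // which tools were called, with their results
```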
We've also introduced a new prompt function, the retrieval web search prompt. Here we pass in the knowledge base description: Scrimba, an online platform for learning to code. And if you look at the prompt, it essentially says you're a helpful assistant whose primary goal is to answer the user's question accurately, and then provides conditions: use the knowledge base search for Scrimba questions, use web search if you need real-time information, and answer directly if you already know the answer. So it's just a prompt to encourage the model to pick the appropriate tool based on what's required. All right. Further
down, we just have some logic to extract the sources. If it's a web-search-based tool call, we extract the web title and URL. But if we went through the knowledge base retrieval step, then we access the steps property inside the result, which holds the steps the model took when invoking each tool. We check whether a step was the knowledge base search and, if it was, do some extra extraction to eventually get the retrieved documents that were returned by the knowledge base tool. Once we have the retrieved documents, we extract the fields we had embedded and inserted into the database: the content, the metadata, and the similarity score, which can be used later for filtering out retrieved docs that fall below a certain threshold. Once we're done, we just return the answer, the sources, and the tool used.
So with that being said, let's run this. Here are the response sources: we can see the similarity scores and the documents retrieved from the vector store, and then the generated answer here, all based on that context. But most importantly, look at this: it identified that this question required retrieval, called the knowledge base tool, and that's how it got the relevant context and answered the question. Now, if I switch this up and, instead of the retrieval-based query, ask the web search query, "what is the latest OpenAI large language model", and pass that to the web search retrieval agent, let's see what happens. So I'm
going to click run again. Okay, so here
we go. So we receive the question, we
generate the text and then we see web
search sources were found and we've got
the full generated answer by the model.
And then if we go further down, we can
see the retrieve docs, which is the
different URLs that were scraped to get
content that was then passed as context
to the model. So you could see how we've
gone from learning about vex store
retrievalss as a standalone action. Then
we went further on to learn about
structured outputs, the faceli SDK,
talked about web search retrieval using
the responses API. And so with that
being said, in the next lesson we are
just going to do a brief overview and
exercise so that you can put this into
action.
So in the previous lesson you learned
about using tools for web search and
retrieval together. So the model is able
to decide based on the nature of the
user's query whether to use web search
or retrieval to answer the question. So
now I want you to take this challenge in
the web search retrieval agent file.
You're going to implement the missing
tools logic to add the tools and then
complete the empty generateText call. So go in and, from memory, try to add the two tools, one for the knowledge base and one for web search preview, and then fill in the generateText properties: all the properties involved, including the model, the prompt, and so on. I'm not going to list them all, so just try your best from memory, and I'll be back in a couple of minutes with the full solution. Good luck.
All right, I'm back. So let's work through this. The first thing we need is the knowledge base search; this was the name we provided in the previous lesson, and we pass in the knowledge base tool, which is imported from here. As for web search, we want to make sure we use the exact web search preview key and assign it openai.tools.webSearchPreview(), opened and left empty; that's the exact syntax the Vercel AI SDK expects. So now we've
fill in the generate text property so we
can include a tool. So the first thing
is we need to pass in the model
responses and then you want to pass in
the tool calling model. You've already
defined this up here. The next thing we
want to do is pass in the tools so the
model has access to the tools. And then
stop when remember you want to step
count the number of steps for the model
to iterate over and generate results and
tool results as well as the final
response. We also pass in a system
prompt. This is just kind of to give the
model general instructions to follow.
You can either pass it as part of a
string in the prompt but we just
separated it for good abstraction and we
pass in the knowledgebased description
and finally we pass the prompt. This is
the question that's passed in as a
parameter in here. So we save all of
that back to index.js run the function
and voila we get the sources URL and the
full text as well. So that's it for this
exercise. Keep practicing and I'll see
you in the next lesson.
We have now come to the final lesson: an overview of the keystone project, the customer support AI agent. As discussed, it is grounded in Scrimba's help center articles, which have been embedded and stored in the vector store as we've covered in previous lessons. So, for example, if I ask the question "How do I access the Scrimba Discord?", we should see the agent access the knowledge base tool as covered in the previous lessons, retrieve the relevant documents, and display the answer to the user. And of course, we can click to view sources and see the different chunks or documents that were retrieved, the similarity scores, and so on and so forth. So this is a UI display of a lot
of the concepts that we've discussed
previously. If I ask a question that requires real-time information from the web, because the support documents don't contain it, I can do the same thing: how do I resolve the error "node module not found"? Again, it's a similar concept, but this time, rather than using the knowledge base tool, it goes ahead and performs a web search, and you can see it's able to answer the question based on web search information. So you see the model's choice between answering the question with the retrieval tool, going to the vector store, or using web search. Now,
before we walk through the actual code, let's look at the architecture. Essentially, at a very high level, the user asks a question, and we have some way of telling the model: check whether this requires retrieval. If it doesn't require retrieval of any form, whether real-time or from the vector store, just answer the question directly. If the question does require retrieval, then we want to know whether real-time information is needed, because if not, we can check the vector store to see if there's sufficient context to answer the question. If, however, it requires real-time information, like the question I asked, or if I ask about the latest version of a package or library I want to use, then it goes to the web to get the latest documentation for that particular package. And so this is the architecture we have in place. So if we go back,
essentially I just want to walk through the key changes and new things that have been introduced that you're not familiar with from previous lessons. The first thing, as you can see on the screen, is that we have an Express server, which has been installed here. This Express server serves as the API route: when the Ask button is clicked in the UI, the client-side JavaScript sends the text to this API route as a question, and then we use the web search retrieval agent, which I'll show you in a second. It's listening on port 3000; that's what the server.js file is doing.
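A minimal sketch of what such a server.js might look like; the route path, static folder, and module name are assumptions, not the course's exact file:

```js
import express from "express";
import { webSearchRetrievalAgent } from "./webSearchRetrievalAgent.js"; // assumed module

const app = express();
app.use(express.json());
app.use(express.static(".")); // serves index.html, style.css, client.js

app.post("/api/ask", async (req, res) => {
  try {
    const { question } = req.body;
    // Returns { answer, sources, toolUsed } as described above.
    res.json(await webSearchRetrievalAgent(question));
  } catch (err) {
    res.status(500).json({ error: "Something went wrong" });
  }
});

app.listen(3000, () => console.log("Listening on port 3000"));
```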
Config is pretty much as you've seen previously. Then we have constants, holding the constants you saw in previous lessons, including the similarity match count, which is the number of retrieved documents to return from the vector store, and the models used to answer, classify, and so on. We also have an index.html, which is the HTML of what you can see here, the style.css for the design, and client.js, the JavaScript that listens for clicks on Ask; when it gets a click, it handles showing the sources and sends a POST request with the question to server.js, and server.js returns the answer, with a bit of processing going on there as well. The key new
introduction to walk through is the web search retrieval agent. So what's going on here? As we covered in the previous lessons, we have two tools, the knowledge base tool and the web search preview tool, and we pass those tools to the generateText interface from the Vercel AI SDK, alongside the number of steps to take and the prompt, which comes from here. You can see it instructs when to use the knowledge base tool, when to use the web search tool, and when to answer directly, handing off the different sources based on the tool called, and then we return the answer, the sources, and the tool used. And this is the knowledge base tool, which we covered in previous lessons, in a separate file that handles embedding the question and then fetching the similar documents from Supabase. And so that's it in a
nutshell. This is the complete working of a customer support AI agent that applies some level of intelligence: based on the tools provided and on the prompt, it decides which tool to execute, or whether to skip the tools entirely and just answer the question directly. So you can
see how you can expand on this. You can
add more tools for different things.
Maybe another tool for a particular API
that can retrieve information. Maybe
another tool that performs calculations,
maybe more niche tools that are specific
to different capabilities. You can
expand on this. So, don't stop here.
There's so much more you can do to build
on what you've learned in this course so
far. Use your own support documents or
even another completely different use
case, but using the similar concepts
you've learned throughout the course.
So, I hope you've enjoyed this course so
far. Keep practicing and good luck.
Right, congratulations on finishing this
course. Let's recap what we've learned
so far. You've learned how to create and store document embeddings of support documents in Supabase as a vector store. Then you learned how to build a retrieval mechanism to do question answering over those documents, retrieving the relevant documents from Supabase where they're embedded. Then you explored the basics of the Vercel AI SDK's APIs to generate embeddings, generate text, construct structured outputs, and execute tool calls based on whatever actions you want the agent to take. You also learned how to use OpenAI's web search tool and function calling, integrating web search as a tool the agent can use when it needs real-time information. Finally, you learned how to build a customer support AI agent that can intelligently classify a user's query and determine whether to answer the question generally, perform retrieval, or perform a web search. Now, with that being
said, my suggestion for a stretch goal
is to refactor the final project for
your use case. Perhaps you don't want to do the customer support use case, so take the concepts of tool calling with the Vercel AI SDK and apply them to something else. Maybe you want to build
your own travel agent. Maybe you want to
build your own coding agent or an agent
that will do something else that's very
useful. You can apply a lot of the
principles you've learned in this course
to do that as well. So with that being
said, I wish you the best of luck and
thank you for taking this course.
Cheers.