Let's build an AI agent that is
optimized for handling PDF data. I'm
going to be taking this dummy invoice
data here, but the steps and logic
you're about to see, you can apply to
any type of PDF data that you want to
input into this workflow. So therefore,
by the end of this video, you're not
only going to learn how to extract data
effectively from PDFs, but you're also
going to learn how we can take that data
and leverage it in other applications.
In this specific video, I'm going to show
you how to do it with Google Sheets, but as
you already know, you can apply that
logic anywhere else. Let's jump in.
Welcome back. Today's video comes from
a suggestion from our community, linked
down in the description below.
Completely free to join. Sarica
basically asks, "How do we take it so we
can get a PDF here and automatically
give it to an agent, get the data,
extract it, and use it?" So, we're going
to show that real quick. So, to do that,
go ahead and start your new agent
workflow. Start stays the same, but the
first little agent that we're going to
build together here is going to extract
the underlying data. Now, I'll be honest
with y'all. At first, I was like, "Oh,
are we going to have to use the Assistants
API here to get vision context? Are we
going to maybe add an extra layer here
in order to see the data that's actually
on the PDF?" But in reality, we don't
have to do any of that. What's cool
about these agents is that built in is
vision context. And if you don't know
what vision context is, essentially when
dealing with a PDF, this is obviously a
text-heavy PDF, but some PDFs have maybe
more images or diagrams, whether that's
for real estate or engineering.
Therefore, vision context is important
to understand those diagrams and for the
information we want to get out of it. In
theory, when looking at an invoice PDF
like this, which is very much text-based,
we could programmatically create a
Python script to extract all the
relevant data. But let me just show you
a one-size-fits-all way to get this data. And
the one-size-fits-all is going to be
using this vision context that's built
into agents. So, first thing you want to
do is provide context of what the PDF
is. Mine's an invoice. Look at this
invoice PDF. What the heck is yours? I
don't know. Is it like the most amazing
dog treats and a product description of
dog treats? Whatever it is. First line,
context, PDF. Second line, we're going
to identify what information we want to
extract. So, for me, in today's video,
we'll extract the company address, the
total, and the invoice number. But don't
worry, we're going to gut check this to
make sure that the real data is getting
pulled in. It's not hallucinating. So,
this is the address, this is the invoice
number, and then the total is $262.50
USD. That's it. So, what is the relevant
data that you care about? This can be much
longer. If anything, we could format
this a little bit better so it's easier
to read, more legible. Add the data
points that you care about that's coming
from that relevant PDF. In addition, you
can also extract different things such
as summary, more open-ended data points.
When I say open-ended data points, I'm
essentially saying maybe more
analysis-oriented data points. So, for
example, in the invoice PDF, maybe just
like, "Give me a two-sided summary of
the client and the customer." It doesn't
really apply too well to an invoice
example, but maybe for that real estate
example, you'd want to get information
like, "Give me a summary: based on this
location, is this a good purchase, based
on square footage and location
desirability?" Keep that in mind. So,
for me, in theory, I could do summary of
the invoice. Not really relevant. So,
right now, we're just going to extract
three data points that you can expand to
however many data points you care about.
Hit save. With this done, we can scroll
down here. We got the model, GPT-5. That's
fine. If you find yourself running into
issues or errors or it's not working as
effectively as it should, simply come
over here to reasoning effort and then
increase it to high. The purpose
for me is simply invoice data
and three specific data points.
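To make the prompt structure concrete, here's a minimal sketch of the two-line prompt described above: first line is context about the PDF, second line lists the data points to extract. The helper name `build_extraction_prompt` and the exact wording of the second line are assumptions for illustration. In Agent Builder you just type this text into the instructions box.

```python
def build_extraction_prompt(document_kind: str, fields: list[str]) -> str:
    """Build a two-line prompt: context first, then the data points to extract.

    Hypothetical helper for illustration; not part of any OpenAI SDK.
    """
    if not fields:
        raise ValueError("list at least one data point to extract")
    # Line 1: tell the agent what kind of PDF it is looking at.
    context_line = f"Look at this {document_kind} PDF."
    # Line 2: name every data point you care about.
    fields_line = "Extract the following: " + ", ".join(fields) + "."
    return context_line + "\n" + fields_line


prompt = build_extraction_prompt(
    "invoice", ["company address", "total", "invoice number"]
)
print(prompt)
```

If you care about more data points, you extend the list. The context line stays the same.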
Honestly, I can leave this as low. Fast
in, fast out. High gives the model just
more IQ. So for your task, does it
require more IQ? Tools. We're going to
go ahead and give no tools here. Don't
worry, it's actually built into the
agent. And then the response format will
be JSON. This is fundamentally important
for the next step here for it to be a
JSON output. All right. So here we go.
This is going to be the structured JSON
output. I can rename this. So maybe we
do like invoice schema, invoice data, to
make more sense here. Add property.
And the property name is going to be
what we identified in the prompt. So
one was company. We'll just do
company address. The other one was
invoice
number. And then this one was a number
because it's a number. I'm going to hit
update. And then finally total USD. But
I'm going to just do total USD and we can
make this a number as well.
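Here's a rough sketch of what those three properties look like written out as a JSON-Schema-style dictionary, plus a tiny check that a record matches it. The snake_case property names, the schema layout, and the sample values (other than the 262.5 total) are assumptions for illustration. The schema Agent Builder generates from the UI may look different.

```python
# Assumed shape of the "invoice data" schema: one string and two numbers.
INVOICE_DATA_SCHEMA = {
    "type": "object",
    "properties": {
        "company_address": {"type": "string"},
        "invoice_number": {"type": "number"},
        "total_usd": {"type": "number"},
    },
    "required": ["company_address", "invoice_number", "total_usd"],
}

# Map the two schema types used here onto Python types.
_PYTHON_TYPES = {"string": str, "number": (int, float)}


def validate(record: dict, schema: dict = INVOICE_DATA_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    errors = []
    for name in schema["required"]:
        if name not in record:
            errors.append(f"missing property: {name}")
    for name, spec in schema["properties"].items():
        if name not in record:
            continue
        value = record[name]
        expected = _PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name} should be a {spec['type']}")
    return errors


# Placeholder address and invoice number; the total is the one from the video.
sample = {"company_address": "123 Example St", "invoice_number": 1001, "total_usd": 262.5}
print(validate(sample))
```

A check like this is one way to gut check that the agent returned every field in the type you asked for before anything gets written to a sheet.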
Fundamentally, if you want to deep dive
on what all these mean, I suggest you
just do a screenshot, put it into an AI
chat real quick. String is just text. Number.
Boolean is true or false. Enum is categorization.
Object is a little bit more complex. Array
is going to be like a list. All you need
to care about for right now, because most
of the time you're extracting data, is
either going to be a string or a num.
When I say num, I mean number. Update.
With that done, then the next step here
becomes fundamentally way easier. So, we
got our data being formatted from the
PDF. In the user data section here,
we're going to leverage Zapier. And what
Zapier is going to allow us to do is to
then take it to our Google sheet here.
And with our Google sheet, we should see
the data come in as the second row here
automatically. So, we're going to go
ahead and first create a prompt. Place
this data in the relevant Google sheet
column. Identify where you're placing
the data in the specific software this
data is being placed into. In theory, if
you're not really placing it anywhere,
maybe you're just sending it to a Gmail,
just say that as well. Whatever the use
case for the data is, identify it here
as the first line. Next, we're going to
identify the data again. So, what I'm
going to do to make my life easy is use
the exact same wording
that's used in the actual place where
you're placing the data. So, invoice
number, copy, invoice number, semicolon,
company address, semicolon, total USD,
semicolon. Now, here is where we add
context. Context is going to be the
invoice number found here, invoice
number. So, it wants to place it down
there. I don't like that. Don't do that
to me. Place it right here. And then,
what you'll notice here is that we can
add the other ones. Input output parsed
invoice. No, no, not the invoice number.
This will be the company address.
If you're wondering, "Corbin,
how'd you know it was that?" Because if
you go back here, go to the invoice
data, we called it company address.
Nice. Coming back over here though,
same thing, open this up. Total USD,
context here. Total USD. There we go.
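What the second agent is being asked to do here can be pictured as a simple mapping from the parsed JSON fields to the sheet's column headers. This is only a sketch under assumptions: the header names, the snake_case field names, and the `to_sheet_row` helper are all hypothetical, and in the actual workflow the agent and the Zapier tool handle this step.

```python
# Assumed column headers in the sheet, mapped to the assumed JSON field names.
COLUMN_TO_FIELD = {
    "Invoice Number": "invoice_number",
    "Company Address": "company_address",
    "Total USD": "total_usd",
}


def to_sheet_row(parsed: dict, header: list[str]) -> list:
    """Order the parsed values to match the sheet's header row."""
    row = []
    for column in header:
        if column not in COLUMN_TO_FIELD:
            raise KeyError(f"no field mapped to column: {column}")
        row.append(parsed[COLUMN_TO_FIELD[column]])
    return row


# Placeholder address and invoice number; the total is the one from the video.
parsed = {"company_address": "123 Example St", "invoice_number": 1001, "total_usd": 262.5}
header = ["Invoice Number", "Company Address", "Total USD"]
print(to_sheet_row(parsed, header))
```

This is why matching the wording matters: when the names in the prompt line up with the column headers, there's nothing to guess.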
So, we got all three data points being
placed. Place this data in the relevant
Google sheet column. There we go. If you
have more data points, add the more data
points. Hit save. Now, for this one, because
it's a little bit more complex, I'm
going to set it to high. In theory, I could
go medium. Test it. See if it works. But
especially in the beginning, y'all,
always opt for high. Just get it
working. Once it's working, then you can
kind of play around with reasoning effort.
Next, we're going to do a tool of
Zapier. So, first we're going to do add
MCP Zapier. Get your API key. For me,
I've already created one. So, simply go
to connect here. Copy secret. It's a
secret. Zapier. Enter it here. If
you're wondering what that is, why you
even need to do that, essentially, this
tells OpenAI that you have access to
an actual Zapier account and the
functionalities you're about to see.
We're going to add these tools together.
Don't worry. For now, I'm going to
uncheck these. And the tools we're
going to add together are creating a
spreadsheet row, the ability to actually
create a spreadsheet row
within the software, and then on top of
that, getting data from that spreadsheet
first. Add. So now coming over to Zapier
MCP, let's make sure we add those tools.
I have already added them. You know what
I'll do? I'll remove them from the
server and we'll add them together. So
we're going to add tool, Google Sheets. In
theory, we could add all the tools, but
I honestly suggest you not do that.
First, let's just do lookup spreadsheet
rows. This is going to give the ability
for the artificial intelligence model to
get all the relevant data found in the
spreadsheet. We're going to do configure
here and then within configure, we're
going to make sure we choose the correct
spreadsheet. So, I'm going to say set
specific value for this field. This is
so that we make optimized decisions and
it doesn't get confused. For me, I
called it easy data. Why? Because it's
easy. So, therefore, I'm going to go to
here, easy data worksheet. If you have
multiple worksheets, choose one. I only
have one, so I'm going to hit save. Add
another tool. Now, let's give the
functionality for it to actually create
a row. To do this, we can type in create
up here. Find data, take action. What's
nice is that you can see all the things
it can actually do functionally to that
spreadsheet. So, we're going to create a
spreadsheet row. But I want you to
notice a couple things. First thing,
create multiple spreadsheet rows. What
does this mean, Corbin? This means in
theory, we could build out an agent here
that essentially if I provided like 10
invoice PDFs, it would be able to loop
through them and create multiple
spreadsheet rows. The logic's a little
bit more complex. So, obviously just do
it with one at a time at first just to
get it working. I'm going to come over
here to configure. In configure, we're
going to select that spreadsheet again,
which is going to be easy data.
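For the multiple-rows idea mentioned above, here's a minimal sketch of the loop: take several parsed invoices, check them against what the lookup step already found in the sheet, and only plan rows for the new ones. Skipping duplicates by invoice number is my assumption, not something the workflow does on its own, and `plan_new_rows` plus all sample values are hypothetical.

```python
def plan_new_rows(parsed_invoices: list[dict], existing_invoice_numbers: list) -> list[list]:
    """Return one row per invoice that is not already in the sheet.

    Rows are ordered as: invoice number, company address, total USD.
    """
    seen = set(existing_invoice_numbers)
    rows = []
    for invoice in parsed_invoices:
        number = invoice["invoice_number"]
        if number in seen:
            # Already in the sheet (or repeated in this batch): skip it.
            continue
        seen.add(number)
        rows.append([number, invoice["company_address"], invoice["total_usd"]])
    return rows


# Placeholder data standing in for a batch of extracted invoices.
batch = [
    {"invoice_number": 1001, "company_address": "123 Example St", "total_usd": 262.5},
    {"invoice_number": 1002, "company_address": "456 Sample Ave", "total_usd": 80.0},
    {"invoice_number": 1001, "company_address": "123 Example St", "total_usd": 262.5},
]
print(plan_new_rows(batch, existing_invoice_numbers=[1002]))
```

Each row in the result would correspond to one create-spreadsheet-row action, which is why it's worth getting the single-invoice case working first.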
Worksheet's going to be the same. It's
fine. It's all good. We're going to hit
save. Now, notice two things. First
thing, notice that we've identified
specifically the actions of create
spreadsheet row, look up spreadsheet
rows. We can give more actions here. And
you know, MCPs can give actions across
all these different apps, over 8,000. And
then on top of that, we've identified
the specific area we want to place that
data, which for me was easy data. And
that's what we called our Google sheet
here. Nice. And with this done, because
I essentially re-added those, let me see
if the tools are selected here
correctly. So yeah, let me do that
again. Update. Nice. And here we go. We
got our prompt. We got our MCP. We got
the reasoning effort to high. Let's see
if this works. Go to preview. I'm going
to add my invoice data PDF that I showed
earlier. This one right here. So we got
our invoice data PDF. Enter. First thing
it's going to do is extract the relevant
data that we care about, which was the
address, invoice number, and the total
USD. We identified this. And once it
extracts that, it's going to put it into
a JSON output. Don't overcomplicate it
at all. JSON output is just a way we can
format data so it can effectively be
run thousands upon thousands of times in
a structured manner. This is how AIs
like to talk to each other, okay? Or
software in general as well. Here we go.
Company address invoice 7262.5.
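To show what "JSON output" means in practice, here's a small sketch that parses a JSON string like the one the first agent returns and formats the total as currency. The field names, the address, and the invoice number are placeholders I've assumed. Only the 262.50 total comes from the invoice in the video.

```python
import json

# Assumed example of the extraction agent's JSON output (placeholder values).
raw_output = '{"company_address": "123 Example St", "invoice_number": 1001, "total_usd": 262.50}'

# Any software can turn this text back into structured data.
parsed = json.loads(raw_output)

# A JSON number carries no trailing zeros, so 262.50 comes back as 262.5.
print(parsed["total_usd"])


def format_usd(amount: float) -> str:
    """Render a number as US dollars with two decimal places."""
    return f"${amount:,.2f}"


print(format_usd(parsed["total_usd"]))
```

The value stored is just a number. How many zeros you see is a display choice, whether that's done in code like this or with a currency format on the sheet column.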
Now right now in the workflow, it
requires me to approve the action using
Zapier. I'll show you how to make it so
that it always just does it
automatically because that can get
frustrating. But we should be
essentially executing right now. And
there we go. Invoice number, the company
address, and then 262.5 coming over
here. 262.5. Nice. Now, some of y'all
might be like, "Corbin, where are all the
zeros? Like 0000." Okay, we could format
it that way if we want to. Or
alternatively, the total USD, why is it
formatted that way? We'll just go here to
format as currency. Okay, but that
worked perfectly and that executed how
we like. So, in theory, I can close the
preview, come over here, go to Zapier,
simply change approval to never require
approval, so it essentially just
does it. You don't need a yes. There we
go. Make sure to leave a like. It is
completely free. Check out that Skool
community in the description down below.
Let me know what other use cases you
want to see using Agent Builder. One
big thing: I already know half of y'all
are going to be like, "Corbin, this is
cool, but..." And that "but" is essentially
you saying, "How do we actually put this
into an internal app for my team, push
this to an actual website?" Basically,
how do we use that ChatKit UI? Well,
lucky for you, I've already created an
entire video dedicated to how to take
workflows like this and actually
implement them to real websites or
internal apps for your company. So,
without further ado, I'll see you in the
next video. Did we just learn how we could take
basically any PDF, extract the data
automatically, and put it anywhere?
Nice.
subscribe for more ► https://bit.ly/3zlUmiS follow me on twitter (x) ► https://twitter.com/corbin_braun join our ai community (free) ► https://www.skool.com/ai-for-your-business follow me on instagram ► https://www.instagram.com/corbin_braunlich follow me on tiktok ► https://www.tiktok.com/@corbin_braunlich watch me live ► https://www.twitch.tv/itscorbinbrown join my software ► https://bumpups.com/ steal my software ► https://github.com/coffeefuelbump LINK TO EVERYTHING ► https://linktr.ee/corbin_brown my recording setup: https://www.amazon.com/shop/corbinbrown Agent Builder - OpenAI API https://platform.openai.com/docs/guides/agent-builder #openai #agentkit Let's learn how to use OpenAI Agent Builder Playlist (bookmark) https://www.youtube.com/playlist?list=PLJrzt4ameiaOOIhNhedMcfN8oJXC5LwPv Become a Builder + Perks 🛠️ https://www.youtube.com/channel/UCJFMlSxcvlZg5yZUYJT0Pug/join