Loading video player...
So, this rag agent took me less than 5
minutes to set up. This is the easiest
method I've ever tried. I'm going to
open up this chat and paste in this
message, which has three different
queries. The first one is, "What was
Tesla's total revenue Q2 2025?" The
second one is, "What was Nvidia's total
revenue Q1 fiscal year 25?" And the
third one is, "What were Nike's revenues
for Q4 fiscal year 2025?" You can see
the agent hit its pine cone tool three
times. And if we look at its answers,
not only is it giving us the correct
answers, but it's giving us the exact
document, the page numbers, and the
exact quote that it found in the PDF.
And that's how we can trust the answers
are correct. So, starting off with
Tesla, if I take the exact quote that it
found in the PDF, and I go into the
Tesla document, and we do a quick
search, we can see that it pulled it
from page 4. And back in Naden, our
assistant said that it found it from
page 3 to 7 on this exact document.
Next, if we go to the Nvidia answer and
we pull the exact quote and then I
control F that in this document, you can
see that it finds it right here. And
this is page one of the PDF and we go
back into Nit and it says that it found
it on page one of the NVIDIA annual
report document. And then finally, if we
go to the Nike answer and we grab this
quote that it found, switch into the
Nike document and do another control F,
we can see the exact quote was found
right here on page one. And in edit, the
assistant said that it found it on page
one. So, those of you that have built
rag agents before, you know that it's
not super easy to be able to get your
agent to actually site exactly where
it's pulling everything from. It's
doable, but you have to set up a
pipeline to do things like metadata
tagging. And like I said, all I did here
was drop in a file and then chat with
it. So, that's exactly what I'm going to
be showing you guys how we do that
today. Okay. So, now that you've seen
that quick demo, I'm just going to
basically build that exact system in
front of you guys step by step. So, feel
free to open up your computer and follow
along with what I'm doing. but also the
entire template that you'll see today
will be available for download for free.
All you have to do is join my free
school community. That way you can just
plug and play and test it out for
yourself. So what we're doing today is
we are using a Pine Cone assistant as
our rag search. So if you've never used
Pine Cone before, it is a vector store,
but in here you can see that we have a
section right over here called
assistant. So if I click on that, you
can see that I have a test assistant,
which is the one we were just using. But
what I'm going to do today real quick is
create a new one in front of you guys.
I'm just going to call this one demo.
and I'm going to create the assistant.
Now, real quick, what you want to pay
attention to is down here at the bottom.
It says active assistants have a fee of
5 cents per hour. So, it's not too bad.
And you can also click right here for
more details about the pricing. So, now
that we got that out of the way, I'm
going to go ahead and create this
assistant. And what you can see is this
kind of looks like a custom GPT
interface where you've got a chat right
here and then you have the ability to
drag in files. So, what I'm going to do
is drag in those three files that we
were looking at earlier, the Tesla,
Nike, and Nvidia earnings reports. And
I'm just going to import those right
away into our Pine Cone Assistant. And
now that those are uploaded in Pine Cone
Assistant, I can paste in that exact
same question that I asked our NADN
agent in that demo. So you can see it
comes back with the correct answers. And
we could hover over each of these links
because we're in the playground mode to
see what PDF it pulled from and then the
page numbers that it got that
information from. So the power of this
comes through the actual API that we can
use to talk to this Pine Cone Assistant.
And like I said, don't worry if this
seems confusing. I'm going to show you
guys exactly what we have to do. And
it's actually super easy. So you can see
over API, there's two things we can do.
We can upload files or we can chat with
our assistant. And in today's video,
we're just going to be focusing on
chatting with our assistant because we
already uploaded our files in this
interface right here. So I'm going to go
back into NADN and we're just going to
real quick grab an AI agent. And now
we're going to be able to hook it up to
this Pine Cone Assistant tool. So when
we give our agent a tool, you can see
that there's no native tool here for a
Pine Cone Assistant, but there is for
Pine Cone Vector Store. So, what we need
to do is grab an HTTP request so we can
talk to our assistant. So, all you're
going to do is go back into Pine Cone
over here. You're going to open up this
little connect thing to get to the API.
And then down here where you see chat
with your assistant, I'm basically going
to copy everything except for the top
two rows of Pine Cone API key and this
blank row. So, I'm going to copy this,
go back into edit, hit import curl, and
then paste that in there. And when I hit
import, it basically fills in the entire
HTTP request that we need except for a
few things that we'll have to tweak. So
the first thing is a Pine Cone API key.
The way you get that is you go back into
Pine Cone. You'll click over here on API
keys and then you'll just create a new
key. So here's what that looks like.
You'd click create API key. I'm just
going to call this one demo 4. We're
going to create that key. And now it's
going to give us this secret value. And
I have to copy this. And after you save
it and you click close, it's going to be
gone. So you can always create a new
one, but just keep in mind that that key
will be gone. So then you'd come back
into Pine Cone and just paste that key
right in there and you'll be all set. So
it's super simple. And one pro tip that
I want to show you guys is if you
already have connected to Pine Cone
through a native node, you could come to
authentication, you can go to predefined
and then you could come in here and just
type in Pine Cone. And that would pull
in your credential of your native Pine
Cone integration. So you can either do
that or you can do this. They're both
the same. So because I'm doing this, I'm
just going to turn off my headers. But
don't let that confuse you. All that is
is basically just putting in our
password so we can access that Pine Cone
Assistant. So now what I'm going to do
is change the description and I'm just
going to say use this tool to talk to
your knowledge base. So we're going to
paste in that as a description. I'm
going to name this tool Pine Cone just
to keep everything simple. And then the
last thing we have to set up is the
actual request that we're making to our
Pine Cone assistant. So right now if we
triggered this tool, no matter what we
asked our assistant for help with, it
would be sending off this query to the
assistant. What is the inciting incident
of Pride and Prejudice? And so we don't
want to have that be the query every
single time. So what we have to do is
change this to an expression, meaning
that these values can be dynamic. I'm
going to get rid of this content right
here. And I'm going to type in two curly
braces that face each way. We're going
to do a dollar sign, and we're going to
grab this from AI function. And this is
just going to let our AI agent determine
what query do I send to the Pine Cone
Assistant. And so all I'm going to do is
type in search query within that little
quotation mark right there. And now this
is basically set up to be dynamic. And
I'll show you guys exactly what I mean
by that if that isn't yet making sense.
Cool. So now all that's left to do is
test out the agent. Of course, we first
have to give it a brain. So I'm just
going to go grab my open router API key
and we're just going to stick with
GBT4.1 Mini for now. All right. So now
the agent's basically set up. All that
will be left to do is test it, tweak the
system prompt, and keep playing around a
little bit. So that's exactly what we're
going to do. First, I'm going to drop in
this query asking about the Tesla
document. How many vehicles did Tesla
deliver in Q2 2025? And how did that
compare to the same quarter in 2024? So
once again, what's happening is it's
hitting our Pine Cone Assistant and then
it's going to come back with some
answers for us. So, you can see it said
that Tesla delivered a total of 384,000
vehicles. And then it gives us a
breakdown of the different models that
it actually delivered. And it also said
that it saw a 13% decrease in total
deliveries year-over-year. I just went
ahead and fact checked all this. That is
correct. Now, keep in mind there's no
system prompt in this AI agent, which is
why we're not getting any source
information like what PDF it pulled from
and what page numbers it found it from.
So, what I'm going to do real quick is
give it a quick system prompt right
here, which is what defines the behavior
of our AI agent. So, I'm going to open
this up full screen, and I'm just going
to paste in this prompt that I was using
earlier. It's really, really simple. I
said, "You are an AI agent specialized
in analyzing earnings reports data. Use
your Pine Cone tool to search through
earnings reports from Tesla, Nike, and
Nvidia." When answering the user's
question, always site your sources as
far as what document you got it from,
what page it was from, what section, and
an exact textbased quote from the
original source. because when we're
doing rag and vector search, it's really
important that we're seeing where our
agent got the information from so that
we can trust it. So, this is just a
great example of defining the agents
behavior. Because what's interesting is
you'll notice when it searched the Pine
Cone tool, we actually were getting that
information. If I was to scroll down
over here, you could see that what we
were getting back from Pine Cone is the
exact pages that it pulled this
information from, as well as the name of
the PDF it found it from. the AI agent
just didn't know to actually tell us
that information. Real quick, guys,
excuse the lighting. I'm sitting here at
nighttime editing this video and I
realized that I didn't show what I meant
by the agent filling in the query down
here by itself. So, if you remember this
question we asked was how many vehicles
did Tesla deliver in Q2 2025 and how did
that compare to the same quarter in
2024. So, what happened here is we can
see the agent called the Pine Cone tool
twice. We can see there were two items.
And if we click in, we can see the two
different searches that it made to our
Pine Cone Assistant. Up in the top left,
if we go to run one out of two, it
searched Pine Cone Assistant for Tesla
vehicle delivers Q2 2025. And then it
knew it had to compare that to 2024. So
then it searched it again and it said
Tesla vehicle deliveries Q2 2024. And so
this search query is the variable that
we have down here because the agent
basically decides what do I want this
search query to be and how many times do
I need to make a search query. And
that's why on this right hand side, we
also have two separate runs and two
separate results. So for all the future
examples, just keep that in mind. The
agent decides the search query and
that's how it's able to get the right
information every time. Again, excuse
the lighting. I'm working on getting a
professional light back here. So if I do
record in the dark, I'm not terrifying
looking like this. Let's get back to the
video. So just to show you guys that
that prompt worked, I'm just going to
repost this exact same message. And what
we're going to see is it gives us the
correct answer once again, but this time
it's going to have a PDF name and pages.
So there you go. We got the same answer
and this time we got our source of Tesla
Q2 and we have our page numbers down
here. But what you may notice now is
that we're not getting the textbased
quote from the PDF. And the reason why
we're not is because what happens in
this Pine Cone node, and I'm not going
to try to get too technical here, but
the Pine Cone assistant on that server
searches through the knowledge base and
then it gives us basically like a short
summary with the correct answer. This
right here is not an exact textbased
quote, but this is what our NAN AI agent
uses to give us an answer. And just to
highlight that point, I'm going to show
you guys a better example of that. I'm
asking, what were Nike's revenues for Q4
fiscal year 2025? And how did they
compare to last year? And what you're
going to see happen is it gives us the
correct answer, but now it gives us the
exact quote, which is wrong. So if I was
to copy this quote and go into the Nike
doc, which was this one, and if I paste
this in, we don't get any hits. We get
zero matches. And that's because once
again, what happens is the Pine Cone
content that comes back is a summary.
This is the Pine Cone Assistant answer
that it made based on the exact text. So
the way that we fix this is we have to
understand the type of request we're
making to the Pine Cone Assistant. And
in order to do that, we have to look at
the API documentation, which I will drag
over right here. So this is basically
the documentation that tells us how to
use the assistant over API. We can see
things like streaming response,
extracting the response content,
choosing a model, all this kind of
stuff. But what I'm interested in is
this bottom section that says include
citation highlights in the response. So,
all we have to do to get an exact site
is add this thing down here that says
include highlights. This is basically
just a little lever that we can pull to
change how the assistant works. So, keep
in mind how this quote didn't work. I'm
going to go into the HTTP request and
all I'm going to do is add this little
section include highlights equals true.
And now I'm going to save this and run
the exact same question. So, we're
running the exact same question. the
answer is going to come back correct
again because the Pine cone assistant is
going to be pretty good at getting it
correct. But now we're going to see an
exact quote come back as well. So right
here we get this exact quote. I'm going
to copy this, go back into the Nike doc,
and now if I paste this in, we can see
it's pulling the exact quote now. And
the reason it was able to do that is
because in Pine Cone, once again, this
section called content is not the exact
quote. But if I scroll all the way down,
you can see we have a new section that
got added called content. And this is
basically what's going to pull exactly
what was found in the document. So this
is just a really great example of not
only being able to understand what's
happening, but being able to understand
that if I want to change the behavior, I
can go read through API documentation.
And this is how I can also change things
like the sampling temperature or I can
change the model because one thing that
you will notice is that if you're in
your assistant playground right here and
you change the chat model to interact
with, you can see I switched the model.
If I hover over this, it says select the
model for the assistant to use in this
conversation to persist this choice in
your application. Remember, you have to
set your model in your API calls. So,
going back to NADN in the HTTP request
down here, we have a model. And so, if I
wanted to change what model we're using
to interact on the Pine Cone Assistant
side, we would have to change that right
here. All right. So, sorry if I was
getting a little technical there. I just
think it's really important that you
guys understand how this is actually
working. If you're looking to understand
this stuff a little bit better, I have
an API video that you can watch. I'll
tag it right up here. Or you can also
check out my plus community. The link
for that's down in the description.
That's where you'll receive more
structured guidance and a community of
people that are building every day.
Anyways, real quick now, I want to show
you guys the difference because I'm sure
you're wondering, why would I not just
use a Pine Cone vector store or a
Superbase vector store? Why is this
quicker and easier and maybe even
better? So, the reason why I'm a big fan
of this right now is because what's
going on here is Pine Cone on the back
end is handling all of the indexing,
embedding, chunking, it's handling all
of that in order to make your job easier
so that you don't have to manage all of
that to get your accurate sources back.
Cuz like I said, if you're taking the
typical vector approach, there's a lot
more work that goes in up front. You
can't just drop in a file and be good to
go. You have to set up a pipeline to
have like metadata filtering and maybe
even different types of splitting and
chunking. I'm not going to dive super
deep into this right now because I don't
want the video to go too long, but I
want to show you guys a real quick
comparison. So, all three of these
agents have the exact same prompt and
the exact same documents except for
they're just being stored somewhere
differently. So, I'm going to drop in
this question right now to our Pine Cone
assistant that asks, "What was Tesla's
operating margin in Q2 2025?" So, you'll
see it comes back with the operating
margin was 4.1%, which is correct. You
can see it gives us the document. It
gives us the page numbers as well as an
exact quote. And also keep in mind that
we got this answer back right here. It
says 1,277
tokens. So keep that number in mind. Now
if we move over to this middle agent
that's using a Pine Cone Vector store
rather than the assistant, we're going
to ask the exact same question. And
remember, we're looking for the answer
4.1%.
So what we get back is I don't really
know what the answer is. And you can see
that it took almost 30,000 tokens. So we
got an incorrect answer. And it cost us
like 15 times more. More than 15 times
more. Yeah, that was horrible math, but
you get the point. And because we took a
typical vector chunk approach where we
were in charge of all of the
pre-processing of those chunks, I'm
assuming it's going to do the exact same
thing when we try this with our
Superbase vector store. So, you can see
here pretty much the same answer. It
wasn't able to find the exact figure. We
were looking for 4.1%. And this one only
took 5,000 tokens, which is still like
three times more expensive than the Pine
Cone Assistant. So, that's all I'm going
to talk about today. I'm not saying that
the Pine Cone Assistant is always the
best option because you also have that
running cost of 5 cents every hour. But
the point I'm trying to make is if you
want to spin something up quickly and
play around with how it's working, this
Pine Cone Assistant is a game changer,
especially if you're a beginner and you
want to experiment with stuff like rag
agents. And also wanted to give a quick
shout out to the legend Mark Cashef.
He's the one that showed me these Pine
Cone Assistants and like I said, I think
they're super cool. So hopefully now you
understand how you can test out the Pine
Cone Assistant for yourself. And if you
want to download this template so you
can just kind of play around with the
differences, then you can get it for
free by joining my free school
community. And if you like seeing stuff
like this and you want some more
structured learning, then definitely
check out my plus community. The link
for that is also down in the
description. We've got a great community
of over 200 members. Everyone every day
is building and earning with NADN. And
we have a full classroom section with
three full courses. Agent Zero is the
foundations for beginners. 10 hours to
10 seconds dives into NAND. And then our
new oneperson AI agency course is
available for our annual members and it
goes over how to lay the foundation to
build an AI automation business. So I'd
love to see you guys in these
communities. But that's going to do it
for the video. Hope you guys enjoyed. If
you learned something new, please give
it a like. It definitely helps me out a
ton. And as always, I appreciate you
guys making it to the end of the video.
I'll see you all in the next one. Thanks
guys.
Full courses + unlimited support: https://www.skool.com/ai-automation-society-plus/about All my FREE resources: https://www.skool.com/ai-automation-society/about Have us build agents for you: https://truehorizon.ai/ 14 day FREE n8n trial: https://n8n.partnerlinks.io/22crlu8afq5r In this video I show a new, faster way to build RAG agents in n8n using the Pinecone Assistant to handle the heavy lifting, no preprocessing pipeline, no manual chunking, no custom embedding flow. You simply drop your documents in and Pinecone takes care of ingestion, splitting, and indexing on the back end. Then we wire it up in n8n to get accurate answers with trustworthy, page-level citations and exact text quotes pulled straight from your knowledge base. If you’re a beginner, this is the easiest path I’ve found to stand up a production-ready RAG agent; if you’re more advanced, it’s a huge time-saver for rapid prototyping. Follow along step-by-step as I connect n8n to Pinecone Assistant, load docs, test grounded responses, and return clean sources you can show to clients or stakeholders, all with a simple, no-code build that you can replicate in minutes. Sponsorship Inquiries: 📧 sponsorships@nateherk.com TIMESTAMPS 00:00 Quick Demo 01:32 Setting Up Pinecone Assistant 03:42 Connecting Assistant to n8n 06:14 Improving the RAG Agent 10:33 Pinecone Assistant API 12:52 Performance Comparison 15:29 Want to Master n8n?