Google just dropped Gemini 3 and there are some big claims about this model already.
On the official release blog, it says a new era of intelligence and that Gemini
3 is our most intelligent model that helps you bring any idea to life. So
you can go ahead and read through this and you can get some information about
the model, the deep think and some of these benchmarks, but I'm going to be
going over a lot of this stuff with you guys today. So I've been seeing
some crazy demos all morning where people have been coming over here to Google AI Studio and using Gemini 3 Pro Preview for completely free. Right now, it's only
actually over the API that you have to pay for Gemini 3 Preview and that
is where you get this input and output cost per million tokens. But like I
said, I've played around with it a bit in here and I've seen some really
crazy demos of people actually building things because what you can do is you can
describe your idea right in here. It almost looks like a Lovable or Base44 type of interface. And so once you actually fill in a prompt of what
you want to build, it can do tons of crazy things. You can make games,
apps, landing pages, all that sort of stuff. And so if you just look at
some of these examples that have been built with Gemini 3 Pro, they look absolutely
incredible. Gemini 3 is really, really good with like the whole UI design, but also
with all of the code that actually goes into the back end of all of
these different landing pages. And you can also see those ideas can be turned into
games. So right here we've got a runner game, right here we've got some sort
of pilot game. You've got all these different interactive examples of games that you can
build just because of how good Gemini is at reasoning and coding. So I'm sure
all over X and LinkedIn and YouTube, you're going to see some crazy examples of
what people are able to build with Gemini 3 Pro. But today what we're focused on is actually using it over the API and how we can integrate Gemini 3 Pro with n8n to power our AI automations and make our AI agents faster, smarter, everything like that. So I've done some playing around with it myself, but I wanted to do a live experiment with you guys where we're connecting to Gemini 3 Pro in n8n. We're then going to test its ability to understand images. We're going to run a massive amount of context through it and ask it some pretty tough questions and see how it handles them. And we're going to have Gemini itself actually build n8n workflows for us and see how well it does. And throughout these three experiments, we're going to be comparing Gemini's results to different chat models so we can see how much better it really is for the price. So before we actually hop into n8n and start experimenting, I just wanted to real quick go over some
release info as well as those benchmarks that we saw on that release blog. So
today is November 18th and Google just dropped the model today. Right now, it is
being called Gemini 3 Pro Preview. So this is available in the Google AI Studio
like we just saw, and it's also available through developer platforms. So if you go
to the Gemini API documentation, you can see we have a Gemini 3 developer guide, and it will walk through what it is and how you can use it. It also covers pricing, context window information, and some API features of Gemini 3. And I wanted
to real quick zoom out and show you guys a comparison between Gemini 3 and
then sort of the 2.5 family, where we had Flash-Lite, Flash, and Pro. Now you can see they roughly all have the same in and out context window, meaning they can take in about a million tokens and output 64,000 tokens. But the pricing is where things get interesting. So starting with Flash-Lite, we've got 10 cents and 40 cents, and this is across a million input or output tokens. For 2.5 Flash, we've got 30 cents and $2.50. And then for 2.5 Pro and 3 Pro, it's a little more expensive, and the rate also adjusts if the tokens go over 200,000. But what you can see is that Gemini 3 is going to be more expensive than Gemini 2.5 Pro. I think that's going to be justified, though, because it's going to be the most advanced, the best at reasoning, and truly multimodal. So here are those benchmark statistics that we saw on
that release blog, and I'm just going to dive into a few of them here.
But we're comparing Gemini 3 Pro, Gemini 2.5 Pro, Claude Sonnet 4.5, and GPT-5.1. So what we're comparing here are kind of the best and latest
top tier closed source chat models. And so for every benchmark, whatever's in bold is
the one that won across that category. And if you look through all of these,
I mean, Gemini 3 Pro is basically leading every single category. Some just by a
little bit, but some like the ScreenSpot Pro is leading by pretty much double the
next one, which was Claude Sonnet 4 .5. And this just means the chat model's
ability to look at something like an image and understand it. And then this is
just that second half bottom section of the same image, the same benchmark statistics. And
we've got some different ones to look at here. And honestly, one of them that
I think is the most interesting is this Vending-Bench 2 benchmark, which, if you guys haven't heard of it, is actually kind of a cool experiment. They have these AI models basically run a virtual vending machine business, where the AI model is in charge of ordering more inventory, adjusting prices, things like that. And the
idea is that AI right now is typically really, really good at sort of like
one off tasks and short term objectives. But when you start to give it a
long term goal that it needs to sort of accomplish and iterate upon over months,
that's where they start to fall off a bit. But while the leader before this was Claude Sonnet 4.5, with about $3.9k net worth on its vending machine, you can see Gemini 3 Pro came in with almost $5.5k. So it's
really cool to see that these AI models are moving more towards being able to
actually take on long horizon agentic tasks. So anyways, that's enough of these boring slides.
Let's hop into n8n and just start playing around with the model and seeing how it actually works. Alright, so I told you guys we had some different experiments that I have prepared. But first, what I want to do is just show you guys, if you've never connected to Gemini before, how do you do it in n8n? And
there's a few different ways. So the first way is you go ahead and just
type in Gemini, and you can pull up the Google Gemini node itself. And now
what's cool about this is you have different types of actions. You can analyze audio,
you can analyze a document, you can upload files, you can analyze images, you can
analyze videos or generate videos, and then also just send a message to the model.
And so if you choose one of these operations, what it's going to ask you
to do is create a credential where you would basically need to go get a
Gemini API key. And so if you go to the Gemini API documentation here, and
you could just type in like Google API, whatever. You can see up in the
top section right here, you have a button called get API key. And this will
take you to Google AI Studio, where you can go ahead and create a key and just put in your billing information right here. And so what that's going to do is it's going to give you an API key that looks like this, and you would just need to copy that and then go ahead and throw it right here into n8n. And that's really cool, because if you look at a different node, like, for example, an OpenAI node, it doesn't have the ability to analyze a
document. So this one has more power, but you can also take the chat model
approach. So let's say we're throwing in an AI agent. And when you have an
AI agent, of course, you have to give it an AI model to use as
a brain. And so when I click on chat model, you could go ahead and
grab Google Gemini chat, and this is actually going to be the exact same API
key and credential that you already set up here. So if you did this there,
you'll have this all configured already as well. But if you're like me and you'd rather use OpenRouter, you can certainly do that as well. OpenRouter is basically a connection that lets you route to tons of different chat models. So this is what it looks like when you go to openrouter.ai and click on models: you can see we've got Google Gemini 3 Pro, but we also have tons and tons of other models. And the reason I like this is because I can just keep all of my billing information in one spot, rather than having a Google key, an OpenAI key, an Anthropic key, and all these other keys. So once again, you'd go ahead and sign up, get an API key for OpenRouter, and add your billing information. And then once you have that connected, you could go ahead and search for Gemini 3 Pro, which is right here, Gemini 3 Pro Preview, and then you'd be connected in n8n. So fundamentally, what's going on with all three of these methods is that we're using Gemini 3 over the API, because we're talking to it from n8n. And so
when you're using a model over API, there are a few settings that you can
look at. So before I show you guys what I mean by that, if we
go back into the Gemini 3 API developer guide, you can see that we have
new API features in Gemini 3. The first one is thinking level, we also have
media resolution, we have temperature, we have thought signatures, things like this. So the reason
that I wanted to bring this up is because if you want to look at
tweaking the model behavior a little bit, you have to adjust the settings. And thinking
level might be one of those settings that you want to adjust. And what you'll
notice here is that right now we have low, which minimizes latency and cost. And
sometimes you might want to use Gemini 3, but not have to pay for all
that reasoning and time. And high is the default setting. It looks like medium is
coming soon, but right now it's not supported. But if you go in here and
you've got an AI agent, and let's say you've connected it to Gemini 3 Pro
here, and you want to make the thinking level low, you would probably think you
could go here in the settings and change that, but that's not there. What we
have is max number of tokens, we have temperature, top K and top P, and
we have safety settings. And then you think, okay, well, maybe if I connect to Gemini 3 Pro through OpenRouter, I'll be able to change that. Well, in OpenRouter, if we're connected to Gemini 3 Pro, we have a few different options and a few extras, but we still don't yet have the thinking level. And I'm sure
what's going to happen is it will come to this node over here, but right
now it's not there. And then you might think, okay, well, let's just try the
native LLM call here to Gemini 3 Pro. We're going to choose the right model
and add options down here. We have a few other ones. Now we have a
thinking budget, but that is not explicitly the same thing as setting a thinking level.
So the way that you would want to actually be able to control that and
know for a fact that you're changing that parameter is you would want to set
up your own HTTP request directly to Gemini. And so of course you would go
to the Gemini documentation and you would basically figure out how you can set up
that request. And the way that I got there was I took a look at this documentation, and I see that if we have high thinking, this is our curl request. But if we go to low thinking, it changes that curl request and adds this little section down here where we can now customize the thinking level. So that's exactly what I did in this HTTP request right here. Same thing: we have our URL that it gives us, we put in our API key, and then down here, not only are we sending over a text message for Gemini 3 Pro to look at and answer, but we have the thinking level down here, which is low.
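If you'd rather build that request body in code first and then paste it into n8n, here's a minimal sketch. It assumes the endpoint and field names from the Gemini 3 developer guide; the model slug, prompt, and the exact `thinkingConfig` shape should be double-checked against the current docs:

```python
import json

# Endpoint per the Gemini API docs; the preview model name may change.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3-pro-preview:generateContent"
)

def build_request(prompt: str, thinking_level: str = "low") -> dict:
    """Build a generateContent body with an explicit thinking level --
    the setting the built-in n8n nodes don't expose yet."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # Field names assumed from the Gemini 3 developer guide.
        "generationConfig": {
            "thinkingConfig": {"thinkingLevel": thinking_level}
        },
    }

body = build_request("What is a webhook? Answer in one sentence.")
print(json.dumps(body, indent=2))
```

In n8n's HTTP Request node, you'd POST that JSON to the URL above and pass your API key in the `x-goog-api-key` header.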
And so you can see, I can go ahead and run this and it's going
to do the exact same thing as any of these other nodes are doing. Whether
we go here or whether we go here or here, except for now, we know
for a fact that we had our thinking level turned to low. But anyways, now that we've looked at the different ways that you can go ahead and connect to Gemini 3 Pro in n8n, let's take a look at the different examples and evaluations
that we have set up today. All right. So we're going to go ahead and
start with image analysis. I'm going to upload an image and we're going to have
OpenAI analyze it as well as Google analyze it. And we will compare the results.
So I'll kick off this workflow. I'm going to drag in this criminal justice process
flow chart. And what I wanted to show you guys real quick is the prompt
for both of these nodes. I'm saying in both of them, describe the process in
the image and that's it. All right. So OpenAI said, this image illustrates the criminal
justice process from the point of an incident to sentencing, outlining the decision making steps
and possible outcomes. So it says you start with incident or alleged crime, and this
is where law enforcement investigates and determines if there is probable cause. It says number
two is felony referrals or charging and filing, and we've got summons bond hearing. We've
got case processing and resolutions. And from there, there are several possible paths. Okay. So
it understands the structure, but not super detailed. Now, if we take a look at
Google's answer, we have a little bit more detail here and it's the same type
of flow. We start with incident and initial charges. The process begins here. Then the
path splits into misdemeanors or felonies. The next would be first appearance. If the case
is filed, either misdemeanor or felony, it moves into the summons bond hearing. And then
it actually goes into detail down here about all the different paths from there. So
I know that was just a really quick use case, but hopefully you can see
from here how good Gemini is at actually understanding the structure of these flow charts.
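If you wanted to run this same flowchart test over a raw HTTP request instead of the built-in nodes, the image goes in as base64 alongside the prompt. This is a hedged sketch: the `inline_data` part shape follows the Gemini REST docs, and the bytes below are a placeholder rather than a real image:

```python
import base64
import json

def build_image_request(image_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Pair a text prompt with an inline base64 image -- roughly what the
    'describe the process in the image' call does under the hood."""
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {
                    "inline_data": {
                        "mime_type": mime_type,
                        "data": base64.b64encode(image_bytes).decode("ascii"),
                    }
                },
            ]
        }]
    }

placeholder = b"\x89PNG\r\n\x1a\n"  # stand-in bytes, not a real PNG
body = build_image_request(placeholder, "image/png",
                           "Describe the process in the image")
print(json.dumps(body)[:100])
```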
But anyways, depending on your use case, the way you need image or video analysis in your workflow might be different. So let's say maybe you
want to automatically scan in certain maintenance tickets or types of requests like that, or
maybe you rent out cars, whatever it is. So let me just go ahead and
do a few more examples. This first one is me sending in a picture of
some wall damage. So maybe you are a landlord or something like that. And we're
going to have OpenAI and Google once again, tell us what they're seeing in the
types of damage. Okay. So here's the image I just submitted and here's OpenAI's answer,
which is that the image shows visible water damage along the lower section of the
interior wall near the floor. The wall has a large irregular brown stain indicating water
intrusion. There's also peeling and bubbling. So the paint has bubbled and peeled away from
the wall in numerous spots and there's potentially mild mold or mildew. So that's what
we got from OpenAI. Now here is Google's answer. It says similar things. There's water
or moisture damage. There's severe peeling and flaking paint. There's water staining and discoloration. So
the overall diagnosis is that because of all this damage that we can see here,
there's probably a leak from behind the wall or previous flooding where water wicked up
from the floor. All right. So just as another example, let's do a car scratch
and it's going to be very minor and maybe even a little tough to see.
So we'll see if maybe there's a difference here between OpenAI and Google. All right.
We got the image on the left and you can see a little bit of
scratching down here. And here is OpenAI's response. So we see scratches and paint damage.
Noticeable surface scratches are present on the lower part of the rear passenger door, just
above the side skirt. A minor dent along the lower rear edge of the rear
passenger door, which I guess it looks like there is a little bit of a
dent, but not super visible. Now, if we go look at Google's response, we have damage located on the rear passenger side of the vehicle, in the wheel arch or dog-leg area, which is the most severe damage. We've got paint damage and rust, which definitely looks to be true. And the summary is that the vehicle appears to have sideswiped an abrasive object like a low wall or a pillar, and this resulted in long scuffs on the door and a deeper impact near the wheel well that has since begun to rust. OK, so the point here is that honestly, both of these
models are doing a good job at analyzing these images. But with the actual benchmarks showing that Gemini 3 Pro is significantly better than all the other models at this, I would definitely use Gemini 3 Pro for your automations where you need some image analysis. So we know that Gemini 3 has a context window of a
million tokens that it can take in and its knowledge cutoff is January of 2025.
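As a quick sanity check on what a million-token window actually holds, here's some back-of-the-envelope math. The ~4 characters per token figure and the ~3,000 characters per page are rough rules of thumb I'm assuming here, not tokenizer-exact counts:

```python
# Rough context-window math for pasting a large PDF into a system prompt.
CONTEXT_WINDOW_TOKENS = 1_000_000   # what Gemini 3 Pro can take in
CHARS_PER_TOKEN = 4                 # common rule of thumb
CHARS_PER_PAGE = 3_000              # assumed average for a dense filing

pages = 121
est_tokens = pages * CHARS_PER_PAGE / CHARS_PER_TOKEN
share = est_tokens / CONTEXT_WINDOW_TOKENS
print(f"~{est_tokens:,.0f} tokens, about {share:.0%} of the window")
```

That estimate lands right around the token counts we actually measure in the evaluation runs later on.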
So a lot of times you're going to have to give it access to like
a vector database or just put knowledge in its context window in order for it
to actually give you accurate results. So what we're going to do here is take this 121-page PDF, which is Apple's 10-K filing, and I'm just going to copy all of this text. And we're going to go ahead and put it in the system prompt of this AI agent right here. And so now all 121 pages of the PDF live in the system prompt, which the AI agent will
look at every time it needs to answer a question. And so what we're going
to do here is run an NNN evaluation, where we're going to pass through these
10 questions with 10 correct answers. And then we're going to have GPT evaluate how
close the answers are to the correct answers. So in our chat model, we have
Gemini 3 Pro Preview. And if I switch over to evaluations, you can see that this is being run right now. So I'll check back in with you guys after all 10 of the questions have been answered by Gemini 3 Pro, and we will see what the correctness score is. And while we're waiting for this, if you don't fully understand the n8n evaluations feature and how it works, I made a full video. I will tag that right up here if you want to give it a watch. All
right. So this run just finished up and you can see that we have a
correctness score of 4.6, and that's out of five. So that's pretty high. The other thing to notice is that for total tokens, on average across all 10 runs, it was under 100,000. It was coming in at around 98,000 tokens. And
so that just goes to show with a model like this that has a context
window of a million tokens, a 121-page PDF like this one is not even a tenth of the way to filling that entire context window. So we got 4.6 here as the correctness. I'm going to go ahead and switch this to
a different chat model, and we're going to see how correct it gets it from
there. All right. So Gemini 2.5 Flash came back with a 4.5 correctness. So still pretty good, but not as good. What we do see, though, is that it was much cheaper and much quicker. So in your scenario, if speed and cost are a huge factor, then you would probably want to experiment more with Gemini 2.5 Flash for this use case. But keep in mind, this is like a
very, very simple example, just so we can actually run a few evaluations. So I
went ahead and ran one more evaluation with GPT-5 mini, and you can see it actually did score a 4.6 as well. But really the point I'm trying to
make here is that 10 of these examples is really not enough to know which
of these models is the best. And it's not as simple as like which model
is best. It's which model is best for this specific use case. And that's how
you can get in here with your evaluations to understand that. All right. Now, moving
on to this final section down here, we're going to have Google Gemini 3 Pro actually build us an n8n workflow. And I'm not going to explain exactly how this
flow is actually working. If you want to see the full video, I will attach
that right up there if you want to check it out. But let me just
go ahead and ask this workflow to build something for us. All right. So I
just shot that off. I basically asked it to create an automation that will take my Fireflies call recording transcripts and analyze them. It will look up the person that I had a call with, do research about them and their business, and give me an internal brief about what I can actually do to help them, sort of like an informal AI audit discovery roadmap. So this Gemini 3 Pro n8n
builder right now is going to create the JSON. And the reason why I wanted
to try this is because we saw some really cool benchmarks with Gemini 3 Pro
about coding. And so it'll be thinking about how to actually structure this workflow, but
also building it for us. And then it will create that n8n flow. And we'll
go ahead and look at it and see how good it is. So I'm just
going to open up this link real quick that it provides us. And this should
load up, hopefully, a workflow that is pretty robust. Okay. So the only thing that
it really messed up here on first glance was the chat model. And that's simply
because this is kind of an outdated prompt that I was using. So let's just
say it threw in an open router model and we're connected. What we've got here
is the webhook we would configure to Fireflies to immediately send a transcript to us.
We would then feed that transcript information into this AI agent. And this one is prompted to analyze the transcript, use the research tool, find specific pain points, synthesize the
transcript, and generate an internal brief. And the brief must have a prospect profile, meeting
summary, pain points, and AI audit discovery. We can also see for the research tool,
this is supposed to be using Tavily. And it's already filled out this whole request
for us. So we've got Tavily there. We also have the actual body setup besides
our Tavily API key. And it's actually interesting, because this is using an outdated version of the HTTP Request node. As you can see, it's using version 1.1, which has been deprecated. But once again, the reason it's doing that is because I fed it in some knowledge about n8n, and that knowledge references some older node versions. But after it makes that brief, it would go ahead and shoot it off to us in an email. Anyways, I'm just going to try one more before we wrap up here. Okay. So I just shot off another one that says, create me
a daily newsletter that will search sources like Google and Perplexity every day and try
to find new discounts on AI tools. If it finds good ones, it should shoot
me an email. Otherwise, it has to do nothing. But all of the results every
day should be logged in a Google Sheet. So I'll check back in with you
guys when this is done, and we'll take a look at the workflow. All right,
that just finished up. Let's go ahead and click into this link and see what
we got for this workflow. Okay, pretty cool. So we've got our daily
schedule. We've got our AI deal scout. And it also tells us what configuration is
still required and the logic flow. Now, keep in mind that the actual agent does
not have live data. It has static data about the n8n docs, which is why it was probably unable to pull in Perplexity or SerpAPI or something like that. But it does know Tavily, so it pulled that in. And it tells us that we still have to
configure a few things here. Now, the agent itself has a user message saying, hey,
here's today's date, search the internet, find results. And if you did find deals, set
discounts found to true. And here's what you do if you don't find any. And
then in the system message, it says that you are a deal hunter bot. Your
goal is to save the user money on AI software. Be skeptical and only report
active verified offers. But from there, you can see it also has a structured output
parser. And it's outputting discounts found as true or false. And it's also going to
output the log summary. And if there are deals found, it will also output an
email subject and an email body. Then it's going to link all of that to a Google Sheet, which we would have to set up here. But it will submit
the date, the status, and the summary. And then based on that status, it will
check over here if deals were found. And if they were, it would actually send
us that email notification. All right, guys. So there is one more piece that I
felt obligated to tell you guys about. And that is, when you want to use an agent with Gemini 3 Pro and you want that agent to be able to call tools, it doesn't really quite work yet. Well, it does, but it doesn't. And
let me just show you what I mean by that. So if we're looking at
the documentation for the Gemini 3 API, we can see something here called thought signatures.
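Roughly, at the payload level, it works like this. A hedged sketch: the part layout is based on the Gemini function-calling docs, but the signature value, email address, and function name are invented placeholders, and the role labels should be verified against the current docs:

```python
# Turn 1 response: the model asks for a tool call and attaches an opaque
# signature alongside the functionCall part.
model_turn = {
    "role": "model",
    "parts": [{
        "functionCall": {
            "name": "send_email",  # hypothetical tool name
            "args": {"to": "nate@example.com", "subject": "Lunch this weekend?"},
        },
        "thoughtSignature": "opaque-base64-blob",  # placeholder value
    }],
}

# Turn 2 request: the conversation history you send back must include that
# model turn verbatim -- signature and all -- plus the tool's result.
follow_up = {
    "contents": [
        {"role": "user", "parts": [{"text": "Send Nate an email about lunch"}]},
        model_turn,  # echoed unchanged; dropping thoughtSignature breaks the call
        {"role": "user", "parts": [{
            "functionResponse": {"name": "send_email",
                                 "response": {"status": "sent"}}
        }]},
    ]
}

# This is the field that gets dropped, which causes the error we're about to see:
assert "thoughtSignature" in follow_up["contents"][1]["parts"][0]
```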
It says Gemini uses thought signatures to maintain reasoning context across API calls, and that these signatures are encrypted representations of the model's internal thought process. To ensure the model maintains its reasoning capabilities, these signatures must be sent back to the model when you're doing tool calling in order for it to actually work. So basically the flow here
is, let's say I ask this agent to send an email. It's going to go
to the chat model, understand what to do. It's going to go back to the
agent. It's going to go to the tool and send that email. And then all
of that's going to come back to the chat model. So it can respond to
us and say, I sent the email to X and here's what I said. And
so that's where we get the error is on that second time we go back
to the model. And I'll just show you guys what I mean by that. What
I'm going to do is send off this message that says, send an email to nate@example.com asking if he wants to get lunch this weekend. And what
you just saw happen is exactly what I said. The brain calls the tool, the
tool works correctly. So if we go to the email, we'll actually see that this
did get sent. But then what happens here is we get an error. And when
I open this up, it says bad request, error fetching from this URL, because the function call is missing a thought signature in the function call parts. This is required for tools to work correctly, and a missing thought signature may lead to degraded model performance. Now,
the issue is when we make this request to Gemini through this native integration, we
don't actually have the ability to add in that thought signature or change any of
those parameters. And the way that this documentation shows how we should be doing this
is, you know, if you went through Python or some other custom way that you
can send off these requests with tool calling, you would be able to add that
stuff in there. And so first I thought, okay, maybe it's just a limitation of this Gemini node. Maybe we could do that through OpenRouter. I have my Gemini 3 Pro Preview here. I don't see anything in here about thought signatures, but let's just try
it again. So I'm going to shoot off that same message. We hit the model.
We're going to hit the tool. It's going to successfully send. And then we're going
to get another error over here once again. And this time it just says that
the provider returned an error. So it's the exact same error going on. And real
quick, just wanted to prove to you guys that all of these are being sent.
So right here, we're getting obviously a message because it's a fake email, but the
agent actually is sending the email about wanting to grab lunch. So we know that
the tool is working. It's just about the context of what just happened in this
tool being fed back into the model. So it can actually respond. So like I
said, I was doing research on this. I was reading through GitHub forums and n8n forums. And it seems like right now, this is the simplest way to put it: it's not working in n8n because the tool calling feature of Gemini 3 requires a special thought signature field in each function call, but n8n's built-in nodes do not add or keep this field when talking to Gemini. For it to work, n8n needs to update its Gemini nodes or workflows so that we can actually include and pass
the thought signature each time the tool is called. And then also just real quick,
what is a thought signature? It's an encrypted code from Gemini that saves what the
AI was thinking when it made a function call. Then that data gets sent back
so it can once again communicate what it actually just did. And so I just
wanted to save you guys all the headache, if you've gone through all this other stuff and you've played around with Gemini and you're loving it, and then you decide, okay, I'm going to go ahead and plug Gemini 3 Pro into all of my other agents that are calling tools, because that's where you're going to run into this error. Now, I don't come from like a super technical background. So if I completely just messed up and misinformed everyone,
then please in the comments, if you know what's going on here and you know
how to fix it, let me know. But to the best of my ability, that's
what I know. So anyways, I don't want this video to go too long. But if you guys want to play around with this exact workflow, so you can just test out some things, then I will leave it in my free Skool community, which you guys can go access and download from for completely free. That'll be linked in the description. And if you want to take your learning a little bit farther and be supported by over 2,500 members who are building with n8n and building businesses with n8n, then
check out the Plus community, which will also be linked in the description. We've got four full courses in here. We have Agent Zero as the foundations of AI automation for beginners; 10 Hours to 10 Seconds, where you learn how to identify, design, and build time-saving automations. And then for our premium members, we have One-Person AI Agency, where you learn how to lay the foundation for a scalable AI automation business.
And then we have Subs to Sales, which is kind of my framework for how
I was able to grow a YouTube channel in the AI automation niche and use
that to power my businesses. We've also got one live call per week in here.
So I'd love to see you guys in those calls in the communities, but that's
going to do it for today. So I hope you guys enjoyed the video. If
you did, or if you learned something new, please give it a like. It definitely
helps me out a ton. And as always, I appreciate you guys making it to
the end of the video. I'll see you on the next one. Thanks so much,
everyone.
Full courses + unlimited support: https://www.skool.com/ai-automation-society-plus/about
All my FREE resources: https://www.skool.com/ai-automation-society/about
14 day FREE n8n trial: https://n8n.partnerlinks.io/22crlu8afq5r

Gemini 3 Pro is here, and the benchmarks are seriously impressive. In this video, I break down what makes Google's new chat model stand out, where it performs best, and where it still falls short. I also run real tests inside n8n so you can see exactly how it behaves in automations and AI agent workflows. If you want to learn how to connect Gemini 3 Pro to n8n, when to use it, and how it can upgrade the systems you build, this video will walk you through everything step by step.

Sponsorship Inquiries:
📧 sponsorships@nateherk.com

TIMESTAMPS
00:00 What is Gemini 3
01:27 What We're Covering Today
02:39 Gemini Model Family Overview
03:26 Gemini 3 Benchmarks
05:05 Connecting to Gemini 3 in n8n
09:55 1) Image Analysis
13:28 2) Context Window Reasoning
15:58 3) Gemini Building n8n Workflows
19:36 Current Limitations and Considerations
23:21 Want to Build an AI Automation Business?