I Built a Photoshop AI Agent in n8n with no code (NanoBanana) | DailyDevLists

Full Transcript

4,273 words • EN

Today I'm super excited to share with

you guys this Photoshop AI agent that I

built in n8n with no code. It uses

Google's new Nano Banana image

generation model which is absolutely

insane. It's really changing the game

for ad creatives and UGC content. So I

don't want to waste any time. We're

going to dive into it and I'm going to

show you guys exactly how this system

works. And as always, I'm going to give

all the resources that you need to set

this up completely free. So stick around

so I can show you guys how to do that.

All right. So here is the master

Photoshop agent. And you can see it's

not too complex. Up front, what we have

is the ability to take text or image as

an input. And then the agent has five

different tools to choose from. It's got

two image gen tools where it can

combine images or edit existing images.

And then it has three file handling

tools in Google Drive where it can

change the name of a file, search

through raw files, or search through AI

generated images that it's created for

us. And it can manage all of this right

in your pocket, right through Telegram.

So let's hop into a demo. Okay, so you

can see that our workflow is listening

to us. I'm going to send a picture on

Telegram.

Okay, so I just shot that off and now

you can see it's uploading that to

Google Drive. The agent's going to see

that and then it's going to ask us what

we want to name that file in our Google

Drive. So there you go. It just

responded with what would you like me to

name that photo? And I'm shooting back a

message that says call it Nate. And what

it's going to do now is use this change

name tool to change the name in our

Google Drive to Nate. As you can see, if

I switch over real quick to media, it

just got called Nate. And originally it

would have been called today's date. So

now it's asking us what we want to do

next. And I'm going to shoot off this

picture of a bag of granola. And once

again, it's going to ask us what do we

want to name this file. And just to show

you guys real quick, in the media folder

that it's dropping into, like I said, it

defaults to naming everything today's

date. So now I'm going to show you guys

that it actually does change it. All

right, so shooting off this message that

says to call that picture granola. And

once again, we'll see it go here and

change that picture to granola. And then

we'll have it combine the images

together. All right, that just finished

up. We'll check the media folder, and we

can see that picture got called granola.

Okay, so I've got Telegram open on my

desktop now just to make this easier.

And it's asking us what to do next. I'm

shooting off this message that says,

"Please combine the Nate and granola

pictures to make a photorealistic image

where the man is holding their granola

while hiking on a mountain." So, if you

remember over here, we have a raw file

of granola. We have a smiling selfie of

me. And it's going to put these

together. Right here, you can see it's

using its combine images tool, and it's

going to use Nano Banana to make it look

like I'm holding the granola on a

mountain. So, I'll check in with you

guys when that's finished up. Oh, and

one other thing. While this is running,

normally the agent would go search

through the files to make sure it has

the right IDs to combine the images. But

because it's using its memory, it

already knows that. So that's why it

didn't hit the other file handling

tools. But you can see we got a response

over here. It's combined the images.

Now, let me click into this Google Drive

link and we'll take a look. So there we

go. We've got the KIND granola with all

the spelling correct. We've got me and

my face holding the granola while hiking

on a mountain. And just to show you guys

what that would look like if the files

weren't just immediately uploaded and we

wanted to pull different ones. So let's

see. We have a Hormozi picture right

here that's insanely low quality. We've

got a JBL speaker picture right here. So

what I'm going to do now is say please

combine the Hormozi image with the JBL

speaker image to make it look like the

man is listening to the speaker on a

boat. And now you can see it's searching

through the raw files. It has those

correct IDs. And now that it has those

file IDs, it can pass that to the

combine images tool in order to actually

send those both to Nano Banana and get

back an image. There you go. You can see

it's calling the combine images tool.

So, I'll check in with you guys when we

get that back. All right, so it just

told us it was done. I'm going to go to

the AI image generation folder, and you

can see we do have a new one. Let's

click in to see how it turned out. Not

too bad at all. We have the JBL speaker.

We've got Hormozi in his

acquisition.com beater, and it looks

pretty good. And remember, these images

are only going to get better and better

when you come in here and you customize

the prompts and you customize all this

other stuff because what's actually

going on is this agent is creating its

own prompt when it sends over variables

to this workflow. So, if we had a

dedicated agent just focused on like

creating optimized AI image generation

prompts, the results would probably be a

lot better. All right. And

just to show off the functionality of

the edit image tool, I'm going to shoot

off this message that says to create a

photorealistic advertisement of the

granola image, make it look like it's

being held in front of the Eiffel Tower.

So, this one is going to have to find

the file ID of the granola image,

whether that's in the memory or if it

has to search the raw files. Looks like

it just searched the raw files as you

can see. And now it's going to call the

edit image tool in order to change the

appearance of that image. Okay, so looks

like that just finished up. I'm going to

go to the AI image generation folder.

You can see we have granola ad Eiffel.

If I click on it, you can see that it is

our bag of granola which looks exactly

like that and it is being held in front

of the Eiffel Tower. So, if that's not a

good ad creative, then I don't know what

is. And so, all of the words are pretty

much spot-on except for right up here at

the top. As you can see, this is

supposed to say ingredients you can see

and pronounce. But two things, the first

thing is that in the raw files, this is

a pretty low quality image. So, like I

can't even read what that says right

there. And then the second thing is this

image generation model is only going to

get better and better and of course the

prompting that is in this current

workflow is very minimal. But anyways,

now that you guys have seen how this

works, let's start to dive into what's

going on with all of these nodes. So the

first thing I'm going to start off with

is the text or image input. What happens

here is we're using a switch basically

to detect if a photo exists or if text

exists. And if photo exists, we go up

this way where we download it, upload it

to Google Drive, and then we shoot that

off to the agent. But if text exists,

then we just shoot the text straight

away. But what we have to do is make

sure that the variables are the same so

that it's always coming through in a

field called json.message.text. So we

basically just standardize the input so

the agent looks at it no matter what

comes through. Naturally, the next thing

that we'll go over is the system prompt,

which is very, very minimal. I said,

"You are a personal assistant agent.

Your job is to use the tools you have

access to to help the user with their

request." I listed out all the tools and

I gave a very minimal description of

each of them because in the tools

themselves, there's a brief description,

but I still like to put them here. And

then for the instructions, I only ended

up giving it one instruction, which was

if the user submits a photo, ask what

they want to name it by saying, "What

would you like me to name that photo in

your Google Drive?" Then once they

respond, change the name using your

change name tool. The system prompt is

very minimal. We have only five tools to

choose from, and we're using GPT 5.1 and

it's doing really well. And from here,

what I would do is as I'm testing more

and as you guys download the template

and start to play around with different

things, just add instructions to your

system prompt when you realize it's

failing to do certain things. And then

real quick, just to hit on what's going

on over here before we dive into the

tools is we're using GPT 5.1. We're

using Sonnet 3.5 as the fallback model.

And then we're just using simple memory

with the session ID being the Telegram

chat ID. All right, so first I'm going

to go over the file handling tools

because they're super simple and we'll

just get that out of the way. The first one

is change name. So if I click into this

tool, you can see what we're doing is

we're updating a file and we're updating

it by the ID and the ID is automatically

found by the agent. So if the agent

doesn't have the ID of a file to change

the name, it will have to go use its

other tools to find that ID and then it

will pass the ID to this tool. And the

other thing that gets passed to this

tool is the new name. So when we say,

hey, call that photo Nate, it knows to

fill this in right here with the word

Nate. And then these tools are doing the

exact same thing. The only difference is

the folder that they're searching in. So

if I say to search for raw files, what

we're doing is we're searching within a

folder called media and we're searching

for files. But if the agent realizes

that we want to search through an image

that it already created for us, it would

use this one which is search AI images

and it's doing the exact same thing

except for we change the folder that

it's looking through. So like I said,

those three are super simple, but those

are necessary in order to change things,

look at things, and search through the

database. Okay, so now let's get into

the fun stuff which are these two custom

workflows that I built. And what that

means is if I open up this workflow, you

can see that it is a custom n8n

workflow that I built myself. We're able

to make our main agent call on it

because you can see that if we add a

tool right here, we have an option to

call an n8n workflow as a tool. So

that's how we can create these really

modular systems because now if I ever

want to create another agent that can

combine images, I can just plug it into

this workflow right here. So anyways,

there's a lot of stuff going on here. So

let's just take it node by node. The

first thing that's going on is we have

the input where I'm defining to the main

Photoshop agent what to send over. So

you can see we're sending over an image

prompt, image one, which is an image ID,

image two, which is the image ID of the

second image, and then the image title

for the new image that's created. And

that may seem a little confusing, but

when we go back to the main Photoshop

agent, it looks at this and it says,

"Okay, when I need to call this tool,

which is combining images, I need to

send over a prompt, image one, image

two, and image title." So, this is how

it knows to use its brain to give the

second workflow all of this information.
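
To make that input contract concrete, here is a minimal sketch of the four values the main agent has to fill in before it can call the combine images tool. The field names and the sample values are illustrative assumptions, not the exact names used in the n8n template.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CombineImagesInput:
    """Values the main agent sends to the combine-images sub-workflow (names are assumed)."""

    image_prompt: str  # what the final picture should show
    image1: str        # Google Drive file ID of the first image
    image2: str        # Google Drive file ID of the second image
    image_title: str   # name for the newly generated file

    def validate(self) -> None:
        # Reject empty fields so the sub-workflow never starts with a partial request.
        for name, value in asdict(self).items():
            if not value.strip():
                raise ValueError(f"missing required field: {name}")


request = CombineImagesInput(
    image_prompt="The man is listening to the speaker on a boat",
    image1="drive-file-id-1",  # placeholder ID
    image2="drive-file-id-2",  # placeholder ID
    image_title="speaker boat",
)
request.validate()
print(asdict(request))
```

Declaring the inputs up front is what lets the agent treat a whole workflow like a single function call.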

And to make this easier, I pulled in the

live data from that run we did in the

demo, just so you guys can see that what

comes in is here's the image prompt with

Hormozi and the speaker. Here's the ID of

the first one, the ID of the second one,

and the new name of the AI generated

image. So from there, what we do is we

edit fields just to create an array of

the two image IDs. We have to create an

array so that we can split these out

into two separate items because we need

to download two files and we need to get

a public URL of both of those files. So

in the Google Drive node, we're

downloading by ID. So you can see we're

getting both of these pictures here.
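
The array-then-split step described above can be sketched roughly like this. It only mirrors the idea of the edit fields and split out nodes; the key names `image1`, `image2`, and `image_id` are assumptions for illustration.

```python
def to_items(payload: dict) -> list[dict]:
    """Turn one payload holding two image IDs into two separate items."""
    # "Edit fields" step: collect both IDs into a single array.
    image_ids = [payload["image1"], payload["image2"]]
    # "Split out" step: one item per ID, so the download step runs once per image.
    return [{"image_id": image_id} for image_id in image_ids]


items = to_items({"image1": "drive-file-id-1", "image2": "drive-file-id-2"})
for item in items:
    print("download", item["image_id"])
```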

There's the Hormozi one and there is the

speaker one. And then the way that the

nano banana API works, it has to take a

public URL of an image in order to

actually change it. So what we do is we

use a free service called ImgBB and

I'm basically just uploading these

images as binary and then it gives us

back a public URL that represents the

image. So here you can see this is what

we get back and if I open that up, it is

the picture of Hormozi.
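
For anyone curious what that upload looks like outside of n8n, here is a rough sketch. It assumes ImgBB's public upload endpoint accepts a form with `key` and a base64 `image` field and answers with the public link under `data.url`; check the ImgBB API page before relying on it. The request is only built here, not sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

IMGBB_UPLOAD_URL = "https://api.imgbb.com/1/upload"  # assumed endpoint


def build_upload_request(api_key: str, image_bytes: bytes) -> Request:
    """Build (but do not send) a POST request that uploads binary image data."""
    form = urlencode(
        {
            "key": api_key,
            "image": base64.b64encode(image_bytes).decode("ascii"),
        }
    ).encode("ascii")
    return Request(IMGBB_UPLOAD_URL, data=form, method="POST")


def public_url_from(response: dict) -> str:
    """Pull the public image URL out of the parsed JSON response (assumed shape)."""
    return response["data"]["url"]


upload = build_upload_request("YOUR_API_KEY", b"raw image bytes")
print(upload.get_method(), upload.full_url)
```

Sending it would be a matter of passing `upload` to `urllib.request.urlopen` and parsing the JSON that comes back.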

So that's just a cool little workaround

if you need to get a binary image into

a public URL. There's other ways to do

it, but this is just a free easy way

that I do it. So once we have both those

URLs back, we aggregate them so that we

can make one API request rather than

two. So we're making one request to Nano

Banana through a service called fal.ai.

So I'm not going to dive super super

deep into what's going on here. If you

want to watch an API video I made, I'll

tag that right up here. But what's going

on is we're using our fal credentials.

So I got my API key from fal.ai. And

then this JSON body request is really

simple. We're passing over a prompt and

we're passing over two image URLs. And

the only reason why this prompt looks a

little confusing is because I put these

two replace functions in there that

basically make sure if the prompt has

new lines or double quotation marks

that it gets rid of that because that

would break the JSON body request. So

now it has everything it needs. It has

both the images and it has the prompt.
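
Here is a small sketch of that cleanup idea. The first function mimics the two replace calls; swapping new lines for a space is an assumption about how they are removed. The second shows an alternative: if the body is built with a JSON serializer instead of a hand-written string, quotes and new lines get escaped automatically. The field names `prompt` and `image_urls` are assumptions based on the description above.

```python
import json


def sanitize_prompt(prompt: str) -> str:
    """Drop characters that would break a hand-written JSON string."""
    return prompt.replace("\n", " ").replace('"', "")


def build_body(prompt: str, image_urls: list[str]) -> str:
    """Serialize the request body; json.dumps escapes quotes and new lines itself."""
    return json.dumps({"prompt": prompt, "image_urls": image_urls})


raw = 'Make it "pop"\nplease'
print(sanitize_prompt(raw))
print(build_body(raw, ["https://example.com/a.png", "https://example.com/b.png"]))
```

Either approach keeps the body valid; the serializer route just keeps the original prompt text intact.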

And then it basically says, "Okay, I

received your request. We're working on

that right now." And because it's

working on it, we wait for about 10

seconds and then we make a request back

to fal to see if it's done. And if it's

not done, it will come here and it will

wait for 30 seconds. And honestly, this

should probably change to like four

because images are really quick. And

then it will just continuously check

until it's done. So that's just another

cool guardrail you can have in your

workflows. And then once it gets that

result back, it basically gives us a

URL. And so what I do is I make a simple

GET request to that URL to actually get

the image itself. And now we have the

image as binary. We can upload it to

Google Drive. And then we can set our

response to the main agent which is

basically saying, "Hey, the image was

created and this is what we named it and

here's the link to that image in your

Google Drive." And then the main agent

gets that and responds to us, the user.
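
The wait-and-check loop from a moment ago can be sketched like this. The 10 and 30 second waits come from the walkthrough; the `"COMPLETED"` status string and the cap on the number of checks are assumptions added for illustration, since the loop as described keeps checking until it is done.

```python
import time
from typing import Callable


def poll_until_done(
    check_status: Callable[[], str],
    first_wait: float = 10.0,
    retry_wait: float = 30.0,
    max_checks: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Wait, then check the job status repeatedly; return the number of checks made."""
    sleep(first_wait)  # give the image job a head start
    for attempt in range(1, max_checks + 1):
        if check_status() == "COMPLETED":  # assumed status value
            return attempt
        sleep(retry_wait)  # not done yet, wait before asking again
    raise TimeoutError(f"job not finished after {max_checks} checks")


# Simulated run: the job finishes on the third check and no real waiting happens.
statuses = iter(["IN_PROGRESS", "IN_PROGRESS", "COMPLETED"])
checks = poll_until_done(lambda: next(statuses), sleep=lambda seconds: None)
print(checks)
```

Passing `sleep` in as a parameter makes the loop easy to try out without actually waiting.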

So hopefully that wasn't too confusing

and at a high level you can understand

what's going on. Remember this will be a

free workflow you guys can download in

my free Skool community as well. So the

best way to really understand it is to

download all the assets, run it, and

then go node by node and understand

what's going on. And real quick, the way

you would actually get all this is you

join my free Skool community. The link

for that down in the description. When

you get there, this is what it will look

like. And all you need to do is go to

YouTube resources and find the post

associated with this video. If you're

having trouble finding it, you can also

use the search bar to search for the

title of the video. But then once you

get in there, you'll see the video and

then you'll also see all of these JSON

files that you can download and then

import directly into your n8n. And then

when you get all the stuff set up, there

will be a big sticky note right here

called a setup guide and it will tell

you exactly what you need to do to get

up and running. Okay, cool. So that was

the combine images node where we used

Nano Banana. Now what's cool is the edit

image node is very very similar. So if I

open this up and I actually just click

on view sub-execution, we can see exactly

what happened when it was called. And

it's very similar to the previous one.

Literally the only difference is that

instead of passing in two image IDs,

we're only passing in one. So what

happens once again is the input is image

title, image prompt, and image ID. So

here's the run from the granola Eiffel

Tower ad. As you can see, we're going to

go to Google Drive once again, pass in

that image ID so we can download the

file. Then what comes next is we are

once again sending that binary data to

ImgBB to get a public URL. So right

here you can see this is the public URL

we get for our granola picture.

And then now that we have that we can

make another request to fal.ai to use

Nano Banana where we send over the prompt

and then just one image URL instead of

two. So it's very very similar. We're

using the same guardrails in here to get

rid of new lines and new spaces and

quotation marks, all that kind of stuff.

And then it basically just edits that

image. We're doing the exact same thing

here with a polling check. And then when

we get the result, we download it as

binary rather than keeping it as a URL,

put it in our Google Drive, and then we

send a response back to the main agent.

Real quick, just wanted to touch on some

of the pricing for this system. The

first thing that I wanted to cover was

fal.ai.

What fal is is like a place where you

can have a ton of different image and

video generation models and you can just

use one credential and, you know, get

all of them through there. So, it's

really cool. You can see right here my

recently used are Nano Banana Edit.

We've got Veo 3 Fast, Veo 3. You can go to

explore and you can see all of these

other models that they have available.

But anyways, the reason why I wanted to

show you guys this is because first of

all, it's only about 4 cents per image,

which is not too bad at all. But what

you can do is you can play around with

like image URLs and prompts here. So you

can really refine the way you want to

have your prompting before you get into

n8n and start messing around there. And

then the way that we can call it through

n8n is by going to their API

documentation. I would change this to

HTTP (cURL). And now you can look through

like how to get your API key, how to set

that up, how to submit a request, how to

upload files, all this kind of stuff.
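
As a quick sanity check on the pricing, here is the arithmetic at the roughly 4 cents per image quoted above. The rate is the only number taken from the walkthrough; actual pricing may differ, so treat it as an estimate. Working in whole cents avoids floating-point rounding.

```python
CENTS_PER_IMAGE = 4  # approximate rate quoted above


def images_for_budget(budget_cents: int) -> int:
    """How many images a budget covers at the quoted rate."""
    return budget_cents // CENTS_PER_IMAGE


def cost_in_cents(image_count: int) -> int:
    """Total cost in cents for a number of generated images."""
    return image_count * CENTS_PER_IMAGE


print(images_for_budget(100))  # images you can generate for one dollar
print(cost_in_cents(250))      # cost in cents of a 250-image batch
```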

And I also did see about a week ago that

you could get free image generation

through OpenRouter for Gemini Nano

Banana because if you go to the Gemini

web app, it's free to try out there. And

so for a while it was free here on

OpenRouter, but it looks like they might

have just taken that off. So

unfortunate, but once again, you can get

25 images for only a dollar. So it's not

too bad. Okay, so that's basically all

that's going on in this workflow. Now,

let's talk about a few ways that when

you get this template that you could

customize this and make it a little more

production ready. So, something that I

alluded to earlier was in these two

tools where you're doing image

generation would be to have a dedicated

AI that would create the system prompts

or the AI image generation prompts. So

what you could do is like literally

right here just add an AI agent node and

this one is prompted in a way where it's

specializing in creating AI image

generation prompts and then you pass that

prompt all the way down to the create

image node rather than relying on this

main agent who has tons of other jobs to

do relying on him to make the image

prompt. It would be better to have a

second agent in this workflow that would

specialize in that. There's of course

other things you can do like having a

logger. So in a previous agent I've made

that was kind of similar. This one was

an ultimate media team. What I did here

was I had it returning all of its steps

into a Google sheet, whether it errored

or it was successful. So you could see

exactly what input was processed, what

tools were called, how many tokens that

was taking you. So you could use this

video, which I will tag right up here if

you haven't seen it. You can download

those resources and do the exact same

thing in this agent where you can have a

Google sheet that's logging everything.
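
A minimal sketch of that kind of run log is below. It writes CSV rows as a stand-in for the Google Sheet, and the column names and tool names are illustrative assumptions rather than the ones used in the media team workflow.

```python
import csv
import io
from datetime import datetime, timezone

FIELDS = ["timestamp", "input", "tools_called", "tokens", "status"]


def log_run(sink, user_input: str, tools_called: list[str], tokens: int, status: str) -> None:
    """Append one row per agent run, whether it succeeded or errored."""
    writer = csv.DictWriter(sink, fieldnames=FIELDS)
    if sink.tell() == 0:  # empty sink: write the header row first
        writer.writeheader()
    writer.writerow(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "input": user_input,
            "tools_called": "; ".join(tools_called),
            "tokens": tokens,
            "status": status,
        }
    )


sheet = io.StringIO()  # stands in for the spreadsheet
log_run(sheet, "combine Nate and granola", ["search_raw_files", "combine_images"], 120, "success")
log_run(sheet, "edit the granola image", ["edit_image"], 95, "error")
print(sheet.getvalue())
```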

And then of course another really cool

next step would be being able to take

these AI generated images and pass that

to another workflow that can create

videos out of them and you could use

like Veo 3 Fast to create those videos for

you. That's another thing that I did in

this media agent where I have like an

image-to-video tool, a create video tool.

So definitely check out that video as

well if you want to literally use some

of these workflows and then just hook

them up to your Photoshop agent,

because that's the beauty of these

custom workflows as a tool: you can have

them hooked up to as many different

agents as you want. So once again, all

the resources for this video and every

single other YouTube video you've seen

on my channel, you can get for

completely free in my free Skool

community. You just have to come here,

look through the YouTube resources, and

as you can see, every single video has

all of the resources right here. And if

you're looking to take your skills a

little further with n8n and you're

also looking to understand how you can

monetize your AI automation knowledge,

then definitely check out my plus

community. The link for that will also

be down in the description. We've got

thousands of members in here who are

building and selling n8n services every

single day and they're always sharing

their learnings and their challenges.

It's a really cool space to be. We've

also got two full courses at the moment

with a third one coming about monetizing

your AI skills. But if you're a complete

beginner, you can start with the

foundations and then you get in here and

you master n8n and then you learn how to

start selling or consulting with this

knowledge. So, I'd love to see you guys

in these communities. But that's going

to do it for the video. If you enjoyed

or you learned something new, please

give it a like. It definitely helps me

out a ton. And as always, I appreciate

you guys making it to the end of the

video. I'll see you on the next one.

Thanks everyone.

Nate Herk | AI Automation

72 days ago

17:25

AI Automation & Agentic Workflows

Rank #5

Description

Full courses + unlimited support: https://www.skool.com/ai-automation-society-plus/about

All my FREE resources: https://www.skool.com/ai-automation-society/about

Have us build agents for you: https://truehorizon.ai/

14 day FREE n8n trial: https://n8n.partnerlinks.io/22crlu8afq5r

In this video, I show you how I built a no-code Photoshop AI agent inside n8n that can combine and edit images, search through all your files for raw content or generated images, and even rename them automatically. Powered by Google’s new state-of-the-art Nano Banana image model, this agent is a game-changer for ad creatives and UGC content, making high-quality image editing and management faster and easier than ever. As always, you can get all the free resources you need to copy this exact system inside my Free Skool Community.

Sponsorship Inquiries: 📧 sponsorships@nateherk.com

TIMESTAMPS

00:00 Agent Overview
00:49 Live Demo
05:02 AI Agent Configuration
06:39 File Handling Tools
07:39 AI Image Tools
13:31 Image Gen Pricing
14:46 Improving This Template
16:28 Get the FREE Resources
16:44 Want to Master/Monetize n8n?

Gear I Used:
Camera: Razer Kiyo Pro
Microphone: Blue Yeti USB

Video Details

Category: AI Automation & Agentic Workflows

Featured Date: November 13, 2025

Quality Rank: #5

AI Recommended