Today I'm super excited to share with
you guys this Photoshop AI agent that I
built in n8n with no code. It uses
Google's new Nano Banana image
generation model which is absolutely
insane. It's really changing the game
for ad creatives and UGC content. So I
don't want to waste any time. We're
going to dive into it and I'm going to
show you guys exactly how this system
works. And as always, I'm going to give
all the resources that you need to set
this up completely free. So stick around
so I can show you guys how to do that.
All right. So here is the master
Photoshop agent. And you can see it's
not too complex. Up front, what we have
is the ability to take text or image as
an input. And then the agent has five
different tools to choose from. It's got
two image gen tools where it can
combine images or edit existing images.
And then it has three file handling
tools in Google Drive where it can
change the name of a file, search
through raw files, or search through AI
generated images that it's created for
us. And it can manage all of this right
in your pocket, right through Telegram.
So let's hop into a demo. Okay, so you
can see that our workflow is listening
to us. I'm going to send a picture on
Telegram.
Okay, so I just shot that off and now
you can see it's uploading that to
Google Drive. The agent's going to see
that and then it's going to ask us what
we want to name that file in our Google
Drive. So there you go. It just
responded with what would you like me to
name that photo? And I'm shooting back a
message that says call it Nate. And what
it's going to do now is use this change
name tool to change the name in our
Google Drive to Nate. As you can see, if
I switch over real quick to media, it
just got called Nate. And originally it
would have been called today's date. So
now it's asking us what we want to do
next. And I'm going to shoot off this
picture of a bag of granola. And once
again, it's going to ask us what do we
want to name this file. And just to show
you guys real quick, in the media folder
that it's dropping, like I said, it
defaults to naming everything today's
date. So now I'm going to show you guys
that it actually does change it. All
right, so shooting off this message that
says to call that picture granola. And
once again, we'll see it go here and
change that picture to granola. And then
we'll have it combine the images
together. All right, that just finished
up. We'll check the media folder, and we
can see that picture got called granola.
Okay, so I've got Telegram open on my
desktop now just to make this easier.
And it's asking us what to do next. I'm
shooting off this message that says,
"Please combine the Nate and granola
pictures to make a photorealistic image
where the man is holding their granola
while hiking on a mountain." So, if you
remember over here, we have a raw file
of granola. We have a smiling selfie of
me. And it's going to put these
together. Right here, you can see it's
using its combine images tool, and it's
going to use Nano Banana to make it look
like I'm holding the granola on a
mountain. So, I'll check in with you
guys when that's finished up. Oh, and
one other thing. While this is running,
normally the agent would go search
through the files to make sure it has
the right IDs to combine the images. But
because it's using its memory, it
already knows that. So that's why it
didn't hit the other file handling
tools. But you can see we got a response
over here. It's combined the images.
Now, let me click into this Google Drive
link and we'll take a look. So there we
go. We've got the KIND granola with all
the spelling correct. We've got me and
my face holding the granola while hiking
on a mountain. And just to show you guys
what that would look like if the files
weren't just immediately uploaded and we
wanted to pull different ones. So let's
see. We have a Hormozi picture right
here that's insanely low quality. We've
got a JBL speaker picture right here. So
what I'm going to do now is say please
combine the Hormozi image with the JBL
speaker image to make it look like the
man is listening to the speaker on a
boat. And now you can see it's searching
through the RAW files. It has those
correct IDs. And now that it has those
file IDs, it can pass that to the
combine images tool in order to actually
send those both to Nano Banana and get
back an image. There you go. You can see
it's calling the combine images tool.
So, I'll check in with you guys when we
get that back. All right, so it just
told us it was done. I'm going to go to
the AI image generation folder, and you
can see we do have a new one. Let's
click in to see how it turned out. Not
too bad at all. We have the JBL speaker.
We've got Hormozi in his
acquisition.com beater, and it looks
pretty good. And remember, these images
are only going to get better and better
when you come in here and you customize
the prompts and you customize all this
other stuff because what's actually
going on is this agent is creating its
own prompt when it sends over variables
to this workflow. So, if we had a
dedicated agent just focused on like
creating optimized AI image generation
prompts, the results would probably be a
lot better. All right. And
just to show off the functionality of
the edit image tool, I'm going to shoot
off this message that says to create a
photorealistic advertisement of the
granola image, make it look like it's
being held in front of the Eiffel Tower.
So, this one is going to have to find
the file ID of the granola image,
whether that's in the memory or if it
has to search the raw files. Looks like
it just searched the raw files as you
can see. And now it's going to call the
edit image tool in order to change the
appearance of that image. Okay, so looks
like that just finished up. I'm going to
go to the AI image generation folder.
You can see we have granola ad Eiffel.
If I click on it, you can see that it is
our bag of granola which looks exactly
like that and it is being held in front
of the Eiffel Tower. So, if that's not a
good ad creative, then I don't know what
is. And so, all of the words are pretty
much spot-on except for right up here at
the top. As you can see, this is
supposed to say ingredients you can see
and pronounce. But two things, the first
thing is that in the RAW files, this is
a pretty low quality image. So, like I
can't even read what that says right
there. And then the second thing is this
image generation model is only going to
get better and better and of course the
prompting that is in this current
workflow is very minimal. But anyways,
now that you guys have seen how this
works, let's start to dive into what's
going on with all of these nodes. So the
first thing I'm going to start off with
is the text or image input. What happens
here is we're using a switch basically
to detect if a photo exists or if text
exists. And if photo exists, we go up
this way where we download it, upload it
to Google Drive, and then we shoot that
off to the agent. But if text exists,
then we just shoot the text straight
away. But what we have to do is make
sure that the variables are the same so
that it's always coming through in a
field called json.message.text. So we
basically just standardize the input so
the agent looks at it no matter what
comes through. Naturally, the next thing
that we'll go over is the system prompt,
which is very, very minimal. I said,
"You are a personal assistant agent.
Your job is to use the tools you have
access to to help the user with their
request." I listed out all the tools and
I gave a very minimal description of
each of them because in the tools
themselves, there's a brief description,
but I still like to put them here. And
then for the instructions, I only ended
up giving it one instruction, which was
if the user submits a photo, ask what
they want to name it by saying, "What
would you like me to name that photo in
your Google Drive?" Then once they
respond, change the name using your
change name tool. The system prompt is
very minimal. We have only five tools to
choose from, and we're using GPT 5.1 and
it's doing really well. And from here,
what I would do is as I'm testing more
and as you guys download the template
and start to play around with different
things, just add instructions to your
system prompt when you realize it's
failing to do certain things. And then
real quick, just to hit on what's going
on over here before we dive into the
tools is we're using GPT 5.1. We're
using Sonnet 3.5 as the fallback model.
And then we're just using simple memory
with the session ID being the Telegram
chat ID. All right, so first I'm going
to go over the file handling tools
because they're super simple and we'll
just get that out of the way. The first one
is change name. So if I click into this
tool, you can see what we're doing is
we're updating a file and we're updating
it by the ID and the ID is automatically
found by the agent. So if the agent
doesn't have the ID of a file to change
the name, it will have to go use its
other tools to find that ID and then it
will pass the ID to this tool. And the
other thing that gets passed to this
tool is the new name. So when we say,
hey, call that photo Nate, it knows to
fill this in right here with the word
Nate. And then these tools are doing the
exact same thing. The only difference is
the folder that they're searching in. So
if I say to search for raw files, what
we're doing is we're searching within a
folder called media and we're searching
for files. But if the agent realizes
that we want to search through an image
that it already created for us, it would
use this one which is search AI images
and it's doing the exact same thing
except for we change the folder that
it's looking through. So like I said,
those three are super simple, but those
are necessary in order to change things,
look at things, and search through the
database. Okay, so now let's get into
the fun stuff which are these two custom
workflows that I built. And what that
means is if I open up this workflow, you
can see that it is a custom n8n
workflow that I built myself. We're able
to make our main agent call on it
because you can see that if we add a
tool right here, we have an option to
call an n8n workflow as a tool. So
that's how we can create these really
modular systems because now if I ever
want to create another agent that can
combine images, I can just plug it into
this workflow right here. So anyways,
there's a lot of stuff going on here. So
let's just take it node by node. The
first thing that's going on is we have
the input where I'm defining to the main
Photoshop agent what to send over. So
you can see we're sending over an image
prompt, image one, which is an image ID,
image two, which is the image ID of the
second image, and then the image title
for the new image that's created. And
that may seem a little confusing, but
when we go back to the main Photoshop
agent, it looks at this and it says,
"Okay, when I need to call this tool,
which is combining images, I need to
send over a prompt, image one, image
two, and image title." So, this is how
it knows to use its brain to give the
second workflow all of this information.
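As a rough sketch of that contract, the four inputs the combine images workflow expects could be modeled like this. The field names and the example values are illustrative assumptions, not the exact names used in the n8n template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CombineImagesInput:
    """The four values the main agent sends to the combine images workflow.

    Field names are assumptions chosen for illustration.
    """

    image_prompt: str  # what the new image should look like
    image_1_id: str    # Google Drive file ID of the first image
    image_2_id: str    # Google Drive file ID of the second image
    image_title: str   # name for the generated image

    def __post_init__(self) -> None:
        # Reject blank values early so the sub-workflow never runs half-filled.
        for name, value in vars(self).items():
            if not value.strip():
                raise ValueError(f"{name} must not be empty")


# Hypothetical payload; the IDs are placeholders, not real Drive IDs.
example = CombineImagesInput(
    image_prompt="The man is listening to the speaker on a boat",
    image_1_id="hypothetical-drive-id-1",
    image_2_id="hypothetical-drive-id-2",
    image_title="speaker boat ad",
)
```

Declaring the inputs up front is what lets the agent know which values it has to fill in before it calls the tool.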
And to make this easier, I pulled in the
live data from that run we did in the
demo, just so you guys can see that what
comes in is here's the image prompt with
Hormozi and the speaker. Here's the ID of
the first one, the ID of the second one,
and the new name of the AI generated
image. So from there, what we do is we
edit fields just to create an array of
the two image IDs. We have to create an
array so that we can split these out
into two separate items because we need
to download two files and we need to get
a public URL of both of those files. So
in the Google Drive node, we're
downloading by ID. So you can see we're
getting both of these pictures here.
There's the Hormozi one and there is the
speaker one. And then the way that the
Nano Banana API works, it has to take a
public URL of an image in order to
actually change it. So what we do is we
use a free service called ImgBB and
I'm basically just uploading these
images as binary and then it gives us
back a public URL that represents the
image. So here you can see this is what
we get back and if I open that up, it is
the picture of Hormozi.
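Outside of n8n, the same upload step could look roughly like the sketch below. It assumes ImgBB's upload endpoint takes the API key as a query parameter and a base64-encoded image in the form body, and that the public link comes back under data.url; check those details against the ImgBB documentation before relying on them. The helper names are made up for illustration.

```python
import base64
import json
from urllib import parse, request

# Assumed ImgBB upload endpoint; verify against the ImgBB API docs.
IMGBB_UPLOAD_URL = "https://api.imgbb.com/1/upload"


def build_imgbb_request(image_bytes: bytes, api_key: str) -> request.Request:
    """Build (but do not send) a POST request that uploads one binary image."""
    query = parse.urlencode({"key": api_key})
    form = parse.urlencode(
        {"image": base64.b64encode(image_bytes).decode("ascii")}
    ).encode("ascii")
    return request.Request(f"{IMGBB_UPLOAD_URL}?{query}", data=form, method="POST")


def public_url_from_response(body: str) -> str:
    """Pull the public image URL out of the JSON response (assumed shape)."""
    return json.loads(body)["data"]["url"]


def upload_image(image_bytes: bytes, api_key: str) -> str:
    """Send the upload and return the public URL."""
    with request.urlopen(build_imgbb_request(image_bytes, api_key)) as response:
        return public_url_from_response(response.read().decode("utf-8"))
```

Calling upload_image once per downloaded Drive file would give the two public URLs that the image model needs.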
So that's just a cool little workaround
if you need to turn a binary image into
a public URL. There's other ways to do
it, but this is just a free easy way
that I do it. So once we have both those
URLs back, we aggregate them so that we
can make one API request rather than
two. So we're making one request to Nano
Banana through a service called fal.ai.
So I'm not going to dive super super
deep into what's going on here. If you
want to watch an API video I made, I'll
tag that right up here. But what's going
on is we're using our fal credentials.
So I got my API key from fal.ai. And
then this JSON body request is really
simple. We're passing over a prompt and
we're passing over two image URLs. And
the only reason why this prompt looks a
little confusing is because I put these
two replace functions in there that
basically make sure if the prompt has
new lines or double quotation marks
that it gets rid of that because that
would break the JSON body request. So
now it has everything it needs. It has
both the images and it has the prompt.
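Here is a minimal sketch of that cleanup step, assuming the body carries a prompt plus a list of image URLs. The key names prompt and image_urls are assumptions and should be checked against the fal.ai API documentation for the model. Replacing new lines with spaces (rather than deleting them outright) is also a choice made here so that words don't run together.

```python
import json


def sanitize_prompt(prompt: str) -> str:
    """Mirror the two replace calls: drop new lines and double quotes."""
    cleaned = prompt.replace("\r", " ").replace("\n", " ").replace('"', "")
    # Collapse any runs of whitespace left behind by the replacements.
    return " ".join(cleaned.split())


def build_request_body(prompt: str, image_urls: list[str]) -> str:
    """Return the JSON body as a string; key names are assumed, not confirmed."""
    return json.dumps(
        {"prompt": sanitize_prompt(prompt), "image_urls": list(image_urls)}
    )


body = build_request_body(
    'A man\nholding "granola" on a mountain',
    ["https://example.com/one.png", "https://example.com/two.png"],
)
```

In Python, json.dumps would escape quotes and new lines on its own, so the manual cleanup matters mainly when the body is written as a text template with the prompt pasted in, which is the situation in the workflow.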
And then it basically says, "Okay, I
received your request. We're working on
that right now." And because it's
working on it, we wait for about 10
seconds and then we make a request back
to fal to see if it's done. And if it's
not done, it will come here and it will
wait for 30 seconds. And honestly, this
should probably change to like four
because images are really quick. And
then it will just continuously check
until it's done. So that's just another
cool guardrail you can have in your
workflows. And then once it gets that
result back, it basically gives us a
URL. And so what I do is I make a simple
GET request to that URL to actually get
the image itself. And now we have the
image as binary. We can upload it to
Google Drive. And then we can set our
response to the main agent which is
basically saying, "Hey, the image was
created and this is what we named it and
here's the link to that image in your
Google Drive." And then the main agent
gets that and responds to us, the user.
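The wait-and-check loop in the middle of that workflow can be sketched as a small polling function. The check_status callable stands in for the status request to fal.ai and is hypothetical; the max_attempts cap is an addition here, since the workflow as described simply keeps checking until the image is done.

```python
import time
from typing import Callable, Optional


def wait_for_result(
    check_status: Callable[[], Optional[str]],
    initial_delay: float = 10.0,
    retry_delay: float = 4.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until check_status returns a result URL, then return that URL.

    check_status should return None while the image is still being generated.
    """
    sleep(initial_delay)  # give the job a head start before the first check
    for _ in range(max_attempts):
        result_url = check_status()
        if result_url is not None:
            return result_url
        sleep(retry_delay)  # not ready yet: wait, then check again
    raise TimeoutError(f"no result after {max_attempts} status checks")
```

The sleep function is passed in as a parameter so the loop can be exercised without actually waiting, and the retry delay defaults to the shorter four-second wait suggested in the video.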
So hopefully that wasn't too confusing
and at a high level you can understand
what's going on. Remember this will be a
free workflow you guys can download in
my free Skool community as well. So the
best way to really understand it is to
download all the assets, run it, and
then go node by node and understand
what's going on. And real quick, the way
you would actually get all this is you
join my free Skool community. The link
for that is down in the description. When
you get there, this is what it will look
like. And all you need to do is go to
YouTube resources and find the post
associated with this video. If you're
having trouble finding it, you can also
use the search bar to search for the
title of the video. But then once you
get in there, you'll see the video and
then you'll also see all of these JSON
files that you can download and then
import directly into your n8n. And then
when you get all the stuff set up, there
will be a big sticky note right here
called a setup guide and it will tell
you exactly what you need to do to get
up and running. Okay, cool. So that was
the combine images node where we used
Nano Banana. Now what's cool is the edit
image node is very very similar. So if I
open this up and I actually just click
on view subexecution, we can see exactly
what happened when it was called. And
it's very similar to the previous one.
Literally the only difference is that
instead of passing in two image IDs,
we're only passing in one. So what
happens once again is the input is image
title, image prompt, and image ID. So
here's the run from the granola Eiffel
Tower ad. As you can see, we're going to
go to Google Drive once again, pass in
that image ID so we can download the
file. Then what comes next is we are
once again sending that binary data to
ImgBB to get a public URL. So right
here you can see this is the public URL
we get for our granola picture.
And then now that we have that we can
make another request to fal.ai to use
Nano Banana where we send over the prompt
and then just one image URL instead of
two. So it's very very similar. We're
using the same guardrails in here to get
rid of new lines and new spaces and
quotation marks, all that kind of stuff.
And then it basically just edits that
image. We're doing the exact same thing
here with a polling check. And then when
we get the result, we download it as
binary rather than keeping it as a URL,
put it in our Google Drive, and then we
send a response back to the main agent.
Real quick, just wanted to touch on some
of the pricing for this system. The
first thing that I wanted to cover was
fal.ai.
fal is basically a place where you
can have a ton of different image and
video generation models and you can just
use one credential and, you know, get
all of them through there. So, it's
really cool. You can see right here my
recently used are Nano Banana Edit.
We've got Veo 3 Fast, Veo 3. You can go to
explore and you can see all of these
other models that they have available.
But anyways, the reason why I wanted to
show you guys this is because first of
all, it's only about 4 cents per image,
which is not too bad at all. But what
you can do is you can play around with
like image URLs and prompts here. So you
can really refine the way you want to
have your prompting before you get into
n8n and start messing around there. And
then the way that we can call it through
n8n is by going to their API
documentation. I would change this to
HTTP curl. And now you can look through
like how to get your API key, how to set
that up, how to submit a request, how to
upload files, all this kind of stuff.
And I also did see about a week ago that
you could get free image generation
through OpenRouter for Gemini Nano
Banana because if you go to the Gemini
web app, it's free to try out there. And
so for a while it was free here on
OpenRouter, but it looks like they might
have just taken that off. So
unfortunate, but once again, you can get
25 images for only a dollar. So it's not
too bad. Okay, so that's basically all
that's going on in this workflow. Now,
let's talk about a few ways that when
you get this template that you could
customize this and make it a little more
production ready. So, something that I
alluded to earlier: in these two
tools where you're doing image
generation, you could have a dedicated
AI agent that writes
the AI image generation prompts. So
what you could do is literally
right here just add an AI agent node
that's prompted to
specialize in creating AI image
generation prompts, and then you pass that
prompt all the way down to the create
image node. Rather than relying on this
main agent, which has tons of other jobs to
do, to also write the image
prompt, it would be better to have a
second agent in this workflow that would
specialize in that. There's of course
other things you can do like having a
logger. So in a previous agent I've made
that was kind of similar. This one was
an ultimate media team. What I did here
was I had it returning all of its steps
into a Google sheet, whether it errored
or it was successful. So you could see
exactly what input was processed, what
tools were called, how many tokens that
was taking you. So you could use this
video, which I will tag right up here if
you haven't seen it. You can download
those resources and do the exact same
thing in this agent where you can have a
Google sheet that's logging everything.
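One row of such a log might be shaped like the sketch below. The column order and names are assumptions, and actually appending the row to Google Sheets is left out; in n8n that part would be handled by a Google Sheets node.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RunLog:
    """One agent run, flattened into a single spreadsheet row."""

    user_input: str
    tools_called: list[str]
    total_tokens: int
    succeeded: bool
    error: str = ""

    def to_row(self, now: Optional[datetime] = None) -> list[str]:
        # Columns (assumed): timestamp, input, tools, tokens, status, error.
        timestamp = (now or datetime.now(timezone.utc)).isoformat()
        return [
            timestamp,
            self.user_input,
            ", ".join(self.tools_called),
            str(self.total_tokens),
            "success" if self.succeeded else "error",
            self.error,
        ]


# Hypothetical run; the token count is a made-up illustrative number.
row = RunLog(
    user_input="Combine the Nate and granola pictures",
    tools_called=["search raw files", "combine images"],
    total_tokens=1234,
    succeeded=True,
).to_row()
```

Logging failed runs in the same shape as successful ones keeps the sheet easy to filter by status.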
And then of course another really cool
next step would be being able to take
these AI generated images and pass that
to another workflow that can create
videos out of them and you could use
like Veo 3 Fast to create those videos for
you. That's another thing that I did in
this media agent where I have like an
image-to-video tool, a create video tool.
So definitely check out that video as
well if you want to literally use some
of these workflows and then just hook
them up to your Photoshop agent,
because that's the beauty of these
custom workflows as a tool: you can have
them hooked up to as many different
agents as you want. So once again, all
the resources for this video and every
single other YouTube video you've seen
on my channel, you can get
completely free in my free Skool
community. You just have to come here,
look through the YouTube resources, and
as you can see, every single video has
all of the resources right here. And if
you're looking to take your skills a
little further with n8n and you're
also looking to understand how you can
monetize your AI automation knowledge,
then definitely check out my plus
community. The link for that will also
be down in the description. We've got
thousands of members in here who are
building and selling n8n services every
single day and they're always sharing
their learnings and their challenges.
It's a really cool space to be. We've
also got two full courses at the moment
with a third one coming about monetizing
your AI skills. But if you're a complete
beginner, you can start with the
foundations and then you get in here and
you master n8n and then you learn how to
start selling or consulting with this
knowledge. So, I'd love to see you guys
in these communities. But that's going
to do it for the video. If you enjoyed
or you learned something new, please
give it a like. It definitely helps me
out a ton. And as always, I appreciate
you guys making it to the end of the
video. I'll see you on the next one.
Thanks everyone.
Full courses + unlimited support: https://www.skool.com/ai-automation-society-plus/about
All my FREE resources: https://www.skool.com/ai-automation-society/about
Have us build agents for you: https://truehorizon.ai/
14 day FREE n8n trial: https://n8n.partnerlinks.io/22crlu8afq5r

In this video, I show you how I built a no-code Photoshop AI agent inside n8n that can combine and edit images, search through all your files for raw content or generated images, and even rename them automatically. Powered by Google’s new state-of-the-art Nano Banana image model, this agent is a game-changer for ad creatives and UGC content, making high-quality image editing and management faster and easier than ever. As always, you can get all the free resources you need to copy this exact system inside my Free Skool Community.

Sponsorship Inquiries: 📧 sponsorships@nateherk.com

TIMESTAMPS
00:00 Agent Overview
00:49 Live Demo
05:02 AI Agent Configuration
06:39 File Handling Tools
07:39 AI Image Tools
13:31 Image Gen Pricing
14:46 Improving This Template
16:28 Get the FREE Resources
16:44 Want to Master/Monetize n8n?

Gear I Used:
Camera: Razer Kiyo Pro
Microphone: Blue Yeti USB