So recently Anthropic released this article called "Effective Harnesses for Long-Running Agents," and it basically walks you through how you can take Claude Code and have it run for a very long time and build out an entire application. The main shortcoming they identify in the article is that long-running agents that continuously iterate on your project end up causing a lot of issues. They break existing functionality as the project gets larger, they lose context about what's already been implemented and what they need to implement next, and often it's just very problematic.
And so they put a lot of time into trying to understand a better way to make a long-running agent build out a full-stack application from a high-level prompt. For example: build me an application where I can use an AI LLM to generate images. A prompt as simple as that, and then it builds out an entire application with hundreds of features. Now, reading this, I was kind of skeptical, because I have been coding with LLMs for a while and I know the shortcomings as well. It's not perfect; you do have to babysit a lot. But they do have a repo, so I decided to pull it in, clone it, get Python set up (they walk you through that process), and then I ran it for 24 hours on my own computer just to verify: is this article actually onto something, or is it a bunch of baloney? So let me demo what this managed to build out in 24 hours.
Now, I basically started with a single prompt: I want a dashboard for modifying a bunch of images and using AI to generate stuff. I can't remember exactly what my prompt was, but that's basically it. So, let me log in
and kind of demo some of the key
features. It would take too long to demo everything because there are a lot of features, but here is the overview, and
we can go to the canvas to start
generating new images, right? So, I'm
going to go over here and I'm going to
say a giant dog walking through a city
and we can select whatever aspect ratio we want; maybe I'll make it vertical. You can choose the model; I'll do Flux Pro because it's pretty nice. Then you can go over here and change how many steps to use. For these modifiers, I'll say cinematic.
I'll say depth of field. How about that?
And we'll just go ahead and generate
this image and see what happens. And
again, you can see down here, here are
all the other images that I've been
generating. And you can click on them.
You can go to a preview. You can zoom
in. You can go over here. You can
actually like create a variation of it.
You can remix it with a new prompt. You
can upscale. You can download it, which
brings it to your computer. Let me
create a variation of this blue chicken
as well. And you'll see that start
getting kicked off in my queue. Let's go back to the giant dog walking through the city. Okay. So, with
this, what I can do is go back, go to image-to-video, and drag this in. I'm
going to make it short so it doesn't cost me a bunch of money: a dog walking towards the camera, and I'm going to use Wan 2.5 for the model. We'll keep it vertical as well, because I think that's what the original was. It looks like I failed to do that, but I'll show you that I took this chicken and did image-to-video with it.
Okay. And then if this worked, we would
have had like an animated video of this
dog walking. I don't know why it's
broken. Again, when an agent runs for 24 hours on your computer, it's still going to have bugs, but it got a lot of
stuff done. I can go to my gallery. I
can view all my images. I can go in. I
can filter through them. I can click
into them like I showed you. I can go to
a collection and I can add various
things to my collection. So, let's just
add that wolf or that dog back to this
collection, which gets added down here.
I can set it as my cover. I'll go back
to collections. It shows up. I can go to
my projects. I can add images to this. I
can then share this project with other
people. So if people want to download these images, I can also share the individual images themselves: I'll create a share link, and later I can go back and delete that share link so people can't download it anymore. There's also the ability to mark things as trash, which gets cleaned up periodically after 30 days, or I can just empty the trash bin manually. There's also this batch tool.
I haven't dived into this yet; I don't even know what its purpose is, but the agent just figured out all these features and started implementing them. I can view these different models and some of the images they created. I can click "use this model" and that'll take me... well, that's a bug; it should take me to the canvas page. We've got a settings page with all these different user preferences, and we have a credits page with usage, where I can go and buy more credits. So
that's a quick overview of some of the
features. Again, it's not 100% perfect.
There are bugs, and you do have to comb through and fix them manually. But the most
amazing thing about this is I literally
kicked this thing off and then I went to
bed and came back and it had like 30
more features added. Then, because it sometimes crashes, I kicked it off again and went out shopping for a couple of hours; I came back and checked it and it had like 10 more features added. So I'm going to
show you how I got this little repo set
up locally to build out basically this.
And I do have a deeper dive coming on my course, which, by the way, is now live. If you want to check it out at agenticjumpstart.com, I have a bunch of videos talking about prompt engineering, context engineering, and agentic coding, with almost 11 hours of content right now. I'm still adding more as AI progresses; I'm going to keep adding videos and modules. So I think it's very high value if you want to learn more about Opus 4.5, GPT-5.1 Codex, Cursor, etc. Here are
some of the actual videos I have: spec-driven development, MCPs, the agentic mindset, and then I walk you through how to set up Cursor and how to set up Claude Code. The most important part, in my opinion, is that I show you my actual workflow for building out a full-stack web application, using TanStack Start with shadcn, Drizzle for the ORM, and Postgres, and I build out an entire application using the exact workflow I use when I do agentic coding. And I think that's the most
valuable part. And then I also just
throw in some bonus videos as well. And
then I have this autonomous coding section that I'm going to start building out, because I do think autonomous coding is going to be much more powerful soon. So yeah, go check out my course at agenticjumpstart.com. We've got a free community with a lot of people, and I do plan to keep building it out. So let's
check out this repo. I went and cloned it, and if I go to my code over here, you can see I have it cloned already. Once you clone it, there is an autonomous-coding folder you can go into, and you can load up the README if you want; it walks you through how to get this set up. You have to have Anthropic's Claude Code CLI installed, the requirements file installed, and Python and pip set up. Once you have all that, you can export an Anthropic API key. There's also a way to set it up with your Claude Code OAuth token, which is what I actually did in this project, so I can walk you through how I did that.
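To make the two auth options concrete, here's a minimal sketch of what gets exported. The env var names are the ones I used; the token values are placeholders, and I believe the OAuth token comes from running claude setup-token, but double-check the README:

```python
# Minimal sketch of the two auth options; set one of these, not both.
import os

# Option 1: a raw Anthropic API key, billed per token.
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."  # placeholder

# Option 2: a Claude Code OAuth token, so usage comes out of your
# Claude subscription instead (I believe `claude setup-token` generates one).
os.environ["CLAUDE_CODE_OAUTH_TOKEN"] = "sk-ant-oat-..."  # placeholder
```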
So once you have it all set up, you can run a command to kick it off. But before you do that, you can go into the prompts folder, where there's an app spec you can modify. This is where I defined the app I wanted
built out. I said: an AI image design studio, a creative AI image generation platform. If you scroll down, you can see it lists out a high-level design and the goals of what we're using. In this case it says Next.js 16, Tailwind CSS, TanStack Query, forms with React Hook Form, shadcn for the component library; the backend is Postgres with Drizzle ORM, fal.ai for the actual image generation and all the models, S3-compatible file storage, Better Auth for authentication, and deployment with Docker Compose. Basically, it gives a good guideline of everything the LLM is going to need; Claude Code reads this every single time to verify it's on the right path to build out the project, kind of like what I showed over here. And
then you have some core features that document the landing page, authentication, dashboard, image generation, and so on. Look how long this file is; this is a huge file. Now, I didn't type this out
by hand. If you clone this repo, they
have an app spec that kind of describes
how to build a Claude Code clone. All I did was go to Gemini 3, which is a really good model for longer writing and documentation, and say, "Hey, refactor this entire thing to be an AI image generation studio application." That's literally all I did. I clicked submit, and it refactored the entire app spec, all 10,000 lines of it, to be geared towards an AI image generation studio application. And this is the core
thing that it builds everything off of.
So keep that in mind. I'm going to show
you now actually how you can run this.
So let's go down to one of these
terminals. I'm going to go ahead and go
to this one. The way I have this set up on my machine is with a venv. If you're not familiar with Python, that's basically a virtual environment. So you can go into the autonomous-coding folder and just say source venv/bin/activate (you might have to create one first; if you know Python, it's pretty easy). Now my shell is set to that venv. I also had to export the Claude Code OAuth token, and I'll show you where in the codebase I had to make changes to get that working. With those two things set up, all you have to do is run the autonomous agent. I'm going to give it a max iterations of one, and a project directory of awesome-image-gen.
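For reference, the kickoff looks roughly like this. The script and flag names are from memory of my run, so treat them as assumptions and check the repo's README:

```python
# Rough sketch of kicking off a single harness iteration (names assumed).
import subprocess

subprocess.run(
    [
        "python", "autonomous_agent_demo.py",  # harness entry point (assumed name)
        "--max-iterations", "1",               # run one agent session, then stop
        "--project-dir", "awesome-image-gen",  # where the generated app will live
    ],
    check=True,  # fail loudly if the harness exits with an error
)
```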
Okay, so when you kick this thing off, it runs the Python script, which starts with the initializer, the first agent. They talk about this in the documentation, and if you go back to the blog post, they cover the initializer agent there too. And that is
basically this prompt here. It tells the agent to read through the app spec and create a giant JSON file of really simple features, each with a couple of steps for verifying that the feature works and a boolean to track whether it's been implemented. And down here it says: read through the app spec and create a minimum of 200 features.
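Each entry in that JSON ends up looking something like the sketch below. The field names are my approximation, not the repo's exact schema:

```python
# One illustrative feature entry (schema approximated, not copied from the repo).
feature = {
    "id": 42,
    "priority": 1,  # assuming 1 = highest priority
    "description": "User can download a generated image from the preview modal",
    "steps": [
        "Log in and open any generated image in the preview modal",
        "Click the download button",
        "Confirm the browser downloads an image file",
    ],
    "passes": False,  # the boolean the agent flips once verification succeeds
}
```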
So all the features you saw in my AI image design studio: I didn't have to write them out. It did it for me. It figured out all the best features, and I just came back, checked it, and refactored the UI a little to make it look nicer for this demo. But that's basically the output you get. Now
you can read through this and modify it. That's the cool thing: you can modify the initializer prompt and make it as custom as you want, though overall I probably wouldn't change this one. When it's done, we can go to the generations folder, and you can see we have an awesome-image-gen folder being created. Soon this thing will be done; the initializer script takes around 5 minutes to finish because it has to generate a giant JSON file with a ton of information. When it's done, I can show you. It's then going to start going through all of the features
one by one. So let me go to the feature list for the image design studio; this is something I've already been working on. You can see it goes through and starts setting passes to true for tons and tons of features. Let me show you how many we have done now: 129 are actually passing.
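If you want to check progress yourself, counting those booleans takes a few lines of Python. This sketch assumes the harness writes a flat JSON list to a feature_list.json inside the generation folder, which matches what I saw on my machine:

```python
# Count how many features have been implemented and verified so far.
import json

with open("generations/awesome-image-gen/feature_list.json") as f:
    features = json.load(f)  # assumed: a flat list of feature dicts

done = sum(1 for feat in features if feat.get("passes"))
print(f"{done}/{len(features)} features passing")
```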
So the harness then picks the next feature, trying to find the one with the highest priority. When it finds the highest-priority feature, it implements it using Claude Code and then tries to verify it with Puppeteer. This is the
interesting part, and I think the key takeaway of these autonomous agents that run forever: you have to have a way for the agent to verify what it's doing. The way this Python script works is it loads up a Puppeteer agent that goes to your actual web application in a browser, where it can take screenshots and click around. So it runs through these steps: every time it implements a new feature, it goes back and tries to verify that the feature it just added is correct. And until it can pass those steps, it will not continue on to the next feature. When it's first starting off, you'll see it load up Puppeteer, click through the steps, mark the feature as done, and go to the next one. Now,
as the project gets larger and you get to 50 or 100 features, it does start to slow down a lot, because it first picks a couple of existing features at random, tests them, and verifies they still work; then it adds your new feature, runs the test for that new feature, and moves on to the next one.
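Put together, each iteration's control flow looks roughly like the sketch below. This is my paraphrase of what the harness does, with agent and browser as stand-ins for the Claude Code and Puppeteer pieces; it is not the repo's actual code:

```python
import random

def run_iteration(features, agent, browser):
    """One harness iteration, paraphrased; `agent` and `browser` are stand-ins."""
    # Regression-check a couple of already-passing features at random.
    passing = [f for f in features if f["passes"]]
    for feat in random.sample(passing, k=min(2, len(passing))):
        if not browser.verify(feat["steps"]):
            agent.fix(feat)  # regression found: fix before moving on

    # Pick the highest-priority feature that hasn't passed yet.
    todo = [f for f in features if not f["passes"]]
    if not todo:
        return
    feat = min(todo, key=lambda f: f["priority"])  # assuming 1 = highest

    agent.implement(feat)  # Claude Code writes the code
    while not browser.verify(feat["steps"]):
        agent.fix(feat)  # keep iterating until the browser check passes

    feat["passes"] = True  # only now does it move to the next feature
```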
And as you can tell, loading up Puppeteer, taking a screenshot, and sending that screenshot to Claude Code is a very slow process, and when a test finds a bug, it then has to go try to fix it and spends a bunch of time doing that. But overall, this is the kind of output you're going to get. And I just basically let this
code all night when I was sleeping. I
let it code when I went out shopping and
I didn't have to do much. And I could see the power of hooking this up to existing projects: have a list of features you want added, let it cook, and come back to a fully working prototype that you can then pass off to a real, experienced developer using Claude Code to start polishing it all, fixing little bugs here and there, and taking it from a version-zero iteration all the way to production.
Now, in order to get this running, I did have to modify the codebase a little to support this new token, because I didn't want to use an Anthropic API key; that costs a lot of money. I have my $100-a-month Claude Code subscription and I wanted to use that. So in this file I added a has-OAuth-token flag and checked for both options, and in the next file I have a check that verifies at least one of them is set. I think those are the two main things I changed.
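The change amounted to something like this; a minimal sketch of my edit, not the repo's exact code:

```python
# Accept either auth mechanism before starting the agent (roughly my edit).
import os

has_api_key = bool(os.environ.get("ANTHROPIC_API_KEY"))
has_oauth_token = bool(os.environ.get("CLAUDE_CODE_OAUTH_TOKEN"))

if not (has_api_key or has_oauth_token):
    raise SystemExit(
        "Set ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN before running the agent."
    )
```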
There might be some other places, but honestly, if you can't get it working: it was funny, I was working with someone on Discord trying to get this running. He just dropped it into the Warp terminal, I think, and said, "Hey, can you get this working with my Claude Code OAuth token?" And in one shot it went through the code, fixed it, and we were able to get it running. So just remember, you have these tools to get you past any bugs you run into. But overall, I mean, this thing is
pretty cool. Okay, so it looks like it
just wrote my features list. So I can go
ahead and open that one up and you'll
see we have a giant feature list inside that new directory, awesome-image-gen, and everything is set to passes: false right now. Eventually it's going to start going through and building out this application from the ground up. The initial couple of features will take a while, because it has to set up a Next.js application project, bring in Drizzle, and do all the npm installations and whatnot.
But at some point, after a couple of hours, you'll start seeing a fully working application, and it honestly just works. You don't even have to do any babysitting; you just let it cook. You come back and you'll see all these
cool things that it added in. Now, the very last step, once it gets past this (I don't know if I'll wait around for it to finish), is that there's another prompt it uses for actually coding it all out. If I go back to the repo over here and go to the prompts folder, there's one called the coding prompt.
Now, the coding prompt basically forces the Claude Code instance to really understand what it's doing. The first step is "get your bearings (mandatory)": it runs git diffs and git logs, reads the app spec and your feature list, figures out which features are implemented and which are not, and adds all of that to its context window. Then it starts your project if it's not already running, and it tests one or two random features: it just runs the tests, stepping through with Puppeteer to make sure they still work, and if everything's good, it goes and starts implementing your new feature. Actually, step four is where it finds the highest-priority feature.
Okay, so usually when you're building out an application, you do have a priority chain: what should come first, what should come second, which one depends on something else, and it figures that out in step four. In step five, it actually implements the feature and then verifies it works with those steps: verify with browser automation; you must verify the feature through the actual UI. It has some guidelines that help it test properly, and then it updates your feature_list.json, commits your progress, updates your notes, and ends your session so that the next session starts with a completely clean context window. It can start fresh.
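That last step is the design choice that makes this work: every session is a brand-new Claude Code invocation with an empty context window. Here's a sketch of what that amounts to, using Claude Code's non-interactive print mode; the repo's exact invocation may differ:

```python
# Spawn one fresh Claude Code session per feature, so no stale context
# carries over between sessions (sketch; the repo's invocation may differ).
import subprocess

def run_session(coding_prompt: str, project_dir: str) -> None:
    subprocess.run(
        ["claude", "-p", coding_prompt],  # -p: run one prompt non-interactively
        cwd=project_dir,
        check=True,
    )
```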
Now, this should show you that I'm not BSing you. I have the image design studio here with 134 commits and a ton of code already. I do have some commits from me where I cleaned up the UI and made it look a little nicer, but if I go back to my first commits, you'll see literally all 130 of these coming from the agent, just adding feature after feature: task 49, task 3840, task 3942. It just runs through that app spec file and your feature list and starts adding stuff. And here's the initial one where I basically set up the feature list. So
I'm kind of blown away by this. I mean, I've always thought that agentic coding was the future, but fully autonomous agents? I never thought that was going to be realistic, because when the Devin demo came out, I thought, "Ah, this is stupid." But now I think these LLMs are so good that we're going to see a lot more projects being 100% created by autonomous agents running non-stop. Or even agents that run non-stop to check for security vulnerabilities, performance issues, and documentation fixes. You could kick off many of these things, constantly running and checking your codebase, and I think that's going to be the future, and I think you guys want to be part of it. So if you do want to be part of that future, go check out agenticjumpstart.com, my course, where I walk you through all the stuff I've learned along the way
with agentic coding. More specifically, I have a new section I'm working on called autonomous coding, where I'm going to do an in-depth overview of this repo, and I'm starting to build out some other tools locally so that I can have autonomous agents running on multiple projects on my computer at the same time. But other than that, hope you guys enjoy this video. Go check it out. Do not sleep on this. I think this is a really amazing thing.
Buy the course now: https://agenticjumpstart.com
Join the Agentic Jumpstart community: https://discord.gg/JUDWZDN3VT
Article: https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents
Anthropic's repo: https://github.com/anthropics/claude-quickstarts/tree/main/autonomous-coding
---------
Have a video suggestion? Post it here: https://suggestions.webdevcody.com/
My Game: https://survivethenightgame.com/
My Courses:
https://agenticjumpstart.com
https://beginner-react-challenges.webdevcody.com
Useful Links:
Discord: https://discord.gg/N2uEyp7Rfu
Newsletter: https://newsletter.webdevcody.com/
GitHub: https://github.com/webdevcody
Twitch: https://www.twitch.tv/webdevcody
Website: https://webdevcody.com