In this video, I'm going to be showing
you how to set up self-improving skills
within Claude Code. Now, one of the
issues with LLMs right now is they don't
actually learn from us. Just to run
through an example of this, let's say
you're working on a web application.
There might be a mistake that the coding
harness or the model that you're using
makes within the first iteration of what
it's trying to do. Let's say you want to
add a new feature and then it has a
button as a part of that feature. Just a
simple but relatively common mistake
could be that an LLM doesn't actually
know the particular button that you
might want to leverage. Generally speaking, you can often tell from certain inputs and buttons that they were generated by an LLM. Now, you might
correct that mistake and say, "Okay, I
actually want you to reference this
button." But the issue with this is when
you actually correct it within that
session, when you pick up in a second
session, it's going to make that same
mistake again. And you're going to have
to correct it or remember to actually
specify to reference that particular
button. Same thing for the next session.
And this loop will continue. Every
conversation effectively starts from
zero. And the thing with this problem is
it touches every single model that is
out there as well as every single coding
harness. Not having a good effective
memory mechanism within the harness in
my opinion can definitely lead to a lot
of different frustrations. Now this
frustration can come up in a number of
different ways. It might not follow naming conventions, use the proper logging convention, or validate inputs the way you did within other components. You've probably had that experience where you're just thinking, I just told you this yesterday, or I told you this last week. The issue is that there's no memory: your preferences aren't persisted, and without some form of memory you're going to be repeating yourself forever. The
solution to this is relatively simple.
We can actually set up a reflect skill to
analyze the session, extract corrections
and update the skill file. One thing that I've been playing around with for the global skills I use across my machine is having all of those different skills versioned on GitHub as I have Claude reflect on and iterate on them. I can see all of those different memories over time, and if there are regressions and I want to roll something back, having it all under version control in Git makes that easy.
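As a rough sketch of that setup, assuming personal skills live under ~/.claude/skills (adjust the path to wherever your global skills are stored), the initial versioning could look like this:

```bash
# Put the global skills directory under version control (the path is an assumption;
# adjust it to wherever your personal skills live).
cd ~/.claude/skills
git init
git add .
git commit -m "Baseline skills before enabling reflection"
# Optionally back the history up to a private GitHub repo.
git branch -M main
git remote add origin git@github.com:<your-user>/claude-skills.git
git push -u origin main
```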
Now, the way that I've set this up is relatively simple, and there are a few different mechanisms to it. I have the ability to turn reflect on, turn reflect off, and check reflect status. There are two different ways that we can do this: a manual way and an automatic way. First, let's touch on the
manual flow. There's a skill called
reflect, and then there's a slash
command. As you go through a conversation, if there's something that you want it to remember, you can simply call that slash command; it will have the context of the conversation, reference the particular skills, and update those accordingly. And the
nice thing with the manual update is
you're going to have a lot more control
in terms of what is actually being
updated within the skill file. Just to
go through a hypothetical example: you might leverage a skill, and it might say, here's my review of the auth module. You might realize, oh, it's actually not looking for SQL injections. We could go and specify always check for SQL injections, and from there Claude will go, in the current session, and check for SQL injections, similar to the button example that I had. Then, ideally, it will come back and show you that it's done. And the really nice thing with this
is that corrections are all signals that could be good memories, and approvals are further confirmations; the reflect command and skill will extract both of these. After that process, all that we need to do is actually run the reflect command. There are two different ways to do this: we can run the reflect command on its own, or we can explicitly pass in the skill name as well. But if you just pass in reflect, it will have the contextual awareness, since it is within that thread, to know when that skill was actually invoked.
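The exact contents of the command aren't shown here, but as a hedged sketch, a reflect slash command is just a markdown file such as .claude/commands/reflect.md, where $ARGUMENTS is the placeholder Claude Code fills with anything typed after the command; the wording below is hypothetical:

```markdown
---
description: Extract corrections from this session and fold them into the relevant skill
argument-hint: [skill-name]
---

Review this conversation for corrections I made, approvals I gave, and patterns that worked well.
If a skill name was provided in $ARGUMENTS, update that skill; otherwise infer which skill was in use.
Group the proposed changes by confidence (high, medium, low) and wait for my approval before
editing the SKILL.md file and committing the update to Git.
```

Calling /reflect on its own relies on the conversation context, while something like /reflect frontend-conventions (a hypothetical skill name) targets a specific skill.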
Effectively, Claude will analyze and
scan the conversation for corrections.
It will identify success patterns, propose skill updates, and the way that this is
set up is it will give you a breakdown
of different confidence levels. There
will be high, medium, as well as low. If
I say never do X, like never come up
with a button style on your own within
this project, you can go ahead and
specify something like that. Medium are
going to be patterns that worked well.
And low are going to be observations to
review later. And all of this works just through this skill file. You're going to be able to edit it and tweak it; if you want version control, you can go ahead and add in a Git integration, and if you don't want to leverage that, you can just remove it. I'll link all of this within the description of the video.
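Just as an illustration, a skill file that has picked up a few learnings might end up looking something like this (the skill name and entries are hypothetical; the frontmatter follows the usual SKILL.md convention):

```markdown
---
name: frontend-conventions
description: UI conventions for this project, learned from past sessions
---

## Always (high confidence)
- Never invent button styles; reference the shared button component instead.
- Always check new queries for SQL injection.

## Patterns that worked (medium confidence)
- Validate inputs the same way the existing form components do.

## To review later (low confidence)
- Logging calls might be worth consolidating behind one helper.
```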
Before it actually updates the respective skill, this is what the review and approval process looks like.
We have the signals that were detected.
We have the proposed changes. And then
we have the commit message that it's
going to add if we go and accept those.
Additionally, what we can do within here is make changes, and we can make them with natural language. That's one of the really nice things with this in terms of actually applying these changes to our skills directory as well as pushing them to Git. We can either press Y, or we can type out in natural language the different changes that we want within Claude Code. And then once you've either made those changes or you've accepted what Claude has proposed, it's going to edit the particular skill, commit that within Git, and then push that up. And one thing that I did want to have within at least my setup is, for all of those different changes that it makes within the skill, to make sure I'm actually versioning all of them as well. Next up, you
can actually take the same flow and you
can automate it. You can have hooks
trigger reflections automatically. Now,
if you haven't used hooks before, effectively what they are is commands that run on different events. Now, there is a stop hook, and this is something that I covered in an earlier video on the Ralph Wiggum loop, where, to have Claude persist and run automatically, you can bind a shell script that has Claude continue whenever that stop hook is run.
But it can also be perfect for end-of-session analysis, just like this. Now, the syntax is broken within this on-screen example, but effectively what it's going to do is, on the stop hook, trigger that shell script to reflect. If you are going to be running this automatically, you do want a lot of confidence in that reflect mechanism and what it's actually doing, but it will go through the same process just like before.
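As a minimal sketch, assuming the reflect script lives at ~/.claude/hooks/reflect.sh, the binding in .claude/settings.json could look like this:

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "bash ~/.claude/hooks/reflect.sh"
          }
        ]
      }
    ]
  }
}
```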
And then once the session ends, the hook
is going to analyze and automatically
update all of those different learnings.
This is going to be that continual self-improving loop that you can have within Claude Code. You can very well also leverage the same strategy of continual learning within other agentic systems as well. And so, in that button example, it will go ahead and learn from the session. Then, within Claude Code, we'll see a "learn from session" message, and it will show the skill that it updated.
So it's effectively more of a silent notification, just an indication like you see on the screen here that it actually updated that particular skill. And then, in terms of the reflect shell script that gets invoked on the stop hook, we can turn it on and off: there's a mechanism for reflect on and reflect off, and this is effectively going to work the same way as the reflect pattern that we had, just automatically.
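One way to implement that toggle, sketched here as an assumption rather than the exact script, is a flag file that the hook checks before doing anything, so reflect on and reflect off simply create or remove the flag; the block decision relies on the documented Stop-hook behavior where blocking makes Claude continue with the given instruction:

```bash
#!/usr/bin/env bash
# Hypothetical reflect.sh bound to the Stop hook; file names and paths are assumptions.
FLAG="$HOME/.claude/reflect-enabled"   # "reflect on" creates this file, "reflect off" removes it

# Reflection toggled off: exit quietly and let the session end as normal.
[ -f "$FLAG" ] || exit 0

# Reflection toggled on: block the stop and hand Claude the reflection instruction,
# which Claude Code treats as "keep going and do this" for Stop hooks.
cat <<'EOF'
{"decision": "block", "reason": "Run the reflect skill: scan this session for corrections and approvals, then update the relevant skill files and commit the changes."}
EOF
```

In a real script you would likely also want to avoid re-triggering on the stop that follows the reflection itself (Claude Code passes a stop_hook_active flag to Stop hooks for this), which this sketch omits.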
The one thing that I find exciting about this is you can leverage skills for a ton of different things. This can be for code review, API design, testing, or documentation, amongst a ton of other use cases. And having
skills actually be able to learn from
your conversation, I think, can be
something that is pretty powerful. And also, by having it within skills, you don't
have to worry about embeddings and
memory and all of the complexity that
comes with typical memory systems that
we see out there. This is going to all
be within a markdown file that you can
simply read with natural language. And
now the other thing that I like about this is actually having it within Git, because you can see how the system learns over time. If you have a front-end skill, you can see all of the different things that are learned as it goes, instead of actually having to start from a blank slate every single time. But
I think the more interesting aspect of
this is you can see how those skills
evolve over time and how your system
gets smarter over time as you have
conversations with it. You're going to
be able to see all of the different
learnings for the particular skills if
you are to leverage this within Git as
well. And just to wrap up, if you aren't
as familiar with agent skills, I'll put
a couple links within the description of
the video. I'll also do some other
videos probably over the course of the
month on this type of topic as well. So
feel free to subscribe if you're
interested in this type of content.
Okay, last but not least, just to
sum up what we've touched on, there's a
couple different ways to do this. You
can do it through the auto-detect method,
you can do it through the manual method,
or you can toggle on and off and do a
little bit of both. If you do want to leverage the auto-detect method, you can try it and see how it works for a little bit. Additionally, I'd encourage you to just get familiar with the actual reflect mechanism; I'll put a link to the working copy of the one that I'm leveraging within the description of the video if you're interested. And then we also have the toggle mechanism. So, if you want to use a combination of manual as well as automatic, you just have to turn on that auto-detect mechanism so it's triggered within the hook. Okay. So, all
in all, the goal with this is to correct
once and then never again. This is a
start. I'm not saying this is definitely
the end solution, but hopefully it
inspires some ideas in terms of how you
can leverage skills, self-improvement,
as well as continual learning.
Otherwise, if you're interested in this
type of stuff, follow the channel. I'll
be covering some more ideas in and
around this type of stuff over the
coming weeks. But otherwise, if you
found this video useful, please comment,
share, and subscribe.
Setting Up Self-Improving Skills in Claude Code: Manual & Automatic Methods

In this video, you'll learn how to set up self-improving skills within Claude Code. The tutorial addresses the key problem of Large Language Models (LLMs) not learning from previous interactions, causing repeated corrections in coding tasks. The solution involves creating a reflect skill that can analyze sessions, extract corrections, and update skill files. The video outlines both manual and automatic methods to implement these skills, leveraging Git version control for iterative improvements. By the end of this tutorial, you'll be able to continuously improve your coding harness, ensuring more efficient and less redundant coding sessions. Repo and links coming shortly!

00:00 Introduction to Self-Improving Skills in Claude Code
00:03 The Problem with Current LLMs
02:11 Manual Skill Reflection
04:51 Automating Skill Reflection
06:26 Benefits and Conclusion