In this video, I'm going to go over continual learning within Claude Code. One of the big problems today is that if you write an AI agent, the process generally goes something like this: you write a system prompt, add rules and constraints, test, find edge cases, and repeat until something actually works. The issue is that every insight you gain while building has to be manually encoded into your system prompt, whether you do that with AI tools, by hand, or some combination of the two. The agent never actually learns on its own.
So what's the solution? If you've used Claude Code before, you've probably heard of skills. A lot of people were excited about these for a number of reasons: they're efficient with context, composable, portable, and discoverable. You can put them on GitHub, and anyone can download the markdown file, potentially along with some scripts, and have the skill be invoked easily. But one of the big unlocks with skills that I don't see enough people talking about is that Claude can read and write to them. That means the model can actually improve them with every session. You can set up a slash command to run a retrospective at the end of your coding session, where Claude goes through whatever happened and updates the particular skills you were using. Alternatively, you can encode this within something like your CLAUDE.md and have it happen automatically.

If you're not familiar with skills, the setup is a directory containing all of your different skills. Within each skill's directory, the one required file is a SKILL.md. Alongside it, you can include other things for the skill to progressively disclose, such as scripts, references, or other helpful assets you want to leverage. In terms of where these live in Claude Code, there are a few options: at the user level in your home directory, so they're accessible from any project; at the project level; or inside a plugin that you can share with others for easy installation.

The format for skills is straightforward. You give the skill a name and a description. The description is really important, because it's what sits in the context of the orchestrator model, the main thread, so it knows when to invoke the skill. Make sure you write a good one. Additionally, you can grant a skill particular tools, and you can reference other helpful files from within SKILL.md. You can put all of that material in the skill's directory without loading it into the context of the SKILL.md file up front; Claude can go and reach for those things progressively.
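To make the layout concrete, here's a minimal sketch of creating a skill on disk. The paths and the frontmatter fields (`name`, `description`, `allowed-tools`) follow Claude Code's documented conventions as I understand them, but treat the exact field names and the skill content as illustrative assumptions to verify against the current docs:

```shell
# Hypothetical project-level skill: web app testing.
mkdir -p .claude/skills/webapp-testing/scripts

# SKILL.md is the one required file; the frontmatter fields shown here
# are assumptions based on Claude Code's documented format.
cat > .claude/skills/webapp-testing/SKILL.md <<'EOF'
---
name: webapp-testing
description: Test web applications end to end in a browser. Use when the user asks to test, verify, or debug a web app's behavior.
allowed-tools: Bash, Read, Write
---

# Web app testing

1. Start the dev server if it is not already running.
2. Drive the browser (e.g. via Playwright) and exercise the key flows.
3. Report failures with reproduction steps.

Supporting scripts live in scripts/ and are loaded only when needed.
EOF
```

Only the name and description above would occupy the main context window; the body and anything under `scripts/` gets pulled in when the skill fires.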
The cool thing is that only the name and description use tokens in the main context window; everything below that is loaded only once the skill is actually triggered. That's how skills work under the hood, via what's called progressive disclosure: Claude loads the skill names and descriptions up front, matches a request against those descriptions, and asks for confirmation before loading the full skill. I'm not going to walk through setting up slash commands step by step here; it's really straightforward, and a little Googling, or just asking Claude Code itself, will get you there.

Setting up a learning loop is also quite straightforward. Effectively, you query your skill registry before starting work so it can surface relevant past experiments, show known failures, and provide the working configuration, and then at the end of the session you run a retrospective. Once all of that is in context, you run an update process where Claude reads through the entire conversation and extracts what worked as well as what failed. You could set it up to open a PR if the skills live in a registry, or simply have it write to the SKILL.md file or any of the other files in the skill's directory.

Another helpful thing you can do is document failures. When I was working on a project called Open Lovable, I spent an awful lot of time on the system prompt. It was basically a long list of "do this, don't do that," and me going through exactly the cycle I mentioned earlier: writing a system prompt, trying different things, finding edge cases, and repeating. One thing I don't think many people are capturing is that you can use those failures to inform what to skip, because when you start a new session, the model has no context about the things it does badly. Encoding failures somewhere isn't intuitive; it's not something we typically do in software. But because large language models are non-deterministic, having some examples of where things can go off the rails is very helpful. And this goes both ways: examples of failures as well as successes can help improve these skills over time.
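One lightweight way to capture this, sketched here against an assumed skill at `.claude/skills/webapp-testing/`, is to keep a "Known failures" section in the SKILL.md itself, so a fresh session sees past dead ends up front. The path and the failure entries are illustrative:

```shell
# Append documented failures to an existing (or new) skill file so the
# next session can skip known dead ends. Entries are hypothetical.
mkdir -p .claude/skills/webapp-testing
cat >> .claude/skills/webapp-testing/SKILL.md <<'EOF'

## Known failures
- Running browser tests before the dev server finishes booting causes
  flaky timeouts; poll a health endpoint first.
- Exact-pixel screenshot assertions broke across browser versions;
  assert on DOM state instead.
EOF
```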
Now, I want to pull up a tweet from Robert Nishihara, the CEO of Anyscale, the AI compute company. Here's what he said when skills came out: "The thing that excites me about Anthropic's Agent Skills announcement is that it provides a step towards continual learning. Rather than continuously updating model weights, agents interacting with the world can continuously add new skills. Compute spent on reasoning can serve dual purposes for generating new skills. Right now, the work that goes into reasoning is largely discarded after a task is performed. I imagine vast amounts of knowledge and skills will be stored outside of a model's weights. It seems natural to distill some of the knowledge into the model's weights over time, but that part seems less fundamental to me."

Among the key points he makes: there are many nice things about storing knowledge outside the model. It's interpretable; you can just read through the skills and correct mistakes. The skills and knowledge are plain text, so they're easy to update, and they should be highly data efficient in the same way that in-context learning is data efficient. To me, that's the heart of it: you can just read it and edit it. If something in the natural language isn't what you want, you see it and update it. That's much, much easier than retraining or post-training a model, where you don't necessarily know what's happening under the hood. With skills, you know exactly what the model is working from, because it's written in English. The key insight is that knowledge stored outside the model's weights, in skills, is something we can read, edit, and share. Not to mention, every session's reasoning can compound into future skills. With continual learning you can effectively create a flywheel, where the skill keeps getting better over time, learning from mistakes, and adapting as the environment changes and it needs to leverage different libraries or whatever the skill is actually using.
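As a sketch of the retrospective end of that flywheel: in Claude Code, custom slash commands are markdown files under `.claude/commands/` (project) or `~/.claude/commands/` (personal), so a retrospective command might look like the following. The prompt wording is my own, not an official template:

```shell
mkdir -p .claude/commands

# /retrospective -- intended to be run at the end of a session to fold
# learnings back into the skills that were used. Prompt is illustrative.
cat > .claude/commands/retrospective.md <<'EOF'
Review this entire session. For each skill that was used:
1. Extract what worked and what failed.
2. Update that skill's SKILL.md: add successes as examples and record
   failures under a "Known failures" section.
3. Summarize every edit you made.
EOF
```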
In terms of getting started, if you haven't tried skills yet, there are a number of really great examples in Anthropic's skills repo; I'll put a link in the description of the video. Skills are effectively another way to equip Claude with tools and pass in context via progressive disclosure, and, like I mentioned, you can leverage them for continual learning. You can use them for personal things: if there are tasks in your day-to-day job or things you do personally, you can create custom skills very easily. Just write them out in natural language, equip them with particular tools, and have them learn over time to do whatever you want that skill to do. The other benefit is that you can keep skills at the project level, inside the repo. That's helpful in a team setting: when you share the repo and someone else is using a system that supports skills, they inherit all of the skills that are specific to the project.
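The difference between personal and project scope comes down to where the skill directory lives. A quick sketch using Claude Code's conventional locations (the paths and the `git-release` skill are assumptions for illustration):

```shell
# Personal: available in every repo on this machine.
mkdir -p "$HOME/.claude/skills/git-release"

# Project: checked into the repo, so teammates inherit it on clone.
mkdir -p .claude/skills/git-release
printf '%s\n' '---' 'name: git-release' \
  "description: Cut a release following this repo's conventions." '---' \
  > .claude/skills/git-release/SKILL.md
```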
Additionally, you can share skills via a plugin or a registry if you want to distribute them at that level. What's neat about plugins is that they can bundle MCP servers, skills, and hooks, so you effectively get a whole config of different tools. Within Anthropic's skills repo, there are a number of great skills you can build on top of. To give you some ideas: they have a front-end design skill and a web app testing skill. As an example of how these get leveraged, if you're working on a web application and the testing skill is installed, you can just say "test my application," and it will use the relevant tools, things like Playwright or the Chrome MCP, and so on.

One more way to leverage these learnings and continual learning, and this is a little bit outside Claude Code per se, is to take what was learned and improve your system prompt, like I mentioned with the Open Lovable example, or for any agentic system generally. As you capture failures and successes, you could even set up a system that opens PRs against your system prompt, or against your skills if you keep them in Git. There are a ton of really cool things you can do with this.

All in all, skills aren't just instructions. They're persistent team memory that can compound with every session. I don't see a ton of people doing really interesting stuff with this quite yet, but hopefully this video inspires some ideas for how you can leverage skills and continual learning within Claude Code as well as other agentic systems. Otherwise, that's it for this video. If you found it useful, please like, comment, share, and subscribe. Otherwise, until the next one.
Unlocking Continual Learning in Claude Code with Skills

In this video, we delve into the concept of continual learning within Claude Code. The traditional approach to developing AI agents involves manually encoding insights and repeating a cycle of writing, testing, and refining system prompts. Learn about the solution provided by Claude's skills, which are efficient, composable, portable, and discoverable, and explore how Claude can read and write to these skills, improving them with every session. We cover the setup process for skills, the significance of skill descriptions, progressive disclosure, and creating a learning loop; how documenting failures can inform the process; and insights from Robert Nishihara, CEO of Anyscale, on the benefits of storing knowledge outside the model's weights. Get inspired to leverage skills for personal projects, team settings, and enhancing system prompts, turning them into persistent team memory that compounds with each session.

Anthropic Skills: https://github.com/anthropics/skills

00:00 Introduction to Continual Learning in Claude Code
00:03 Challenges in AI Agent Development
00:34 Introduction to Skills in Claude Code
01:19 Setting Up and Using Skills
02:39 Progressive Disclosure and Learning Loops
03:45 Documenting Failures and Successes
04:42 Insights from Industry Experts
06:27 Getting Started with Skills
06:48 Leveraging Skills for Personal and Project Use
07:59 Advanced Uses and Future Potential
08:30 Conclusion and Final Thoughts