- I think back to when I first started learning coding,
I was the kid that sat in the back
of math class in middle school,
and I had my little TI-83 Plus calculator.
And we just program it with BASIC,
'cause at some point I realized
that I can actually program the answers
for the math test into the calculator.
Hey, I'm Alex.
I lead Claude Relations here at Anthropic.
Today we're gonna be talking about Claude Code
and the future of software engineering.
And I'm joined by my colleague Boris.
- I'm Boris.
I'm a member of technical staff here at Anthropic
and creator of Claude Code.
- A lot has happened in the past 12 months,
and things are moving very, very fast,
especially in the coding domain.
For folks that, you know,
maybe aren't following the news every single day
or even staying on top of the latest,
and I have trouble myself sometimes,
can you kind of catch us up here on what's happened,
and where are we standing currently?
- Yeah, a year ago coding was totally different
than what it is today.
A year ago, if you want to write code,
you have an IDE,
you have some sort of autocomplete in the IDE,
and then there's some sort of chat app,
and you might like copy and paste code
back and forth a little bit.
And that was the state of the art, that was AI in coding.
And I think maybe sometime around a year ago
we started to see agents appear
as a thing that people earnestly use in coding.
It's like a part of the workflow.
It's not like a gimmick or a prototype.
It's actually part of the inner loop when you're doing dev.
And I think this is the thing
that's changed the most in the last year,
is now when you code, you use an agent,
you don't directly manipulate text in an IDE anymore.
It's not just about tab;
it's about the model writing code for you.
And I think what we've started to see
is the shift from directly manipulating text
to having the model do the text manipulation for you.
And I think projecting it out,
this is sort of the trajectory that we're on,
is this continuing into the future.
- I see, so we've gone from
it all being within a web app
where you're copy and pasting the code out
and you're making like very targeted edits, almost,
to just being a lot more hands-off
and telling an agent what you want it to do,
and then trusting it to go make tons of edits
and create whole apps sometimes even by itself.
- Yeah, exactly.
And this was something that
I think the reason we couldn't do it a year ago,
and, you know, like, people have tried to make AI do coding
for the longest time and to, you know,
just like automate more and more of coding in various ways.
And it hasn't really worked, I think,
probably for a couple of reasons:
one is the models weren't really good enough,
and the second one is that like the scaffolding,
the thing on top of the model, wasn't good enough.
And when we initially launched Claude Code,
the very, very first versions late last year,
I think it was still using Sonnet 3.5.
This wasn't even 3.6,
or whatever we call this thing,
the new Sonnet 3.5.
- Yeah, upgraded Sonnet.
- Yeah. It wasn't even this.
And it like sort of worked, you know?
Like, I used it for maybe 10% of my code
or something like that.
But even then, I remember when we launched it,
we gave it to the core team.
And it was just me
and like a few other people on the team at the time.
And I remember walking in one morning,
and kind of on the way to my desk,
there was a few engineers sitting there;
and one of them was Robert,
and there was a couple of other engineers.
And I just walked in,
and I saw Claude Code on their screen the first time.
And like I just gave this to them the day before
and they're already using it.
And it was just the craziest thing.
And the model wasn't very good.
The harness wasn't very good.
But even in this early version,
it was already a little bit useful.
And I think that over the last year
what's happened is the model has gotten way better
at agentic coding,
and that's happened with like 3.7 and now 4.0 and Opus 4.1.
And the harness has also gotten a lot better.
And, you know, obviously the harness is Claude Code,
because the way you interact with the model,
you can't just like directly use the model:
you have to use a harness.
It's sort of like, you know,
like if you're riding a horse, you need some sort of saddle.
And like that saddle makes a giant difference
when you're riding a horse.
I'm not a horse rider.
- I like that analogy, though.
I mean, it is kind of like Claude is the horse,
and as the engineer you're trying to
get it to go in a certain direction
and you're trying to guide it,
and like you need some sort of scaffolding around it
to be able to steer it correctly.
And the harness in this case,
just so we're on the same page,
is everything from like the tools we're giving it
to how we handle like the context and everything
for the model.
- Exactly, exactly.
It's like all of Claude Code.
Like, the model is the thing behind the API,
and then Claude Code, it's the system prompt,
it's context management,
it's tools, it's the ability for, you know,
to plug in MCP servers,
settings, permissions, all this kind of stuff.
All of this interfaces with the model.
And the model sees all the context,
all the output from this stuff,
and it makes a giant difference in the way that it performs.
And I think over the last year
we've learned how exactly we build for the model.
And the model has kind of coevolved
with not just Claude Code
but all these different products
that are using Anthropic models
to build agentic coding tools.
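
To make the harness idea a bit more concrete, here is a toy sketch in Python of the pieces Boris lists: a system prompt, tools, a permission check, and context that feeds tool output back to the model. Everything in it is an assumption made for illustration. The `Harness` class, the `stub_model` function, and the `read_file` tool are hypothetical names, the model is a stub rather than a real API call, and none of this is meant to reflect how Claude Code is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical toy harness. None of these names come from Claude Code itself.


@dataclass
class Harness:
    system_prompt: str
    tools: dict[str, Callable[[str], str]]
    allowed_tools: set[str]  # stand-in for a settings/permission system
    context: list[str] = field(default_factory=list)

    def run(
        self,
        model: Callable[[list[str]], tuple[str, str]],
        task: str,
        max_steps: int = 5,
    ) -> str:
        """Show the model the context, run the tool it asks for, feed the output back."""
        self.context = [self.system_prompt, f"user: {task}"]
        for _ in range(max_steps):
            action, arg = model(self.context)
            if action == "done":
                return arg
            if action not in self.tools:
                result = f"error: unknown tool {action}"
            elif action not in self.allowed_tools:
                result = f"error: permission denied for {action}"
            else:
                result = self.tools[action](arg)
            # The model sees every tool result on its next turn.
            self.context.append(f"{action}({arg}) -> {result}")
        return "stopped: step limit reached"


def stub_model(context: list[str]) -> tuple[str, str]:
    """Stand-in for a real model call: read one file, then finish."""
    if not any(line.startswith("read_file") for line in context):
        return ("read_file", "README.md")
    return ("done", "summarized README.md")


harness = Harness(
    system_prompt="You are a coding agent.",
    tools={"read_file": lambda path: f"<contents of {path}>"},
    allowed_tools={"read_file"},
)
answer = harness.run(stub_model, "Summarize the README")
```

The point of the sketch is only that the model never acts directly: every action passes through the harness, and what the harness writes into the context shapes the model's next step.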
- Maybe let's speak more on that.
When you say coevolve,
is that because it's like a deliberate thing
in which we're doing with the training, or
how is the model also getting better
at these sorts of things
as we make the product features themselves better?
- It's pretty organic, honestly.
Like, you know, at Anthropic, everyone uses Claude Code.
And that includes the researchers.
And so every day the people building the models
are using the model in order to do their job.
And I think as part of that
you kind of see these natural limits
that you hit with a model.
So, you know, as an example,
maybe the model's really bad
at doing certain kinds of edits.
And sometimes when you use Claude Code,
you see like, oh, failed to replace string,
failed to replace string.
Like, this is a model capability,
and we can improve this if we learn from it.
Or another example,
maybe something like higher level
is if you just let the model cook for like 30 minutes,
with 3.5, it could kind of do it for a little bit,
maybe for like a minute or something it would stay on track.
And then with newer models
it kind of gets longer and longer
this amount of time the model can operate autonomously.
And I think this is really based on experience,
because you use the model,
you kind of see where as a human
you have to course correct and steer it.
And then we learn from that,
and we can kind of incorporate that into the model
and teach it better to do this itself.
- When you're evaluating a new model,
do you kind of have a vibe check set of tests that you run?
Or if it's like a new feature that we're rolling out
to make something better in the harness,
how do you personally evaluate
if the performance is getting better?
- I just do my work that day.
- Interesting. - Yeah.
Like, my perfect day is I'm just coding all day.
And, you know, whatever the model is,
whatever is the new thing we're testing,
I'll just code using that and see what the vibe is.
There isn't like a specific thing I do.
- Right, you just see how does it actually work for me
in my day to day?
- Yeah. Exactly, exactly.
And, you know, like in day-to-day work
you do all sorts of stuff.
Like, you're writing new code,
you're maybe like fixing bugs,
you're maybe reading Slack messages
or GitHub issues to respond to feedback.
And I think more and more
the model is able to do more and more of this.
So actually, in a way, if you had maybe one thing
that you always use the model for,
you would miss out on some of these newer capabilities,
like pulling in context through MCP,
like reading your Slack messages.
Or, you know, automatically debugging stuff,
'cause you can pull in Sentry logs automatically.
- Yeah, so the best eval in some sense
is the one that most looks like real life.
And in that case, just using it
gives you the best result.
- We tried really hard, when building Claude Code,
to build product evals.
- Yeah.
- You know, just like to have some sort of benchmark;
like, when we change a system prompt or whatever,
is the model getting better?
And we have a little bit of this,
but honestly it's just like so hard to build evals.
And by far the biggest signal is just the vibes.
Like, does it feel smarter?
'Cause there's such a broad range of tasks they use it for.
- Yeah, that's actually a question
I hear from developers all the time,
is they would appreciate more guidance on
how we go about prompt testing and iterating.
I know for different products
we have like various sorts of evals
that we've tried to create,
but for Claude Code it really is
just kind of this tight feedback loop
that almost gives us like more immediate signal
than any hardcoded set of evals.
- I wonder if people kind of want to hear
a better answer from an AI.
But yeah, man, it's all vibes.
I think at this point we're, you know,
the models are doing so good on evals, like SWE-bench.
You know, we're just trying to find these harder evals.
And now there's like T-bench,
which is like a little bit less kind of saturated.
But I think it's just really hard to find synthetic evals
that capture all the complexity in software engineering.
- Right, right.
Do you think there's something we did
uniquely to set up that feedback loop internally?
'Cause I feel like Claude Code has like the best
dogfooding cycle I've seen of like any type of product.
- Initially, I built it the way that I do any other product,
which is just listen to users
and make it as easy as possible to listen to users.
And I think one part of it
is when we built Claude Code,
there was just like a single feedback channel in Slack.
And anytime anyone had feedback,
I would just direct them to that,
just be like, "Yeah, post there."
And I feel like people hesitated sometimes a little bit.
'Cause sometimes when you give feedback,
you expect that no one listens
and it kind of goes into this black hole, like into a void.
And I think one of the things that we did really right was,
from the beginning, whenever someone gave feedback,
I would try to fix it as fast as I can.
And sometimes I would kind of go into the office
and then just spend like three hours
or two hours or whatever,
just go through as many bugs as I can
and fix them as fast as I can,
and then every time comment back and tell people it's fixed.
And this kind of encourages them to keep giving feedback.
And to this day the Claude Code feedback channel internally
is just this fire hose, just nonstop.
- Oh, totally.
I remember, on those early days, and still do,
dropping in there, posting something,
and immediately you're emoji reacting.
Or you're asking for more clarification and more questions,
and you do feel like, oh, okay, my feedback's being heard.
And then you're able to like actually be,
you know, incentivized
to go post more feedback in the future.
- Yeah, 'cause, you know,
honestly, like, I don't know what I'm doing.
Like, no one really knows what they're doing with AI.
Like, we're kind of discovering this thing as we build it.
And the best indicator is what the users want.
So you gotta listen.
- Right, switching gears slightly,
what is like the current state of Claude Code as a product?
What are the latest features? What are you excited about?
Some things that you're seeing folks do with it right now?
- Claude Code, from the start,
was built to be the simplest thing it can
and to be as hackable as possible.
And I think the hackability is something
that we've been developing a lot,
and that's something I'm really excited about.
So originally, the way to hack Claude Code
is adding to its CLAUDE.md.
That was the original extension point.
And CLAUDE.md, as you know, is like this file.
You can put it in the root directory,
you can put it in child directories.
There's kind of different places you can put it.
And it's just additional context to give Claude Code,
and it kind of goes with your repo.
You often check it into your code base.
So it's kind of, you know,
a little bit more information about the code.
But over time we've added a lot more extension points.
So now there's a very sophisticated setting system
and permission system.
There's hooks now which Dixon built.
Dixon's an engineer on our team,
and he just kind of saw all these different user asks
coming in for: "I want to extend it this way.
I want to hook into this, hook into this."
And so he built a super extensive hook system.
MCP, obviously, this is a really great extension point.
And now there's slash commands and subagents.
And user-defined slash commands
is something we've invested in a lot.
And the idea is it's just a workflow:
it's like a markdown file.
You put it in your code,
and it's something that you can reuse a lot.
So for example,
I have a slash command for making commits.
And I have some instructions in there:
here's how you write a good git commit.
I pre-allow the git commit Bash command
so I don't have to accept it every time,
and the model can just do it.
So I think slash commands are really interesting,
and agents are kind of a different view of slash commands.
Like, it's like a slash command,
but it has a forked context window.
And so you can kind of think of agents and slash commands
as two sides of the same thing.
And this is also very exciting.
It's just another way to extend Claude Code.
And so when I look at the future,
I think a lot of it is just about
like how do we extend Claude Code more?
How do we make it easier for other people to build on top?
How do we make the SDK more useful for people?
So it's useful for code if you want to build a coding agent,
but also you can use it for other stuff.
Like, anything that you need an agent for,
you can just use the SDK for.
And I think these are the things
that I'm the most excited about.
And obviously all of this benefits
from all the other work we're doing
to make the model more autonomous,
to make it work for longer periods of time,
to make it better adhere to instructions,
to make it remember things better.
And so everything along the way it benefits.
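
One way to picture the "two sides of the same thing" remark is the rough Python sketch below. It treats a workflow as reusable instructions plus a list of commands approved ahead of time, and contrasts running it inline with running it in a forked context. All names here (`Workflow`, `run_as_slash_command`, `run_as_agent`) are hypothetical, and modeling "forked" as a copy of the parent context is an assumption, since the conversation does not say what a real agent starts from. It is not Claude Code's file format or API.

```python
from dataclasses import dataclass

# Hypothetical model of the idea, not Claude Code's actual data structures.


@dataclass(frozen=True)
class Workflow:
    name: str
    instructions: str  # what would live in the markdown file
    allowed_commands: tuple[str, ...] = ()  # command prefixes approved ahead of time

    def is_preapproved(self, shell_command: str) -> bool:
        """True if the command matches a pre-allowed prefix exactly or followed by arguments."""
        return any(
            shell_command == prefix or shell_command.startswith(prefix + " ")
            for prefix in self.allowed_commands
        )


def run_as_slash_command(workflow: Workflow, context: list[str]) -> list[str]:
    """Inline: the instructions land in the main conversation's context."""
    context.append(workflow.instructions)
    return context


def run_as_agent(workflow: Workflow, context: list[str]) -> str:
    """Forked (assumed here to mean a copy): only a summary returns to the caller."""
    forked = list(context)
    forked.append(workflow.instructions)
    forked.append("...intermediate tool output...")
    return f"{workflow.name}: finished after {len(forked)} context entries"


commit = Workflow(
    name="commit",
    instructions="Write a concise commit message that explains why the change was made.",
    allowed_commands=("git commit",),
)

main_context = ["user: fix the login bug"]
summary = run_as_agent(commit, main_context)  # main_context is left untouched
run_as_slash_command(commit, main_context)    # main_context grows by one entry
```

In this toy version the same `Workflow` object serves both paths, and the only difference is whose context window absorbs the intermediate work.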
- So I'm using Claude Code,
or whatever form of it, in six to 12 months;
what does my work actually look like?
Am I reviewing PRs all day,
or what does it day to day break down to?
- Yeah, I think there's gonna be a mix
of more hands-on coding.
I don't think that's going away.
And maybe it'll look different, though.
So maybe hands-on coding today
is directly manipulating text,
but in the future it might be using Claude
to manipulate the text for you.
And then I think there's gonna be this other bucket
of maybe less direct coding
where Claude proactively does something,
and maybe Claude even reviewed it.
And it's your job to decide if this is a change
that you want or not.
And I think maybe 12 or 24 months from now
we're gonna start seeing Claude that's more about goals
and more about these higher level things that it needs to do
and less about the specific tasks that go into it.
The same way that, as an engineer,
I think about what is it that I want to do
over the next month.
And I kind of make small changes to work towards that.
Maybe Claude will go through the same thing.
- Right, sort of moving up and up the stack, to some degree,
of these like abstraction levels of getting Claude
to make individual changes to files,
to getting Claude to make changes to a whole PR,
to getting Claude to think about a goal of building an app
or whatever else it is.
- Yeah. - Okay.
That's interesting.
If I'm an engineer and I'm hearing that,
it seems like there's gonna be a lot changing
in a very short amount of time,
especially with my role and what I should be doing.
What's your advice for folks out there
that are looking to prepare themselves
and adapt to this world,
about what they should be learning
or what skills they should be developing?
- I think back to when I first started learning coding;
I was the kid that sat in the back
of math class in middle school,
and I had my little TI-83 Plus calculator.
It was like a transparent gray one;
you can kind of see the circuit.
And we just program it with BASIC,
because at some point I realized that
I can actually program the answers
for the math test into the calculator.
And you can get better grades that way.
And there's just something about kind of this visceral feeling
of being able to hack, and having this idea
of maybe there's this one program I can make;
and just I go into my calculator and I code it,
and then I can just restart and use it really quick:
this kind of feedback cycle that was really amazing.
And it made it possible for me to build stuff
that I never could have before.
And it was just so easy to get started.
And I think about the difference
between that world and the world before agentic coding,
where stacks just got way, way too complicated.
You know, if I wanted to make a JavaScript,
you know, like website,
I had to learn about React and maybe Next.js,
and then three different build systems and a deploy system.
And it was just so complicated.
And I think one really cool thing about agents
is that they're changing this.
So with coding agents
it makes it really easy to get started.
And if you have an idea you can just build it.
And it's a lot more about the idea now
than it is about the details,
because just like Claude Code,
you can rewrite the code over and over.
And, you know, Claude Code itself, we rewrite all the time.
And I think this is just something
that coding agents enable.
The code itself is no longer precious.
And there's still an art to writing it,
and, you know, I'll still code by hand sometimes.
And one of the engineers on the team, Lena,
she was talking about how on the weekends
she still sometimes writes C++ by hand,
just 'cause it's fun.
And, you know, as a coder,
it can be a really joyous thing to do this.
But I think more and more
it's gonna be about the thing you make
and not about the process of making it as much.
And I think my advice for people learning to code today
is you still have to learn the craft.
So you still have to learn to code, learn languages,
learn compilers, runtimes, how to build web apps,
how to build programs,
system design.
You still have to know all the stuff,
but also just start to get more creative.
And, you know, if you have an idea for a startup
or an idea for a product, you can just build it now
in a way that you just couldn't before.
And we don't really understand what this means,
but there's just so much potential
that's about to be unlocked because of it.
- Yeah, I love that.
I think that's great advice too.
Ideas suddenly become something you can action on in,
you know, a span of a few minutes almost;
whereas before it could be just in your backlog forever.
Before we wrap, I want to ask you,
as the creator of Claude Code,
what are your best practices for using Claude Code?
And any tips or tricks.
- Yeah, I think the biggest thing that I recommend,
okay, maybe two tricks.
So one thing I recommend
is that if you're brand new to Claude Code
and you haven't used it before,
don't use it to write code.
I know it sounds crazy. - Yeah, explain, explain.
- But you gotta stop yourself.
Like, don't use it to write code yet.
The thing to start with is use it to ask questions
about the code base.
So you can ask, you know, if I want to make,
if I want to add a new logger, how do I do that?
And then ask Claude Code to explore the code base
and figure it out for you.
Or why is this function designed the way that it is?
Claude Code can go in and it can look through Git history
and it can answer this stuff for you.
So I think ask Claude Code questions about the code base
and just don't code yet.
And then once you feel comfortable
with using Claude Code this way
and you get comfortable with this idea of an agent
that's doing this research for you,
then start to use it to code.
I think the second thing is
when you are using Claude Code to write code,
think about what kind of work do you want to do
and like how big is the task?
So for something that's really easy,
I kind of, in my mind,
I have these three categories:
easy, medium, and hard, very roughly.
And so easy tasks are something
that Claude can write in one shot;
like one prompt, it'll get it pretty much right.
And nowadays I'll just go to GitHub
and I'll tag @Claude on an issue
and just have Claude write the PR for me.
And this is how I do easy tasks,
'cause that frees up my terminal.
I don't have to kind of spend it on this.
Medium tasks, I'll start it in the terminal,
and I'll start in plan mode.
So just Shift + Tab into plan,
and I'll align on a plan with Claude first.
And then once I feel good about the plan,
I'll go into auto-accept and I'll have it implemented.
And then for really hard tasks,
I'm still the one driving,
and Claude is more of a tool.
And I'm kind of pairing with it.
But really I'm the one in the driver's seat,
not Claude for this.
And so I'll use Claude maybe to do some code base research,
maybe prototype a few ideas,
maybe I'll just like vibe code a few options
to understand the boundaries of the system
and what works well.
But I'll still mostly implement it myself.
And maybe Claude will write the unit tests,
but it's still mostly me doing the coding.
So I think that'll be the second advice,
is just think about what's the task that you're doing
and what's the right way to use Claude Code to do it.
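
The easy, medium, and hard split described above can be written down as a small lookup table. This Python sketch only restates the steps from the conversation. The `Difficulty` enum and `suggest_workflow` function are hypothetical names for illustration, and deciding which bucket a task belongs to is still left to the engineer.

```python
from enum import Enum


class Difficulty(Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"


# Steps paraphrased from the conversation; the structure itself is illustrative.
WORKFLOWS: dict[Difficulty, list[str]] = {
    Difficulty.EASY: [
        "Tag @Claude on the GitHub issue",
        "Let Claude write the PR",
    ],
    Difficulty.MEDIUM: [
        "Start in the terminal in plan mode (Shift + Tab)",
        "Align on a plan with Claude",
        "Switch to auto-accept and let Claude implement",
    ],
    Difficulty.HARD: [
        "Stay in the driver's seat",
        "Use Claude for code base research and quick prototypes",
        "Implement most of it yourself",
        "Optionally have Claude write the unit tests",
    ],
}


def suggest_workflow(difficulty: Difficulty) -> list[str]:
    """Return the suggested steps for a task of the given difficulty."""
    return WORKFLOWS[difficulty]


medium_steps = suggest_workflow(Difficulty.MEDIUM)
```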
- Those are great tips.
Really, really appreciate the time, Boris.
This has been awesome. Thank you.
- Yeah, thanks, Alex.
Anthropic's Boris Cherny (Claude Code) and Alex Albert (Claude Relations) discuss the current and future state of agentic coding, the evolution of coding models, and designing Claude Code's "hackability." Boris also shares some of his favorite tips for using Claude Code.

0:00 - Introductions
0:39 - The current state of agentic coding
5:20 - The evolution of coding models
7:39 - Coding model evaluation
8:56 - Claude Code user feedback loops
10:34 - The “hackability” of Claude Code (CLAUDE.md, MCP, slash commands)
13:11 - The future of agentic coding
14:49 - How to upskill for agentic coding
17:49 - Claude Code tips and tricks

Learn more about Claude Code: http://clau.de/future-of-agentic-coding
Check out the Claude Code docs: https://clau.de/claude-code-docs