Hey, this is Lance from LangChain. I want to talk through a few general context engineering principles and how they show up in popular agents like Manus and Claude Code, as well as in our recently released DeepAgents package and CLI.
First, an agent can be thought of simply as an LLM calling tools in a loop: the LLM makes a tool call, the tool is executed, the observation from the tool goes back to the LLM, and this continues until some termination condition is met.
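To make that loop concrete, here's a minimal sketch in Python. It assumes an OpenAI-style chat client and a hypothetical `tool_registry` mapping tool names to functions; none of these names come from a specific framework.

```python
import json

# Minimal agent loop: the LLM requests tools until it stops asking for them.
def run_agent(client, model, messages, tools, tool_registry, max_turns=50):
    for _ in range(max_turns):
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        msg = response.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:           # termination: no more tool calls
            return msg.content
        for call in msg.tool_calls:      # execute each requested tool
            result = tool_registry[call.function.name](
                **json.loads(call.function.arguments)
            )
            messages.append({            # observation goes back to the LLM
                "role": "tool",
                "tool_call_id": call.id,
                "content": str(result),
            })
    return None                          # hit the turn limit
```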
Now, the length of tasks that AI agents can perform is getting longer. A nice result from METR shows that task length is doubling around every seven months. The challenge is that as agents take on longer tasks, you accumulate more tool results. For example, Manus has mentioned that the average Manus task involves over 50 tool calls. Likewise, Anthropic has mentioned that production agents can often run for hundreds of turns. The problem is that as you populate the context window with results from all these tool calls, you're passing all those prior tool results back through the model at every turn, so the cost and latency associated with running your agent can really blow up. And not only that, performance can degrade: Chroma has a nice report on context rot that discusses how performance degrades with respect to context length.
And so, what we've seen is that agents
are increasingly being designed with a
few different principles to help address
this.
Of course, agents have a few common primitives: a model, prompting, tools, and often hooks.
Take Claude Code as an example, using Claude-series models. The system prompt is actually available; you can look at it at this link here, and I'll make sure that document is in the video description. It has around a dozen native tools, and it allows for hooks, which are basically scripts that can be programmatically run at different points in the agent lifecycle, for example before or after each tool call. Now, our DeepAgents package and DeepAgents CLI are similarly set up with these primitives. The package allows for any model provider; the CLI currently uses Anthropic models. You can see the prompts; it's all open source. It uses eight native tools in the package and 11 native tools in the CLI, which I'll show later in detail. And we also allow for hooks at various points in the agent lifecycle.
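As a rough illustration of the hook idea, here's a hedged sketch of running user-supplied callbacks around tool execution. The `pre_tool_hooks` / `post_tool_hooks` names are illustrative, not any specific framework's API.

```python
# Hooks: callbacks that run at fixed points in the agent lifecycle.
def execute_tool_with_hooks(tool, tool_name, args,
                            pre_tool_hooks=(), post_tool_hooks=()):
    for hook in pre_tool_hooks:
        hook(tool_name, args)                # e.g. block or log a risky call
    result = tool(**args)
    for hook in post_tool_hooks:
        result = hook(tool_name, result)     # e.g. redact or truncate output
    return result
```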
Now, with these primitives that make up what we call an agent harness in mind, what are the common techniques we see across different agents for managing the problem of context rot and of accumulating tokens across many turns of tool calls? Context engineering is the broad term that captures many of these principles. Karpathy outlines it very nicely here: it's the delicate art and science of filling the context with just the right information for the next step, which is very applicable to agents. You're trying to steer the agent to make the right next tool call along a trajectory of actions.
The three common principles I like to distill are offload, reduce, and isolate. Offloading is moving context from the LLM context window to something external, like a file system, where it can be selectively retrieved later as needed. Reducing is simply reducing the size of the context passed at each turn, and there are a bunch of different techniques to do that. And finally, isolating context: using separate context windows, or separate subagents, for individual tasks. I share some references here: I've talked about this on the Latent Space podcast, and I had a webinar with Manus where we talked through these principles and how Manus uses them. I'm going to review them here and also talk about how the DeepAgents package and CLI employ these ideas.
So first, offloading context. A trend that we've seen repeatedly is that giving agents access to a file system is very useful: it lets agents save and recall information during long-running tasks.
And this is pretty intuitive. I share a link here from Anthropic's multi-agent researcher, where they basically have the researcher write a plan to a file, go do a bunch of work, and then retrieve that plan after a bunch of subagents have done work, to make sure everything's been addressed. So you can just write to a file and read it back into context when you need to reinforce the plan that was laid out. This is very useful to ensure that you don't forget specific steps in the plan. By externalizing it to a file and reading it back into context, you ensure that it's persisted and that the agent can be more easily steered, since you're selectively pulling it back into the context window as needed to help keep the agent on track.
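A minimal sketch of that pattern, assuming two generic file tools exposed to the agent (the tool names and the trajectory below are illustrative):

```python
from pathlib import Path

# Two generic file tools the agent can call.
def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"Wrote {len(content)} characters to {path}"

def read_file(path: str) -> str:
    return Path(path).read_text()

# Trajectory sketch: write the plan early, recall it late to re-steer.
write_file("plan.md", "1. Collect sources\n2. Draft report\n3. Verify claims")
# ... many intermediate tool calls later ...
plan = read_file("plan.md")   # pulled back into context only when needed
```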
Now, another interesting thing about the file system is that it's often persistent across different agent invocations. For example, if you're running your agent locally on your laptop with Claude Code, Claude Code can always reference a CLAUDE.md file, which can live at various levels: at the project level, and there's also a global CLAUDE.md. This CLAUDE.md can store information that you want to persist across all your different interactions with Claude Code. Manus uses these same ideas. Of course, Manus runs remotely, so it uses a sandbox environment that contains a file system and gives the agent access to a computer, and it supports user memory. Now, the DeepAgents package allows for different backends: you can use the LangGraph state object, which is just in-memory, or you can use a file system backend, for example your local machine. And the DeepAgents CLI is a lot like Claude Code running on your laptop, in that it will just use your local file system as a backend. The DeepAgents CLI also supports memory, using a memories directory as well as an AGENTS.md file.
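As a rough sketch of how cross-session memory like this can work, here's one way to fold persistent files into the system prompt at startup. The file names mirror the conventions just mentioned, but the exact behavior of each harness varies.

```python
from pathlib import Path

# Read persistent memory files into the system prompt at session start.
def build_system_prompt(base_prompt: str, project_dir: str = ".") -> str:
    parts = [base_prompt]
    for name in ("CLAUDE.md", "AGENTS.md"):
        f = Path(project_dir) / name
        if f.exists():
            parts.append(f.read_text())      # persists across invocations
    memories = Path(project_dir) / "memories"
    if memories.is_dir():
        for m in sorted(memories.glob("*.md")):
            parts.append(m.read_text())
    return "\n\n".join(parts)
```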
The principle we've seen repeatedly is that giving agents the ability to offload context to a file system has a lot of benefits: you can persist information during long-running trajectories, and you can persist context across different invocations of the agent in things like a CLAUDE.md file, an AGENTS.md file, or, in the case of the DeepAgents CLI, a memories directory. Now, another benefit of the file system is that you can actually offload actions from tools to scripts. What do I mean by this? We want agents to perform actions. Let's say we want to give an agent 10 different actions. Often you think about that as: for every action, I define a unique tool and bind all those tools to the agent, so I have an agent with 10 different tools. Now the LLM in that agent has to determine when to use each of those 10 tools, and you also have to load all those tool instructions into the system prompt. So there are two problems: one is confusion over which tool to use, and two, you're bloating your instructions with a bunch of tool descriptions. Look, with three or four or even 10 tools, it's not a big issue. But if we're talking about hundreds of tools, this can be significant tokens spent just on tool descriptions.
So one principle, and in the webinar with Manus we cover this in depth, is keeping the function-calling layer very lightweight. Give the agent only a few functions to call, but make sure they are very general, atomic functions that can do lots of things, and push a lot of the actions out to something like scripts in a file system. For example, Manus gives the agent a bash tool and file system manipulation tools. With those two things, it can search a directory of scripts using various tools to navigate the file system, and execute any one of them using the bash tool. So with three or four simple tools for file manipulation and code execution, it can perform a very large number of actions, as specified by the scripts that you give it. That's a way to expand the action space of the agent significantly while only giving it access to a small number of tools. We see this principle repeatedly. If you look at Claude Code: Boris Cherny and Cat Wu, the engineering and product leads of Claude Code, were recently on a great podcast (I have the link here) where they mention that Claude Code only uses around a dozen tools. And when you're using it, you can see that it uses glob and grep, it uses bash, it uses fetch to grab URLs, but it's not using that many tools; it's only about a dozen. Manus uses fewer than 20 tools. With DeepAgents, we actually have only eight native tools, and with the DeepAgents CLI, we have 11 native tools. I'll show those below.
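Here's a sketch of that "small tools, big action space" idea: with just a listing tool and a bash tool, the agent can discover and run any script in a directory, so adding a script adds an action without adding a tool. The directory layout and script names are illustrative, and the sketch assumes a Unix-like environment.

```python
import subprocess

# Tool 1: let the agent see which scripted actions are available.
def list_scripts(directory: str = "scripts") -> str:
    out = subprocess.run(["ls", directory], capture_output=True, text=True)
    return out.stdout

# Tool 2: let the agent execute any one of them.
def bash(command: str, timeout: int = 60) -> str:
    out = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return out.stdout + out.stderr

# The agent might call, for example:
#   list_scripts()  ->  "fetch_data.py\nrender_report.py\n..."
#   bash("python scripts/fetch_data.py --since 2024")
```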
Now, a related idea is progressive disclosure of actions. Anthropic talks about this specifically in its recent release of skills, and there's an interesting quote from a nice blog post that I link here. Claude skills are, very simply, a skills folder with a bunch of subfolders, each of which is a specific skill. Each subfolder has a SKILL.md file, a markdown file with a header, and the header explains in very brief language what that skill does. The header is the only thing that's loaded into Claude Code initially; you can see in this diagram that that's exactly what they show here. So there's a brief snippet about each skill available.
Now, in the case of Claude skills, if Claude wants to use any given skill, it can then selectively read the full SKILL.md file. So again, just the header is read into the system prompt by default; if Claude wants to actually execute a skill, it reads the full SKILL.md file. That SKILL.md file can reference any other files in the same skill directory, so it could contain scripts or other files with more context. What's really nice is that Claude, with only its bash tool as an example, can go ahead and read the full SKILL.md file, and then if needed execute any scripts in that same skill directory or read any other files in. It's just a nice way of progressively disclosing actions to Claude without loading everything into the system prompt ahead of time, and, importantly, without binding all those different capabilities or skills as tools. Remember, in the simplest case you're only using the bash tool to read the SKILL.md file, execute any scripts in that skill folder, or read any other files in that folder. So I think about this as a very simple way to give agents access to different actions in a way that saves tokens, because they're progressively disclosed only if, in this case, Claude needs the skill.
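A minimal sketch of progressive disclosure, assuming each skill folder holds a SKILL.md whose front matter carries the brief description (the layout follows the convention described above; the front-matter handling here is simplified for illustration):

```python
from pathlib import Path

# Load only the brief header of each skill into the system prompt.
def load_skill_headers(skills_dir: str = "skills") -> str:
    snippets = []
    for skill_md in sorted(Path(skills_dir).glob("*/SKILL.md")):
        text = skill_md.read_text()
        # Take the front matter between the first two '---' markers.
        if text.startswith("---"):
            header = text.split("---")[1].strip()
            snippets.append(f"{skill_md.parent.name}:\n{header}")
    return "\n\n".join(snippets)

# Read the full skill file only when the agent decides to use it.
def load_full_skill(name: str, skills_dir: str = "skills") -> str:
    return (Path(skills_dir) / name / "SKILL.md").read_text()
```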
And it's only using simple built-in tools like the bash tool and maybe some file manipulation tools. Manus uses a very similar principle: the Manus agent has access to a large number of different scripts, and it can discover those scripts using its native file search and bash tools. Now, we don't yet have this notion of skills in the DeepAgents CLI, but I'm actually working on adding that right now, because I think it's a very nice way to give an agent access to lots of actions without bloating its context window with instructions and without having to bind additional tools. Now, I do just want to briefly make it even more crisp what specific tools are in the DeepAgents package, to highlight the point that we often see agents ship with small numbers of general, atomic tools. The DeepAgents package has only basic tools for file manipulation, a task tool for creating subtasks with subagents, and a to-dos tool to generate to-dos. The CLI extends this slightly with some search tools and a bash tool.
Now, let's talk about reducing context. There are three interesting ideas here: compaction, summarization, and filtering. First, I'll talk a little about what Manus does. Manus uses this idea of compaction. On the left, this is showing a trajectory of tool calls and tool results, and of course tool results can be quite token-heavy. What they do is compact old tool results by saving the full result to a file and just referencing that file in the message history. They only do this with what you might call stale tool results that have already been acted on, but it's a very nice way to reduce tokens in the message history.
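Here's a hedged sketch of that compaction step; the message shape and the "keep the last few results verbatim" policy are simplifying assumptions, not Manus's actual format.

```python
from pathlib import Path

# Offload stale tool results to files; leave a short pointer in history.
def compact_tool_results(messages, offload_dir="tool_results", keep_last=5):
    Path(offload_dir).mkdir(exist_ok=True)
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_indices[:-keep_last])   # keep recent results verbatim
    compacted = []
    for i, msg in enumerate(messages):
        if i in stale:
            path = Path(offload_dir) / f"result_{i}.txt"
            path.write_text(msg["content"])  # full result stays on disk
            msg = {**msg, "content":
                   f"[Tool result offloaded to {path}; read it to recover.]"}
        compacted.append(msg)
    return compacted
```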
And so this is kind of a neat diagram that they showed. Imagine your agent is running and performing many turns; after some number of turns, you get very close to the context window limit of the LLM, and that's when they apply compaction. They take all the historical tool results that are bloating the message history, compact them down, and offload them to the file system, which brings down the overall context utilization significantly. The agent keeps running, context progressively starts to saturate again, and then they apply summarization. Summarization looks at the entire message history, including the full tool result messages, and distills it all down to a much more compact summary, which the agent then carries forward. One interesting point is that the compaction step is actually reversible, because you can always go back and look at the raw tool results saved to those files; that's another benefit of using the file system. Summarization, though, is not reversible, so it's a step that needs to be carefully thought through: when you summarize, you necessarily lose information. Now, you see these ideas employed by Anthropic as well. Anthropic recently shipped context editing, which prunes the message history of old tool results in a configurable manner, and Claude Code applies summarization when you hit around 95% of the context window.
Now, the DeepAgents package applies summarization with summarization middleware, which automatically kicks in after a threshold (170,000 tokens) and preserves some number of recent messages. Of course, it's all open source and configurable.
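As a rough sketch of threshold-triggered summarization (not the actual middleware; `count_tokens` and `llm_summarize` are placeholder helpers):

```python
# When history exceeds a token budget, distill older messages into one
# summary message and keep the most recent ones verbatim.
def maybe_summarize(messages, count_tokens, llm_summarize,
                    max_tokens=170_000, keep_last=20):
    total = sum(count_tokens(m["content"]) for m in messages)
    if total < max_tokens:
        return messages                       # under budget: no-op
    old, recent = messages[:-keep_last], messages[-keep_last:]
    summary = llm_summarize(old)              # lossy: information is dropped
    return [{"role": "system",
             "content": f"Summary of earlier conversation:\n{summary}"}
            ] + list(recent)
```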
Now, one of the other things employed in the DeepAgents package and CLI is that the file system middleware will actually filter large tool results, which is a nice way to prevent excessively large tool results from being passed directly to the LLM.
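A minimal sketch of that filtering idea; the size limit and file naming are assumptions for illustration, not the middleware's actual behavior:

```python
from pathlib import Path

# If a tool result is too large, offload it and return a truncated preview.
def filter_large_result(result: str, name: str, limit: int = 20_000) -> str:
    if len(result) <= limit:
        return result
    path = Path(f"{name}_full_output.txt")
    path.write_text(result)                   # full result stays retrievable
    return result[:limit] + f"\n...[truncated; full output in {path}]"
```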
Now, finally, let's talk about context isolation. This is a technique we've seen employed repeatedly, and it's a pretty simple idea: many tasks performed by an agent can be assigned to a subagent. That subagent has its own context window, so it can start fresh on a particular task, particularly if that task is nicely self-contained, execute it, and just return the output back to the parent agent. That's the first pattern shown here, and this communication pattern was discussed by Manus as well: you have a parent or main agent that wants to spawn a subagent to do some task; it passes some instructions to that subagent; the subagent churns along and passes the result back to the main agent. That's a very common pattern. Now, there is some nuance here. Sometimes you want to share more context with the subagent, and Manus actually allows for sharing the full message history that the parent has with the subagent. Similarly, with DeepAgents and with the DeepAgents CLI, the subagent has access to the same file system as the parent, so there is some shared context between them.
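A minimal sketch of the spawn-a-subagent pattern, reusing the `run_agent` loop sketched earlier; the prompt text and function names are illustrative:

```python
# The parent passes only the task instructions into a fresh context
# window; only the final result flows back to the parent's history.
def spawn_subagent(run_agent, client, model, tools, tool_registry, task):
    sub_messages = [
        {"role": "system", "content": "You are a focused sub-agent."},
        {"role": "user", "content": task},   # instructions only, no history
    ]
    return run_agent(client, model, sub_messages, tools, tool_registry)
```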
So, just to summarize: agent harnesses typically employ at least three principles for managing context: offloading, reducing, and isolating. Some of the most common ideas in context offloading include using the file system; we see that across the board, as Claude Code, Manus, and the DeepAgents CLI all support use of the file system. Enabling user memories: this is, intuitively, the ability to remember information across agent invocations. Claude Code enables it with CLAUDE.md; the DeepAgents CLI has a memories directory as well as AGENTS.md; Manus also supports cross-session memory. Using minimal tools: this can significantly save tokens on tool descriptions and minimize the number of decisions the agent has to make across different tools. Claude Code uses only around a dozen tools, Manus fewer than 20, and the DeepAgents CLI 11. Giving the agent a computer, i.e., a bash tool: all these agent harnesses do that. Progressive disclosure of actions: Claude Code does this with skills; Manus does this by basically giving the agent access to a directory with a whole bunch of different scripts and letting it peruse that directory on an as-needed basis using its existing file system and bash tools; skills for the DeepAgents CLI are a work in progress. Now, this idea of compaction, basically pruning old tool messages: Manus certainly does it; the Claude SDK supports it through what they call context editing; and I assume it's being done in Claude Code, but I'm not positive, so I should probably flag that one as yellow, because I'm not entirely sure, though I imagine it is being done. We know for sure that Claude Code does summarization once you hit around 95% of the context window; Manus does this, and the DeepAgents CLI does this too. And all three support subagents for isolating different tasks to unique context windows.
Now, the DeepAgents CLI is open source, contributions are welcome, and it's fun to try to employ these ideas in an open-source harness that can be used with many different models. So hopefully this was a useful overview of how these principles operate across different popular agent harnesses and how they're being used in the DeepAgents CLI. Any questions or contributions are very welcome. Thanks.
This video covers the core principles of context engineering for AI agents and how they're implemented across popular frameworks like Claude Code, Manus, and LangChain's DeepAgents. As AI agents tackle increasingly complex tasks, managing context windows becomes critical. This video breaks down three key principles (offload, reduce, and isolate) and shows how leading agent frameworks implement them to handle longer tasks efficiently.

0:00 Introduction to Context Engineering
1:00 Agent Primitives & Harnesses
3:00 Context Engineering Principles
4:00 Offloading Context: File Systems
6:00 Offloading Actions to Scripts
8:00 Progressive Disclosure of Actions
11:00 DeepAgents Tool Overview
12:00 Reducing Context: Compaction
13:00 Reducing Context: Summarization
14:00 Reducing Context: Filtering
15:00 Isolating Context: Subagents
16:00 Summary & Comparison
17:00 Conclusion

Video notes: https://www.notion.so/Context-Engineering-for-Agents-2a1808527b17803ba221c2ced7eef508?source=copy_link