In this video, I'm going to show you how I use LangSmith's new agentic features to help me build agents. We're going to use LangSmith to build a simple assistant that can triage and respond to incoming emails. Every time I get a new email in my inbox, my agent is going to read it and figure out what to do with it, and we'll use LangSmith to make sure the agent is doing the right things. Let's start by looking at this agent.py file. In this single file, we're going to define a simple deep agent.
First, and also most importantly, we
have our system prompt. Most deep agents
have pretty extensive system prompts.
You move a lot of the complexity out of
the agent architecture and into the
actual prompt itself. This prompt has
some background information about who I
am, and it also has some criteria and
rules for how to handle incoming emails.
When an email comes in, the assistant
can take a bunch of different actions
with its tools. It can write an email
response. It can kick off new email
threads with new people, or it can call
a sub agent, which we'll look at more in
a second. It can also just mark the
email as read if it doesn't think I
actually need to look at it. Our sub
agent here is specifically focused on
interacting with my calendar. It has
specific prompting around how to find
meeting times where I'm available and
also book the meetings for me. And so
with these pieces, we can assemble our deep agent. We have that overall system prompt with some background about myself and directions for the different things I want the agent to do. We give the main agent some general tools to write emails, kick off new threads, or mark emails as read. These tools are all defined in a separate file, but they basically just connect to the Gmail API and let our agent actually take actions. Then we have that sub agent, the meeting scheduler, which has access to a few specific tools for calendar interactions: finding available times on my calendar and then scheduling the meetings. And so with just these few lines of code, we've built a simple personal assistant.
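For reference, here is a rough sketch of what that agent.py assembly can look like. It is illustrative only: the tool names (write_email, start_thread, mark_as_read, find_available_times, schedule_meeting) are assumed stand-ins for the Gmail and Calendar helpers, and the exact create_deep_agent parameters may differ by version, so check the deepagents package docs for the real signature.

```python
# Rough sketch of agent.py; tool and parameter names are assumptions.
from deepagents import create_deep_agent

# Assumed local module wrapping the Gmail and Calendar APIs.
from tools import (
    write_email,
    start_thread,
    mark_as_read,
    find_available_times,
    schedule_meeting,
)

SYSTEM_PROMPT = """You are my personal email assistant.
Background about me: ...
Criteria and rules for handling incoming emails: ...
"""

# Sub agent focused specifically on calendar interactions.
meeting_scheduler = {
    "name": "meeting-scheduler",
    "description": "Finds available times on my calendar and books meetings.",
    "prompt": "You schedule meetings. Check my availability before booking.",
    "tools": [find_available_times, schedule_meeting],
}

agent = create_deep_agent(
    tools=[write_email, start_thread, mark_as_read],
    system_prompt=SYSTEM_PROMPT,  # may be called `instructions` in some versions
    subagents=[meeting_scheduler],
)
```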
What I really want to emphasize here is that it's quick and easy to put together a deep agent. But this is really just the first step. The question now becomes: how do we make sure the deep agent actually works? How do we make sure it's doing exactly what I want in different scenarios? And how can I improve this deep agent over time?
The natural next thing to do is just to
run the agent. So, let's do it. We have
an example email here coming from a
friend of mine, Oliver Queen. Oliver is
emailing me about wanting to talk about
deep agents, and he suggested that we
meet at 8 a.m. next Monday. We can take
the agent we just created and invoke it
on a single input message. This message
just says, "Hey, an email came in.
Handle it to the best of your ability."
The agent will take this input email and
call whatever tools it needs to handle
it.
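As a sketch of what that invocation looks like (the message wording and email text here are paraphrased, and the import path is an assumption about how agent.py is laid out):

```python
# Invoke the agent once on a single "handle this email" message.
from agent import agent  # the compiled deep agent defined above

incoming_email = (
    "From: Oliver Queen\n"
    "Subject: Deep agents\n"
    "Want to chat about deep agents? How about 8 a.m. next Monday?"
)

result = agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": (
                    "An email came in. Handle it to the best of your ability.\n\n"
                    + incoming_email
                ),
            }
        ]
    }
)

# Print the final state output of the agent.
print(result)
```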
Now, we do have a print statement here
that'll print out the final state output
of the agent. Let's go ahead and wait
for it to finish.
We can see the printed output of the
agent here, and it's super hard to read.
It's really hard for me as a human to
parse through this.
As an alternative to this print debugging, I've actually set this agent up to trace to LangSmith. All I did was set my LangSmith API key as an environment variable and set LangSmith tracing to true. And so now, when I ran this code, all of the agent's decisions and outputs got logged to LangSmith.
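Enabling tracing really is just a couple of environment variables; they're set from Python here for illustration, but exporting them in the shell works the same way.

```python
import os

# Turn on LangSmith tracing and authenticate; no changes to agent.py needed.
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "<your-langsmith-api-key>"
# Optional: send runs to a named tracing project.
os.environ["LANGSMITH_PROJECT"] = "personal-assistant"
```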
Let's go take a look at the trace.
LangSmith is our observability and eval platform, and one of the first things you can do with it is set up tracing. Clicking into our most recent trace here, we can see exactly the input that came in, that same email thread from Oliver. We can also now see all of the actions that the agent is taking on it. One important thing to note about deep agents is that they have some built-in tools, including the ability to call sub agents through this task tool. The sub agent can then call its own tools, so this becomes a multi-layered, pretty long-running process.
This view is already way better than the printed text dump of my agent's outputs, but as a human, it's still going to take me a decent amount of time to click through each of these steps and see exactly what's going on. And that's where Polly comes in. Polly the parrot is a new tool in LangSmith that can read all the different runs here and lets me chat with this trace. I'm just going to use one of the default prompts: I'll ask it to summarize this trace, and we can watch for a second as Polly tackles it.
You can see that Polly is reading through exactly what went on, listing out the runs, and eventually, here we go, Polly has given us a nice summary. We can see that we received an email request from Oliver, we delegated it to the meeting scheduler sub agent, we checked my calendar availability for Monday and saw that the time was free, and so we sent an email response confirming the availability and accepting Monday as the time slot.
Cool. But to throw a wrench in this, let's say I actually didn't want the agent to do this. Let's say, truthfully, that I don't love waking up early and I don't actually want to take any meetings before, say, 9:00 a.m. Eastern time. So, how do I go about fixing this? If I navigate back to VS Code, there's a tool we have called LangSmith Fetch. It's just a package that you can install, and when you run LangSmith Fetch, you need to set its config to pull traces from a particular project. So I'm going to go back to LangSmith real quick, grab the ID of our personal assistant tracing project, and paste that in when I set the LangSmith Fetch config.
Now, when I run LangSmith Fetch, it's just going to pull the most recent trace. In a moment, we can see the full message history: the email that came in, what our assistant thought to do, the tool it called, the analysis it did, the confirmation that the time worked, the response it sent, and then marking the email as read.
Now, you might be thinking: why is this even useful? We just saw a much more detailed version of this in the LangSmith UI. The benefit is that by exposing this in the terminal, we now allow our coding agents to actually use this information. The coding agents can call LangSmith Fetch themselves and see what happened in the most recent traces. I'm going to show you how you can do this with our deep agent CLI, which kicks off our coding agent. If you haven't tried the CLI before, it's pretty similar to Claude Code, so the experience should be pretty familiar.
I'm just going to ask it: hey, can you tell me what happened in the most recent trace? Summarize it for me. The agent starts thinking, and I've actually given it some instructions already in a markdown file about how to approach this and that it can use LangSmith Fetch. So we can see the agent now wants to run LangSmith Fetch, and I'm going to let it. LangSmith Fetch is now executing in the background, and in a moment we get a nice summary from the agent: here's the email that triggered the run, the meeting was accepted and confirmed. Great.
So now, instead of manually changing the prompt myself, I can also just talk to the coding agent. I'm going to say, "Hey, I like sleeping in. I don't actually want to take early meetings, so can you make sure that in the future the agent only accepts meetings after 9:00 a.m.? Also, while you're at it, write a test to make sure that it remembers this successfully." We've given it a few tasks here, but because I just ran LangSmith Fetch, we have that trace information in the context window.
And so the agent is going to work for a
while. It's going to list out files.
It's going to read some of the files I
have locally. And then it's going to
start making some edits.
So here we go. The agent has come up
with a recommendation. It wants to edit
agent.py. Specifically, it wants to edit
the prompt. I'm going to accept this. It
looks good to me.
Now, it's going to read our test_assistant.py file and add a test. It comes up with a nice test, very similar to the example I just had, and so I'm going to approve adding it to test_assistant.py.
Now, as a quick sidebar, I want to talk about why I've chosen to write these tests in pytest. Deep agents can handle a variety of tasks, and the success of handling those tasks can be measured in a lot of different ways. Writing in pytest (or Vitest for JavaScript) gives us maximal flexibility. In these tests, I can assert that a certain tool call was made in a certain scenario. I can also, in the same test, use an LLM-as-judge to check that my final result, my final email in this case, followed some specific criteria. I've really found that when writing tests for deep agents, it's nice to have the flexibility to assert whatever you're specifically looking for given a particular input, and both Vitest and pytest afford a lot of flexibility here.
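To make that concrete, here is a minimal sketch of what such a test might look like. The agent import, the schedule_meeting tool name, and the commented-out judge helper are illustrative assumptions rather than the exact test the coding agent wrote.

```python
# test_assistant.py (sketch): assert on tool calls, then optionally judge the text.
from agent import agent  # the compiled deep agent

EARLY_MEETING_EMAIL = (
    "From: Oliver Queen\n"
    "Subject: Deep agents chat\n"
    "Can we meet at 8 a.m. next Monday?"
)


def test_early_meeting_is_not_accepted():
    result = agent.invoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "An email came in. Handle it.\n\n" + EARLY_MEETING_EMAIL,
                }
            ]
        }
    )
    messages = result["messages"]

    # Hard assertion: no calendar booking should happen for an 8 a.m. slot.
    tool_calls = [tc for m in messages for tc in (getattr(m, "tool_calls", None) or [])]
    booked = [tc for tc in tool_calls if tc["name"] == "schedule_meeting"]  # assumed tool name
    assert not any("08:00" in str(tc["args"]) for tc in booked)

    # Soft assertion: an LLM-as-judge check on the outgoing email text, e.g.
    # "politely proposes a time at or after 9 a.m. Eastern" (judge helper is hypothetical).
    # final_email = messages[-1].content
    # assert judge(final_email, criteria="Declines or reschedules meetings before 9 a.m. Eastern")
```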
So now let's run the test. The beauty of this is that in running the test, we're also logging more information to LangSmith. So after we run it, whether or not it passes, we can ask the agent to pull the resulting trace with LangSmith Fetch, see what just happened, and see if it was acceptable. And if the test fails, we now have a programmatic agentic loop where the agent has a clear reward function: it can keep iterating on the prompt until the test passes.
And so, just to rehash the different things we covered in this video: deep agents are super easy to get started with. With just a prompt and a few tools, you can put together a pretty powerful agent. But that is really just the first step. From there, it's really important to figure out whether your agent is actually doing what you want it to, and the best way to do that is by setting up tracing to LangSmith so you have full visibility. In LangSmith, you can talk to Polly and ask it questions about what your agent actually did during execution. This is a quick way to speed things up; I've seen deep agents go for hundreds of turns and take several minutes, and that can be a pain to walk through yourself.
Then we have LangSmith Fetch, which works in the terminal to pull trace information in, and which is really powerful when you make it accessible to coding agents. These tools are all intended to make it really easy for us as developers to work with AI while building deep agents. You can try out Polly and LangSmith Fetch today. Let me know how it works for you. Peace.
In this video, we walk through how to build and observe a deep agent using LangSmith. We'll build a simple email assistant that reads incoming emails and decides how to handle them (triage, respond, or take action) using a prompt-driven approach.

You'll learn:
• How to define a deep agent in a single file
• Why most agent complexity lives in the system prompt (not the architecture)
• How to encode rules, context, and decision criteria into prompts
• How to use LangSmith to observe, validate, and debug agent behavior

This walkthrough is useful if you're building longer-running agents and want confidence that your agent is doing the right thing at each step.

- Learn more about LangSmith: https://docs.langchain.com/langsmith
- Learn more about debugging deep agents: https://blog.langchain.com./debugging-deep-agents-with-langsmith/
- Learn more about agent engineering: https://blog.langchain.com/agent-engineering-a-new-discipline/