Hi everyone. Thank you so much for
joining us for our next session of our
Python and AI levelup series on agents.
My name is Anna. I'll be your producer
for this session. I'm an event planner
for Reactor joining you from Redmond,
Washington.
Before we start, I do have some quick
housekeeping. Please take a moment to
read our code of conduct.
We seek to provide a respectful
environment for both our audience and
presenters.
While we absolutely encourage engagement
in the chat, we ask that you please be
mindful of your commentary, remain
professional and on topic.
Keep an eye on that chat. We'll be
dropping helpful links and checking for
questions for our presenter and
moderator to answer.
Our session is being recorded. It will
be available to view on demand right
here on the Reactor channel.
With that, I'd love to turn it over to
our speaker for today, Pamela. Thank you
so much for joining.
>> Hello. Hello everyone. Welcome to our
Python AI series where we are learning
all the fundamentals of building with
generative AI in Python. So, we've
already had seven sessions in this
series. If you missed any of those
sessions, they were all recorded and we
have a page that lists all of the
recordings and the slides and the code
samples so that you can catch up at any
time. Those should be available forever, as long as the internet lasts. Now, today we are talking about agents, and then tomorrow we're talking about MCP. So those are our final two topics in the series. They're both pretty exciting topics; they're new topics, really just from the last year. So we're pumped to be talking about them for our final part of the series.
So now let's talk about agents. If you do want to download the slides, they're available from this URL, and you can follow along. You can share the slides, you can give this talk yourselves to your own communities, whatever you want to do with them.
We're going to be talking about a few different AI agent frameworks today. We're going to be focusing on Microsoft agent framework, but we're also going to be showing some LangChain. Those are two of the biggest frameworks we want to talk about, but really we're using them as examples of how to build agents in general. So even if you're using a different framework, today's topic should be helpful for you. We'll look at different agent architectures: multiple tools, multiple sub-agents, workflows, that sort of thing; how we can add humans in the loop; how we can do planning and memory. We're going to try and touch a little bit on all these topics. Now, we only have an hour, so we can't go super deep into them. Really, we'd need a whole series just about agents. But we want to give you an overall understanding of the space for when you're building agents.
If you do want to follow along with the code samples today, this is the repo we're using. This is a new repo that we haven't used before in the series. We'll put this link in the chat, and then you can click through on it. So I'll go ahead and show: I'm going to click through, and that'll go over to GitHub here. So here's the actual GitHub repo. On that GitHub repo, you're going to click on the Code button, and you'll see two options: local or Codespaces. You can set it up locally if you want; if you do, you need to read the README to see how to configure your environment variables. What's easier is to set it up in Codespaces, where everything is set up for you and you don't have to configure anything about the environment. What this does is open up a codespace. A codespace is a VS Code inside your browser that has all of the code from this project set up, with Python installed and the packages installed, so that once everything loads, we should just be able to run the examples from the repo. You can give that a few minutes to set up, and I'll continue talking while it sets up.
So first of all, we have to define "agent", and this has actually been very hard to define over the last year. For the last year, we've all known that agents were a big thing, but we've had a hard time deciding exactly what an agent was, and we've had a few definitions along the way. But I'm happy to report that we now have an official definition, thanks to our good friend in the Python community, Simon Willison, who has been collecting agent definitions over the past few years. And this is the current definition that we're going with: an LLM agent runs tools in a loop to achieve a goal.
And I like this definition, because what you'll see is that this is pretty much how the Python agent frameworks think of agents as well: we'll construct an agent, we'll give it a goal, and we'll give it tools. That is the essential agent that we're going to see over and over throughout all of our Python examples. So yes, this is the definition we're going with today: an agent runs tools in a loop to achieve some goal. Some of these tools might be research tools. Some of the tools might actually take action and change things about the system. Some of the tools could sense things in an environment. The tools could be all kinds of things. But the key is that we're giving the agent the ability to decide what tools it's going to call: figure out the most appropriate tools, figure out what parameters it's going to call those tools with, and then keep on iterating until it's used all the tools it thinks are necessary and has reached the goal.
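To make the "tools in a loop" idea concrete, here's a framework-free sketch of that loop. The `fake_llm` function is entirely made up; it stands in for a real chat-completion call where the model decides which tool to invoke and with what arguments.

```python
# A framework-free sketch of "an agent runs tools in a loop".
# fake_llm is a stand-in for a real chat-completion call; in a real
# agent, the LLM decides which tool to call and with what arguments.

def get_weather(city: str) -> str:
    """Return a (hard-coded) weather report for a city."""
    return f"It is rainy in {city}."

TOOLS = {"get_weather": get_weather}

def fake_llm(messages: list[dict]) -> dict:
    """Pretend LLM: request a tool call first, then answer using its result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_weather", "args": {"city": "San Francisco"}}
    tool_result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"answer": f"Based on the forecast: {tool_result} Bring an umbrella!"}

def run_agent(user_question: str) -> str:
    messages = [{"role": "user", "content": user_question}]
    while True:  # the loop: keep calling tools until the LLM produces an answer
        decision = fake_llm(messages)
        if "answer" in decision:
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})

print(run_agent("How's the weather in SF?"))
```

The real frameworks we'll look at hide this loop behind a single call, but this is the shape of what they're doing underneath.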
Now, there's often additional features
associated with agents like memory and
planning and human in the loop and stuff
like that. So, we'll look at that as
well. Um, but I'll consider those to be,
you know, bonus features.
So how do we build an agent using Python? Here are some of the top agent frameworks that we can take a look at. The one we are focusing on today is agent framework. This is a framework from Microsoft. It is really new; it just came out in the last month, and it is basically the successor to AutoGen and Semantic Kernel. If any of you used AutoGen or Semantic Kernel in the past: basically, Microsoft took everything that people liked about those two frameworks and combined them into a single framework that has everything, and that also has really good support for Azure AI models as well as other models too. So it's a framework with support for quite a few things. It's a really solid framework, and I found it to be pretty easy to use over the past month; it hasn't existed for very long, but I've been able to do a lot with it in its short existence.
We are also going to look a bit at LangChain. LangChain v1 actually just dropped today officially; it's been in alpha for the last few months, and today is the day it's officially no longer alpha. LangChain has been around for quite a while. It was basically the first agent framework; it came out years ago, and it's gone through a lot of iterations and multiple versions. But they had so many users that they really learned what people wanted from a framework, and what they needed to make a framework that would be usable and simple for the largest number of scenarios. So that's what LangChain v1 is: a new version of LangChain that is agent-centric from the start and designed to make it easy to do what most people want to do with agents. There's also Pydantic AI, from the Pydantic team, which is specifically designed to work well if you care about type safety and about observability standards like OpenTelemetry; they're big on adhering to open standards. And then there's OpenAI Agents, which comes from OpenAI. So
those are some of the Python AI frameworks. There are so many that I can't fit them all on this slide or in the repo today, but these are all good ones to know about, and if you check the repo, you will find examples from all of these frameworks. So if you're interested in one, you can browse through the repo to see how you can use that framework.
So first we're going to look at the simplest agent architecture, which is an agent with a single tool. The way this works is that we have an LLM. The LLM receives a request from the user. The LLM calls the tool, gets back a response, and then, based off that tool response and the original user's request, says: okay, I know how to answer your question, here's the response. So for example, if the user says "how's the weather in SF?" and the LLM has this tool get_weather, and that tool accepts the city name as an argument, the LLM is going to say: okay, I should call get_weather with "San Francisco" as the city name. Then it gets back some information and gives a nice, user-friendly response to the user. So let's actually try that out. Going back over to our codespace here, I'm going to open up the examples folder and go to agent_framework_tool.py.
And I'll make that bigger, and let's see what else I can do to make it more visible. Okay, all right. So here we've got the code for it. At the top we have a bunch of imports from the agent framework.
Then we set up some logging, and then we configure the OpenAI client. All of the examples have the ability to connect to Azure OpenAI; to GitHub Models, which is what we're using in Codespaces because it's free, so it should just work; to Ollama, if you want to use local models, though those would have to have really good tool-calling support, and many of them will not work, because tool calling is hard for local models; or just to openai.com. So we configure our connection to an LLM provider, then we
define our tools. In this case, we're only going to have a single tool, which is get_weather. This get_weather function takes in a city, which is a string argument, and we also give a description of the argument: we say this string is the city name, spelled out fully. Generally, when you define functions that are used by an agent framework, you want to give as much information about the arguments of that function as possible, because all of this is actually going to get sent to the LLM. So the LLM is going to say: okay, get_weather is a function, it has a city parameter, that city parameter has this description, and here is the description of the function. All of this gets sent to the LLM, so you want to put as much information as possible in your function signature.
Then we have just a janky implementation here; in reality, we would hook this up to an API to get back the actual weather for San Francisco.
And then we construct the agent here. Let me just put this on a few lines so we can see it. We create the agent and pass in our client that's connecting to the model provider. We give it the overall instructions; this is the system prompt. And then we give it a list of tools. In this case, we're only giving it a single tool. So now that that's all set up, we can finally run the agent with the user's question. We're just going to say: okay, how is the weather today in SF? So we run that, and we're going to watch to see what it decides to do.
Is it going to call our function? So
yes. So what we can see is there's a logging statement here that says getting weather for San Francisco. So it called that function with the string "San Francisco". The LLM decided to turn this lowercase "SF" into the expanded string "San Francisco". That's the kind of thing LLMs do when we give them tools: they decide on the arguments, so they can do some cleaning up along the way. Now, when they're doing that cleaning up, they're making a lot of assumptions, right? They assume that SF means San Francisco, California, which works in my case; that is what I meant. But depending on the context of the user and where they are, maybe there are SFs in Europe or other places; I know there are actually San Franciscos in Mexico. So really, you should have a get_weather function that takes more than just a city: it should take latitude, longitude, country, all that. You do want to think carefully about what arguments your tools expect, and whether your agents can accurately call them with the right arguments. But in this case it did a good job, based off the tools we gave it here.
So what's happening behind the scenes is that when this call goes out, the agent framework is turning this function signature into a JSON schema. For those of you who joined us yesterday for tool calling, what we saw is that in tool calling we have to give a JSON schema of the function. So you can imagine that behind the scenes, the agent framework is taking this function signature, inspecting it, and rewriting it into a JSON schema, and that's what gets sent to the LLM.
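Here's a sketch of what that inspection step could look like. This is not the agent framework's actual implementation, just an illustration using the standard library: it reads an annotated Python function and rewrites it as the kind of JSON schema that tool calling expects.

```python
# A sketch of what an agent framework plausibly does behind the scenes:
# inspect a Python function's signature and docstring and rewrite it as
# the kind of JSON schema that tool calling expects. Real frameworks are
# more thorough; this just illustrates the idea.
import inspect
import json
from typing import Annotated, get_args, get_origin, get_type_hints

def get_weather(
    city: Annotated[str, "The city name, spelled out fully."],
) -> str:
    """Get the current weather for a city."""
    return f"Sunny in {city}"

def function_to_tool_schema(fn) -> dict:
    hints = get_type_hints(fn, include_extras=True)
    properties = {}
    for name in inspect.signature(fn).parameters:
        hint = hints.get(name, str)
        description = ""
        if get_origin(hint) is Annotated:  # pull the description out of Annotated
            hint, *extras = get_args(hint)
            description = extras[0] if extras else ""
        json_type = {str: "string", int: "integer", float: "number", bool: "boolean"}.get(hint, "string")
        properties[name] = {"type": json_type, "description": description}
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": list(properties)},
    }

print(json.dumps(function_to_tool_schema(get_weather), indent=2))
```

This is also why the function name, docstring, and argument descriptions matter so much: they're the only things the LLM ever sees about your tool.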
All right. So there we had the single-tool example. That's just to show us how this works.
Now, what we're usually going to use is multiple tools. That's much more interesting, because there the LLM has to pick from the tools: whether it's going to use any tools at all, which of those tools to use, and what order to call the tools in. Because sometimes the order does actually matter, since one tool may have information that the other tools need.
So imagine we have this one here: the LLM has get_weather, get_activities, and get_current_date. If get_weather and get_activities both need to know the current date, then we have to call get_current_date first to find out the date we're going to pass into those functions. So let's see what the agent decides to do.
So we'll open up agent_framework_tools.py, and here we've got our imports as usual. And now we've got the get_weather function.
We have get_activities. This one takes city and date, and you can see that with the date, we also specify what format we want it in. So get_activities needs to know both city and date. And then get_current_date doesn't take any arguments; it just returns the date.
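To give a feel for the shape of those three tools, here's a plain-Python sketch. The function bodies are made up stand-ins; the point is the signatures, which force the ordering: get_activities needs a date, so the agent has to call get_current_date before it can call get_activities.

```python
# Plain-Python sketches of the three tools described above. The bodies are
# made up; the point is the signatures: get_activities needs a date, so the
# agent has to call get_current_date before it can call get_activities.
from datetime import date, timedelta
from typing import Annotated

def get_current_date() -> str:
    """Return today's date in YYYY-MM-DD format."""
    return date.today().isoformat()

def get_weather(
    city: Annotated[str, "The city name, spelled out fully."],
) -> str:
    """Return a (hard-coded) weather forecast for the city."""
    return f"Rain expected in {city} this weekend."

def get_activities(
    city: Annotated[str, "The city name, spelled out fully."],
    activity_date: Annotated[str, "The date of interest, in YYYY-MM-DD format."],
) -> list[str]:
    """Return (hard-coded) activity suggestions for a city on a date."""
    return [f"Visit a museum in {city} on {activity_date}"]

# The agent would call get_current_date first, work out the weekend dates,
# then call get_activities once per day -- e.g. twice for a two-day weekend:
saturday = date.today() + timedelta(days=(5 - date.today().weekday()) % 7)
for day in (saturday, saturday + timedelta(days=1)):
    print(get_activities("San Francisco", day.isoformat()))
```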
So then we say: okay, you help users plan their weekends and choose the best activities for the weather, and include the date of the weekend in the response. And we give it get_weather, get_activities, and get_current_date. Now, it doesn't matter what order we give the list in here, because the LLM is just going to decide, based off the user's question, what order it's going to call those tools in. So then I asked: okay, what can I do this weekend in San Francisco?
Let me change that to SF. Um, so it's
going to go and look at that query, look
at its tools, and what you can see is
that it immediately called get current
date. Uh, it realized it needed to know
that before it could really do anything
else. So, it got the current date, then
it got the weather, then it got the
activities, and it called get_activities twice, because a weekend is two days; it realized it had to check both October 25th and 26th, which is correct. Let me just double-check: today's date is October 22nd. Yeah, October 25th and 26th. So it looked at today's date and realized that the weekend is the 25th and 26th, so it did that calculation correctly. It called get_activities twice, once for Saturday and once for Sunday.
So that's perfect, right? It did everything in the right order, so it could get the right information to answer this question.
And then we get back the final response that actually answers the question; it says since it's going to be raining, you should visit a museum. All right, so that's working well.
I do see a question about why we're using async def. These are async functions, also called coroutines in Python, and you're going to see them quite a bit when using agent frameworks. A lot of agent frameworks are what we call async-first or even async-only, because you're generally going to get much better performance if you use async when you're calling LLMs. I have a blog post about that we can link to, so you can read more: "Why your LLM-powered app needs concurrency." As for agent framework, I would say it's an async-only, async-first framework, so we're generally going to be using a lot of coroutines when we're using it. That's a good thing to check: some frameworks will support both async and non-async variants, and some frameworks just always assume you're going to use async.
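To illustrate why async matters here, this sketch fires off three independent "LLM calls" concurrently with asyncio.gather. The asyncio.sleep calls are stand-ins for slow network requests; the three 0.2-second calls overlap, so the total is about 0.2 seconds rather than 0.6.

```python
# A sketch of why agent frameworks lean on async: independent calls (to an
# LLM or to tools) can overlap instead of running back to back. The
# asyncio.sleep calls stand in for network-bound LLM requests.
import asyncio
import time

async def fake_llm_call(name: str) -> str:
    await asyncio.sleep(0.2)  # pretend this is a slow network request
    return f"response for {name}"

async def main() -> list[str]:
    start = time.perf_counter()
    # Run three "LLM calls" concurrently; total is ~0.2s, not ~0.6s.
    results = await asyncio.gather(
        fake_llm_call("weather"),
        fake_llm_call("activities"),
        fake_llm_call("current date"),
    )
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
    return results

print(asyncio.run(main()))
```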
All right.
So now let's look at a more advanced architecture. This is what you could call the supervisor architecture. In this one, you have a supervisor agent, and it decides which specialist agents should handle a task.
This is a good one if you have a more general agent that people are using for lots of different things. Basically, ChatGPT is like a supervisor, right? There are lots of different things you can do with ChatGPT, and it can delegate to a sub-agent. So what we're going to do is say: here's a supervisor agent, here's this generic task coming in, let's delegate to a more specialized agent. So in
a more specialized agent right uh so in
this case we have a supervisor agent
which uh can delegate to a weekend
planning agent or a meal planning
planning agent and that uh the weekend
planning agent has tools to search
activities and check weather just like
the ones we've been doing. The meal
planning agent has very different tools
for check fridge and search recipes.
Now, you could just make a single agent that has those four tools, and that could work. But generally, one of the issues you'll see when you're designing an agent application is that if you throw too many tools at an LLM, it can have a hard time deciding which ones to use. So sometimes people will use this architecture of making sub-agents, so that the LLM just has to decide: is this a weekend planning task or is it a meal planning task? If it can figure that out, then it doesn't have to worry about the exact tools. This works well if the tasks are distinct enough. It's not going to work if there's a lot of overlap between the tasks and you need to mix and match the tools. But if there are very distinct tasks, then you can use this sort of architecture.
So let's take a look at how we actually implement this in agent framework. We're going to go to agent_framework_supervisor.py, and for this one, let me scroll down to the bottom here to show how we implemented it. Okay, so here's
the supervisor agent. It says: you're a supervisor managing two specialist agents, a weekend planning agent and a meal planning agent; break down the user's request, decide which specialist to call (maybe both), and then synthesize the final answer. And then we just give it the agents as tools. This is a common way to implement this: actually just wrap the sub-agents as tools, because the LLM already has handling for tools, right? And so what you can see is that this plan_meal tool takes in a query, and then it just calls an agent and gives back the response from that agent. Then we have the actual meal agent, and that's got its own tools, find_recipes and check_fridge; those are its tools. And then we have the other tool that calls the weekend agent and returns a response from the weekend agent,
and here's the definition of the weekend agent, which has the three tools that we've seen before. So that's how we're going to implement the supervisor: by wrapping those agents up as tools.
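Here's a framework-free sketch of that agents-as-tools pattern. The agent internals are stubbed out (a real supervisor would let an LLM pick the tool); the names plan_meal and plan_weekend mirror the tools described above.

```python
# A framework-free sketch of the "agents as tools" pattern: each specialist
# agent is stubbed out, and the supervisor sees it only as a plain tool
# function that takes a query and returns a string.

def meal_agent(query: str) -> str:
    """Stub for the meal planning agent (would call an LLM with its own tools)."""
    return f"Meal plan for: {query}"

def weekend_agent(query: str) -> str:
    """Stub for the weekend planning agent."""
    return f"Weekend plan for: {query}"

# Wrap the sub-agents as tools, exactly as described above.
def plan_meal(query: str) -> str:
    """Tool: delegate a meal-planning query to the meal agent."""
    return meal_agent(query)

def plan_weekend(query: str) -> str:
    """Tool: delegate a weekend-planning query to the weekend agent."""
    return weekend_agent(query)

SUPERVISOR_TOOLS = {"plan_meal": plan_meal, "plan_weekend": plan_weekend}

def supervisor(query: str) -> str:
    """Stub supervisor: a real one would let an LLM pick the tool."""
    tool = "plan_meal" if "dinner" in query or "pasta" in query else "plan_weekend"
    return SUPERVISOR_TOOLS[tool](query)

print(supervisor("My kids want pasta for dinner"))
print(supervisor("What can I do this weekend in SF?"))
```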
And this might seem kind of weird at first: why do we have to wrap them up as tools? But it turns out it's just helpful to think of everything in terms of tools when you're coming up with the architecture, because that's the way the LLM is going to think of it. LLMs only really know about tools. So if we can figure out how to map our architectures onto tools, then we can make anything happen. So here I said: my kids want pasta for dinner.
So it said: okay, we're going to use the plan_meal tool, which calls the meal planning agent. That looked for recipes and checked the fridge, and then we got back the responses.
Now, we could change it to something else, like: what the heck do I do with kids this weekend in SF? This is also my parenting-helper agent here; these are the two main things that I need to do. All right, so this time we got the plan_weekend tool invoked. That sub-agent got the date, it got
the weather, it got the activities and
we get the suggestion back from the
agent here. So I see a question: does the supervisor agent consult an LLM in the background? Well, the supervisor agent is an LLM, right? If we look at it here, we can see that it actually is: it's got its system prompt here, and it's making a call to the LLM with this system prompt and these two tools, and that LLM is suggesting, oh, call this one or call this one.
And we could even try, let's just go hard, and add "and what should I make for dinner tonight?" All right, let's see. We're really keen on getting some help here, getting all of our questions answered at once, right?
So it decided to invoke both the tools, and I think it actually decided to do it in parallel, because it's invoking both the tools, and we got back all the responses: weekend activities and dinner suggestions. So this can work really well for when you have these distinct tasks which can be thought of as sub-agents,
and this allows you to get a lot fancier with the tools and the sub-agents without worrying that you're distracting the main LLM. This really helps with keeping the main LLM focused on distinguishing between the overall tasks, while giving you the flexibility to add all kinds of tools to the sub-agents.
All right. So I see there are a lot of questions following up on this one, like: can we pass a document defining all the agents? I mean, ultimately, it's just whatever you're putting in the LLM call. So you can have a much longer system prompt, right? I do know some people who will actually describe, for each of the agents, what tools they have, so they'll list all the tools from each agent, and you can take that approach if you find it helpful. It's all about what you're going to put in your system prompt and what you're going to put in your tool description, because, remember, this is also going to get sent to the LLM. You can also do a much longer function description, and that will get sent to the LLM too. So what's going to get sent to the LLM is the system prompt and the function signatures. And so you have to decide
what you're going to put into that to
really help it make a decision. You can start simple, then evaluate and see where it's getting confused, and figure out where you need to add more specificity in order to get a better response.
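To make "what gets sent to the LLM" concrete, here's an illustrative sketch of the request shape, roughly in the OpenAI-style chat-completions format. The model name and tool are placeholders, and exact field names vary by provider, so treat this as an approximation rather than the framework's literal payload.

```python
# An illustrative sketch of what ultimately goes to the model: the system
# prompt plus the tool schemas. This mirrors the general OpenAI-style
# chat-completions request shape; exact field names vary by provider.
import json

request_payload = {
    "model": "gpt-4o",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You help users plan weekends and meals."},
        {"role": "user", "content": "What can I do this weekend in SF?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "plan_weekend",
                "description": "Delegate a weekend-planning query to the weekend agent.",
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            },
        },
    ],
}

# Everything here counts against the model's context, which is why the
# system prompt and the function descriptions are where you add specificity.
print(json.dumps(request_payload, indent=2))
```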
All right. Now I see a question about LangChain, so this is a good point to bring in the other frameworks. Sorry, these examples were all with agent framework. Let me open up the tools one; this is our classic agent with multiple tools, with agent framework. What I'm going to do is open it up side by side with LangChain, because what I want to show you is that building agents across these frameworks is very similar. So this is a side by
side. So what you can see is that in LangChain v1, I call create_agent. I pass in my model, which is like my client (these are equivalent); I pass in my system prompt, which is the same as instructions; and then I pass in my tools, which is exactly the same list as this one. And then we can look at the tool definitions.
The only difference here is that in LangChain there is an @tool decorator, which helps LangChain know which functions are tools in your program. Everything else is the same there. So what you can see is that the LangChain code and the agent framework code look really similar. Now let me also go ahead and bring up Pydantic AI. Let's see, I've got Pydantic AI here too. So what happens if we look at these side by side? All right, we've got that one, we've got that one, and now here's Pydantic AI. Let me really try and line them up here. Okay. With Pydantic AI, we pass in the model, we pass in the system prompt, and we pass in the tools. Do you see how close these look to each other? There
are some differences, but all of these agent frameworks have the same idea of what an agent is: an agent connects to a model from a model provider, it has a system prompt, and it has tools, and then it's going to call those until it gets a final output. Here's where we call them and get back the final output. So
all the examples that I showed, you can build in your favorite framework, and you're going to see the code is really similar. At least for all the examples I've shown so far, the code is nearly the same across the frameworks, which I think is great, because you can learn one framework and then apply your learnings to the others, and it becomes easier to move between them. Increasingly, at companies, you might actually have different projects that use different frameworks, and it's helpful to be able to understand multiple ones.
And now I get the question, of course: which framework should you use versus the others? I want to be equal-opportunity here; I do actually think they're all great frameworks. Okay, so agent framework comes from Microsoft. So if you are a Microsoft-heavy shop and you're already using lots of Azure things (you're using Azure AI, you're using Azure AI Foundry, you're using the agent service), I think it makes a lot of sense to use agent framework, because it's going to have support for the Azure stuff sooner than most other frameworks; it's from Microsoft, we're in the know, and it's really easy for us to get features in for Azure stuff. Even if you're not a Microsoft-heavy shop, it's still a solid framework; but if you are, I would definitely consider agent framework to be a pretty top choice. LangChain: what's great about LangChain? A huge ecosystem; many, many people using it. They just got a bunch of funding yesterday, so they also have a large engineering team behind it that's adding more features. They have an observability platform that's quite nice, called LangSmith; I actually have it up over here, and we'll see it later. So they've got some great observability there, a good ecosystem, a big team, and LangChain v1 is a nice, solid API. To be honest, I didn't like the previous versions of LangChain, but I do like the new version. So yeah, a solid agent framework, and you've got a big community ecosystem there. And then Pydantic AI: why would you
use that one? That's coming from the Pydantic team. If you're already using a lot of Pydantic, or if you're already using Logfire, and if you particularly want type safety and open standards like OpenTelemetry, then Pydantic AI can be a really good fit. So that's one way of comparing them. Hopefully that gives you some idea of how to decide, but honestly, I think they're all good picks; you could start with any of them and have a good experience. I guess I'll
warn that agent framework is pretty new, so you may need to file some issues if you're looking for examples that you can't find yet; the team is very rapidly building out more examples for it. LangChain v1 itself is quite new. LangChain has been out there for a long time, but you may get a little confused when you're looking for examples of how to use LangChain v1, because that's the new version. So for LangChain v1, go to their documentation first; don't search the web. The new docs are quite good, and you're going to have to start from them, because if you search the web, you're going to find outdated stuff. With Pydantic AI, you can search the web, because they haven't had such big changes.
All right, good. Hopefully that helps. Now, we're going to start to see ways in which things do diverge a bit across the agent frameworks when we get into more sophisticated or complex architectures.
So, let's talk about workflows. Many of us are actually building not just agents but agentic workflows. And I'm going to define this (this is my definition): an agentic workflow is some sort of program, some flow, that involves an agent at some point, and that agent is usually doing either research, decision-making, or answer synthesis. It's something where in the past we might have used some janky regular expressions, but now we have an LLM that can make a better decision for us. When you make agentic workflows, you can have much more powerful workflows, because you have the capability of an LLM. But it does also mean that your workflows are non-deterministic, because you have an LLM, and LLMs are difficult to predict; they are non-deterministic. So remember: any time you add any sort of call to an LLM into a program, you are increasing the non-determinism of the system. You're increasing the risk of the system, the possibility of failure, and so you need to do it really carefully and make sure that this is definitely a place where you want to call an LLM, because you're getting more benefit than drawback.
So, how do we build workflows in agent frameworks? Because it's such a common thing for people to build workflows, for example for background task processing, there is built-in support in many frameworks. In agent framework, they call it workflows, so it's quite easy to find if you search for "workflow", and they use a class called WorkflowBuilder. There are also special builders for common types of workflows, and there is a visualization tool you can use to visualize your workflow, called DevUI, which we'll see today. In LangChain v1, you would actually use LangGraph. LangChain v1 is built on top of LangGraph, which is a package they've had for a while for doing these graph-based flows. So if you wanted to build a workflow, you would potentially want to go straight to LangGraph and make a graph there; I will show an example using LangGraph later today. And then Pydantic AI also has a graph: they call it pydantic-graph, and you can visualize it with some diagrams. So interestingly, all the frameworks like to support the idea of graphs or workflows, and they also often have special ways of visualizing them and making them easier to understand and debug.
So let's look at how we do it in agent framework. We're going to do this workflow here, and this workflow actually uses a lot of LLMs; it's a very language-heavy workflow. We start off with a writer, which writes a first draft of something. Then we have a reviewer, which reviews it and says how good it is; if it's good enough, it goes to the publisher, which just formats it. If it's not good enough, it goes to an editor. So this is what we'd call a conditional branch, and if you think of it as code, it's just an if statement, right? If the reviewer says that the quality is not high enough, it sends it to the editor, the editor fixes it right up, and once it's good enough, it goes to the publisher. So when we build the workflow, we're basically building a graph. We're saying: here's our start node, this node goes to this node, and this node conditionally goes to either this one or this one, and here's the function used to decide which branch it goes to, and here's where it finally ends up.
So you could write pretty similar code with the graph frameworks from the other frameworks as well. All right, so let's go and check this out: the agent framework workflow example. So now we're bringing in the WorkflowBuilder class,
and uh let's see we're using some
structured output. It's very common to
use structured outputs within workflows
as well. That's a nice thing to use. Um
we've got functions here that are
deciding you know which branch to go to
like okay does it need editing uh or is
it approved?
And then here is our main agent: the writer agent, which says "you're an excellent writer." We have the reviewer agent, which says you're a reviewer, you're going to review it and return a structured output, and that structured output is going to have score, feedback, and clarity. So this is a common approach: have an LLM with structured output, and use one of the fields that comes back to decide what to do next. If the confidence score is high enough, we're going to do X; if the confidence score is too low, we're going to do Y. That's a really common pattern. Then we have our editor agent, which is going to edit poorly written things, then the publisher to do formatting, and finally the summarizer, which just summarizes everything that happened. All right. And then we build the workflow, and then finally we can run it. We're going to run it using the DevUI tool, which is a development tool for working with workflows. And let's see. Oh, did we just upgrade? Okay, I think I just upgraded the agent framework version; let's see if I can get it to work. I updated it an hour ago; they might have just done a release. Okay, so this is true: we do need to install this extra. I put it at the very top here, so I'm going to install this DevUI extra. I didn't have it installed by default in this environment, because it brings in a lot of dependencies. All right, so I'm going to install that.
Okay. All right, let's see if it's going to work here. Okay, so it's opening up with the DevUI. Let me actually get this opening here. Okay, here it is. Let me change that mode. All right, so this is DevUI. This is a development tool that we can use to test our workflows. It's not a user-facing tool; it's not designed to be user-facing. It's just supposed to be a developer-facing tool. That's why it's called DevUI, right? So let's just test it out: "Write a paragraph about the origins of the Día de los Muertos holiday." Okay. All right.
So we send this off to the writer first, and it's going really fast. You can see it popping up at the bottom, and it went all the way through already, so we can look at each point. We can see what the writer said; the writer wrote this paragraph here. Then it went to the reviewer, and the reviewer returned accuracy of 90, clarity of 85, completeness of 80, and gave some feedback,
and that was good enough to go off to the publisher. Let's look at what our conditions were: review score greater than 80, right? It got a review score greater than 80, so it went to the publisher, who just added some formatting, and then to the summarizer here.
Right. Let's see if we can get it to take the other branch: "Write a shoddy paragraph about the origins of Halloween." Okay. I don't know if it's going to evaluate it based on whether it's shoddy enough or... Oh, good. It went to the editor. It worked. Okay.
So it went to the editor, and now we can look and see. This time the reviewer gave it a score of 78, which was under that bar of 80. So it went to the editor, the editor made some edits, and then sent it to the publisher. In this case, this workflow doesn't go back to the reviewer. You could do that as well, but in this case we trust the editor and send it to the publisher. I see a comment that it's similar to n8n, which is a workflow builder. I think it looks similar to n8n, but one thing to keep in mind is that this is just a visualization. I'm not doing any sort of editing here, right? I'm not editing these things and having them turn into code. I'm just running it this way and then seeing the results. So it is a visualization tool for seeing what is happening, which is really nice; it's very visual and helpful for seeing all the events that go off here. But it is not a low-code workflow-building tool, because I can't actually do any sort of editing in this UI.
So I think it's actually more similar to LangGraph Studio. I've got LangGraph Studio open here, and once again, it visualizes the flow of a graph, but you can't actually edit the graph, right?
All right. So that gives you the idea of a workflow and how you might work with them.
Now let's talk a bit about human in the loop. If you are building workflows with agents, you very often want to involve humans in the loop at some point, especially if your agents are taking any sort of action in a system, if they're changing something about the system: writing an email, making a comment, taking actual actions on your behalf. Then you often do want to add a human in the loop, because you really want to trust that the results are high quality. LLMs are impressive with how much they can achieve, but remember, they're non-deterministic; they can hallucinate; they've got a lot of risk to them. So I really recommend finding ways to bring humans into the loop in your agentic workflows, so that you have a lot of confidence in what the agent has suggested it should write, and can thumbs-up or thumbs-down it.
So I have an agent that I built in LangChain with LangGraph where I'm the human in the loop, so I'll demonstrate that one. This is an agent that helps me triage the issues in our most popular open-source samples, which have a lot of stale issues, and a lot of them can be closed. What this workflow does is, first, it looks for the oldest stale issue. Then it researches the issue using a LangChain v1 agent, and that researcher can look at lots of things: it can search issues and PRs, search the repo, anything it needs to look at to figure out whether the issue is still a valid issue.
Then another LLM will decide what the
correct action should be based off the
results of the research. So it'll say,
"Oh, okay. I think we should close the
issue, open the issue, we should assign
the issue, right? It can it can propose
any of those." And at that point, we
send the proposal to a human. That's me. Then I can review it and just say good or bad, or I can actually edit it and say, okay, I like this part of your proposal but not this part, and then we make the change happen. So I've got this running locally here. This is using my GitHub API token, since it has to do things on my behalf. So let me open up a new window so I can show you it working.
Let me see, how would we open a new one here? I'll just restart this agent. Okay, there it is. All right.
So that opens up this LangGraph Studio UI, and this looks really similar to DevUI; there's a lot of similarity here. It's rendering my LangGraph graph. It's a very simple, very linear graph. The important thing is that at a certain point here, there is a human interrupt, where the graph pauses and waits for the human to make a decision. So I'm just going to tell it to go research a new issue; I just click submit.
So the first thing it does is select the
issue. That's just an API call. That's
very fast. So it's got the issue and uh
this is the issue that it's selected
from the repo. And then it starts
researching the issue. The research step takes a fair amount of time, because it can do quite a few calls; it's got a lot of tools it can use here. I'll show you the tools: it can search issues, code, and pull requests; it can get a pull request; it can get an issue; it can fetch a whole file from the repo; and it can list all the files in the repo. I'm using the GitHub API to define all these tools, and I'm giving all of them to the agent to use to research this original issue. And this is the actual issue that it is researching
and deciding what to do about. So in this LangSmith view here, I can watch all the calls go out to the LLM as they're going out, just to get a feel for what it's doing. It's calling search code, search code, fetch file, search code, and it's going to make quite a few calls. I remember one of the questions we had yesterday was about how to do rate limiting on your tool calling, and that was something I had to grapple with for this agent, because sometimes it just wants to research forever, and I'm like: hey, listen, you've gone too far, let's stop researching. So let me
show you how I did that. So I wrote this middleware. This is a concept from LangChain: you can add middleware to your agents. This is my tool-call-limit middleware. It checks the current number of tool calls, checks whether they're over a limit that I've configured, and then actually adds a message that says: no more tool calls can be made, now it's time to summarize what you've learned, following the format. And I remove the tools; I say, hey, you've got no more tools and it's time to summarize. So this is how I managed to limit this agent and cut it off after a certain point: by putting in this middleware that detects the number of tool calls and decides when we've gone over the limit. Here you can actually see all the tools (lots of tools!), and they all use the GitHub API. And let me show you how I configured it. This is where I configured the agent: we have the prompt, the model, the tools, and then the middleware, which limits the number of tool calls.
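The counting-and-cutoff logic can be sketched in plain Python like this. Note this is not LangChain's actual middleware API; the class and method names here are hypothetical, just to illustrate the idea of stripping the tools and injecting a "summarize now" message once the budget is spent:

```python
# Illustrative sketch of a tool-call-limit guard (not LangChain's API).
# A wrapper counts tool calls and, past a configured limit, removes the
# tools from the state and appends a message telling the model to stop
# researching and summarize.

class ToolCallLimiter:
    def __init__(self, limit: int):
        self.limit = limit
        self.calls = 0

    def record_tool_call(self) -> None:
        """Call this each time the agent executes a tool."""
        self.calls += 1

    def before_model(self, state: dict) -> dict:
        """Inspect state before each model call; cut off tools if over limit."""
        if self.calls >= self.limit:
            state = dict(state)
            state["tools"] = []  # no more tools available to the model
            state["messages"] = state["messages"] + [{
                "role": "system",
                "content": "Tool budget exhausted; summarize what you've learned.",
            }]
        return state
```

A real middleware would hook into the framework's agent loop at the same two points: after each tool execution, and before each model invocation.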
All right, so we can check back and see how it's going. It's done its research, and at the end it makes a proposal. That proposal uses structured outputs, and it raises a human interrupt; that's what they call it in LangGraph. Let me see if I can find the human interrupt. Right, so we've raised a human interrupt, which pauses the graph at that point. And then that gets sent to this agent inbox that I have running here. The idea of this agent inbox, which is from LangChain, is to show all the paused workflows, the paused agents that need interaction from a human, right? So I can look at
this one from earlier today and see what it says. It says: yes, we should close the issue. Let's add this label, let's remove the stale label, we're not going to assign it to Copilot, and here is the suggested comment. The comment is the one I often edit, and I'm just going to edit it like this. Okay. So then what I can do is submit, and then it'll actually go out and close the issue. Okay, I'll open the issue next time so you can actually see it close. All right.
So this is the one it just looked at. This is the actual issue, and it says we should not close this issue; it says we should have an explicit note about the reopen-in-container issue. I think we can close it, though. I'm going to close it, as this hasn't come up recently and it's unclear that this is needed. So I'm going to say: close it. And I'm not going to assign it to Copilot, though we could have assigned it to Copilot too. And there you go, there's the comment that went through.
Right? So that's the idea: this agent just posted on my behalf, but I approved what it said. I was okay with what it said, so I let it through. This is one way you can handle having a human in the loop: have a user interface that renders the pending, paused actions from an agent and lets humans decide what to do with them. Now, if you're going to do that, you need some way to persist the agent state. For LangChain, we're persisting thanks to LangSmith; if you're using agent framework, you would need to persist with some other mechanism, so you have to think that through. Hopefully we'll keep coming up with more examples of different ways you can involve humans, and different interfaces for what that could look like, because it depends on whether the human is there while it's happening, or whether you've got a background process that's just processing these and the human at some later point looks through and decides what to do.
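The pause-and-resume pattern underneath all of this can be sketched roughly like so, independent of LangGraph's actual interrupt and checkpointing machinery. Here JSON files stand in for a real database, and every name is illustrative: the agent serializes its pending proposal when it hits the human step, and an inbox-style process later applies the human's decision:

```python
# Minimal sketch of persisting a paused agent run for human review.
# A real system would use LangGraph checkpointing or a database; here
# the "store" is just a directory of JSON files, one per paused run.
import json
import pathlib

def pause_for_human(store: pathlib.Path, run_id: str, proposal: dict) -> None:
    """Persist the paused run so a human can review it later."""
    payload = {"status": "paused", "proposal": proposal}
    (store / f"{run_id}.json").write_text(json.dumps(payload))

def resume_with_decision(store: pathlib.Path, run_id: str, approved: bool) -> dict:
    """Load the paused run, record the human's decision, and return it."""
    path = store / f"{run_id}.json"
    run = json.loads(path.read_text())
    run["status"] = "approved" if approved else "rejected"
    path.write_text(json.dumps(run))
    return run
```

The key point is that the proposal survives between two separate processes: the agent that produced it and the inbox UI where the human acts on it, possibly much later.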
All right, so let's look at a couple of other features that are commonly associated with agents: planning and memory.
So first, let's talk about planning. A lot of times people think of agents specifically in terms of planning. Now, you could argue that the agents we saw so far were basically doing planning, right? Because if you give an agent a goal and a bunch of tools, the LLM itself is basically doing some sort of planning: it realizes, oh, I need to call this tool first and this tool next. But it's not doing explicit planning. So some people will actually add explicit planning steps to the agent, and even give the agent planning tools, to really force it to be more structured about the approach it's taking. This can be particularly helpful if the tasks are really long, because then you're making sure it's going to be thorough in accomplishing the task. There are two general ways of doing planning. One is prompt-based planning, where at the beginning you have an explicit first step that says: okay, you're going to make a plan, here are your tools, tell me exactly what you think the best plan is. And then the LLM can always refer back to that plan that it wrote itself.
The other approach is to actually give it a tool, like a to-dos tool, that says: you can update your to-dos. Come up with all your to-dos, and then update them as you go along. Each to-do would have its content and its status: whether it's done, in progress, or still pending. An example of this is GitHub Copilot agent mode in VS Code, which has an optional to-dos tool that you can enable; you can see it in the GitHub Copilot tools list. When you do that, GitHub Copilot will actually show you a list of to-dos when it's working on a task. This can be particularly helpful for really long tasks, like a big refactor of an application.
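Here's a minimal sketch of what such a to-dos tool could track in Python. This is illustrative only, not GitHub Copilot's or any framework's actual implementation; in practice the agent would call methods like these as tool calls:

```python
# Sketch of a to-dos planning tool an agent could be given: each to-do
# has content and a status, and the agent updates the list as it works.
from dataclasses import dataclass, field

@dataclass
class Todo:
    content: str
    status: str = "pending"   # pending | in-progress | done

@dataclass
class TodoList:
    items: list = field(default_factory=list)

    def add(self, content: str) -> None:
        self.items.append(Todo(content))

    def set_status(self, index: int, status: str) -> None:
        assert status in {"pending", "in-progress", "done"}
        self.items[index].status = status

    def remaining(self) -> list:
        """To-dos the agent still needs to work through."""
        return [t for t in self.items if t.status != "done"]
```

The value of this over prompt-based planning is that the plan lives in structured state the agent (and you) can inspect, rather than buried somewhere in the message history.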
So what does this look like in Python? As an example, I'm demonstrating here the Magentic-One orchestrator from agent framework. This is an orchestration agent that has a task ledger built in, and it's constantly checking that task ledger, deciding whether the tasks are complete and what task to do next. This came from AutoGen, so if you knew AutoGen you might have seen Magentic-One there, and it came out of Microsoft Research; you can read the research about it. But the general idea is that it's going to make up this ledger and decide what to do next. So let's go ahead and run that: agent framework's Magentic-One. Actually, I'm going to run it locally, just because it'll be a little faster with my Azure model here. Let me give us some space. Magentic-One, let's start it running. So this one is going to plan a half-day trip to Costa Rica. It's got an agent that has local tips, it has a language agent that knows how to speak the language of the country, and then it has
a travel summary agent. So what you can
see is that as it goes along, the first thing it does is say: okay, we have this team of agents that can help us, we have these facts, and now we need to make the plan. So this is the plan it comes up with. It says: we're going to consult the local agent, make this itinerary, send it to the travel summary agent, and then maybe optionally engage the language agent to get some Spanish phrases. So this is the plan it comes up with based on the agents that we've passed to it.
Then it starts working with the agents. The orchestrator says: okay, my first agent is the local agent, so I'm going to give it this specific instruction. So this is like a supervisor agent that is very specifically sending tasks to these sub-agents. The local agent says: sure, orchestrator, here's the information you want. Ta-da. And then the orchestrator says: great, we're ready for the next step, and gives a specific instruction to the travel summary agent. And then we get a travel plan here, and it was pretty fast this time; sometimes it comes up with a longer plan. So that's one way you can implement planning in your agents.
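Heavily simplified, the supervisor-with-ledger pattern looks something like this. The sub-agents here are stub functions rather than real LLM agents, and none of this is Magentic-One's actual API; it just shows the loop of picking the next open ledger task and dispatching it to the right sub-agent:

```python
# Stripped-down sketch of a supervisor orchestrator with a task ledger.
# Sub-agents are stub functions standing in for LLM-backed agents.

def local_agent(task: str) -> str:
    return f"local tips for: {task}"

def summary_agent(task: str) -> str:
    return f"summary of: {task}"

def orchestrate(ledger: list, agents: dict) -> list:
    """Work through the ledger: dispatch each open task to its agent."""
    results = []
    while any(not t["done"] for t in ledger):
        task = next(t for t in ledger if not t["done"])  # next open task
        results.append(agents[task["agent"]](task["task"]))
        task["done"] = True
    return results
```

In the real orchestrator, an LLM builds the ledger, re-checks it after each step, and can revise the plan; the sketch fixes the ledger up front to keep the control flow visible.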
Now I want to talk briefly about memory. When we talk about memory, people mean many different things. I like the breakdown they have of this in the LangChain docs, and I link to it here if you want to read how they talk about it. They contrast short-term memory with long-term memory. Short-term memory you can think of as the conversation itself: anything that happens in the conversation, like messages, tool calls, or any state you want to update in the conversation. That's something you can basically pass through an agent; by default, the agent frameworks are going to remember messages and tool calls. But if you wanted to add anything additional, usually there's some way to weave something through a conversation.
Now, the interesting thing about short-term memory is that if a conversation gets quite long, it can overwhelm the context window of the LLM, and then you may need to truncate the messages or summarize. The more common technique these days is summarization, to make sure you don't lose anything really important. I used to do truncation, but then realized I was probably missing some important things at the beginning or end. So the most common technique is basically to add a middleware that detects when we're over the context window, summarizes, and inserts the summary into the message thread. If you're using LangChain v1, there's actually a summarization middleware that will do that for you. With short-term memory, typically you can just keep it entirely in memory; you only need to persist it if you have a UI that lets people view their previous conversations and then pick up where they left off. ChatGPT does let you pick up on a previous conversation, so they do need to persist the conversations to a database so they can restore them. So you need to think about whether you need to persist conversations, like chat history, to a database, and how that's going to work.
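The summarize-instead-of-truncate idea can be sketched like this. The token counting and the summarizer are crude stand-ins; a real version would count tokens properly and call an LLM to produce the summary:

```python
# Sketch of compacting short-term memory: when the message history
# exceeds a budget, replace the older messages with a single summary
# message and keep the most recent ones verbatim.

def count_tokens(messages: list) -> int:
    # Crude approximation: word count stands in for real tokenization.
    return sum(len(m["content"].split()) for m in messages)

def summarize(messages: list) -> str:
    # Stub standing in for an LLM summarization call.
    return f"Summary of {len(messages)} earlier messages."

def compact_history(messages: list, budget: int = 1000, keep_last: int = 4) -> list:
    if count_tokens(messages) <= budget:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    return [{"role": "system", "content": summarize(old)}] + recent
```

This is essentially what a summarization middleware does for you automatically: check the budget before each model call, and splice in the summary when needed.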
The other kind of memory is long-term memory, where you're letting agents learn little nuggets of information about you and then storing them in a memory store, usually tied to a user. For that, you definitely need a persistent database, and you need some way of realizing that there's something important that should be saved, and that's quite tricky to do well. ChatGPT tries to do it, and even lets you inspect the memory that it has; it's pretty interesting to see what it stores there and what it thinks its memory is. But basically, you'd have to make a tool, like a save-memory tool, that the agent could use when it decides there's something relevant to save. That's hard to do well, and there are libraries like mem0 that focus on exactly that. I don't have examples of either of those yet, to be honest; I haven't worked a lot with long-term memory. But that's just to give you an idea of what you want to think about in terms of memory.
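A toy sketch of such a save-memory tool: the agent calls it when it decides a fact is worth keeping, and memories are keyed by user so they carry across conversations. A plain dict stands in for the persistent database you'd actually need; the hard part (deciding *when* to save, which the sketch skips entirely) is what libraries like mem0 specialize in:

```python
# Toy long-term memory: a tool the agent can call to persist a nugget
# about the user, plus a lookup keyed by user id. A dict stands in for
# the real persistent store a production system would require.

memory_store = {}

def save_memory(user_id: str, fact: str) -> str:
    """Tool the agent can call when it finds something worth remembering."""
    memory_store.setdefault(user_id, []).append(fact)
    return f"Saved memory for {user_id}."

def recall_memories(user_id: str) -> list:
    """Fetch stored facts to inject into a new conversation's context."""
    return memory_store.get(user_id, [])
```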
All right, we have one minute left, and we unfortunately cannot go over today, both because there's another live stream right after this one and because I am going to San Francisco right after
this for the PyTorch conference, which will bring together lots of Python and AI people. That'll be exciting; we're sponsoring it. So,
I tried to answer some of the questions as we went along; I know I didn't get to all of them. If you have additional questions, what I recommend is posting them in the resources thread that we link to
here. This is the discussion thread that has all the slides and recordings and code samples. But then you can also ask
questions here. So, if there's a
question I didn't get to that you're
wondering about, post it here and I
will, you know, I'll get an email in my
inbox. Uh, and it's the inbox I actually
watch. So, uh, it's a good place to
post. We do have office hours on Tuesdays in Discord; we already had one yesterday, and we'll keep having them after the series ends, so we will have another office hours next Tuesday. Keep coming to those office hours on Tuesdays, and we can keep talking about everything going on in the space. That's another place to bring questions. And then of course we have our final session tomorrow about MCP. We hope that you'll all join us to dive into MCP. There we're going to see how to write our own MCP servers, and also how to point agents like the ones we built today at MCP servers so that they can consume tools from MCP servers.
So there are lots of ways we can use MCP, and hopefully you can join us for that.
All right, thank you everyone. I do need
to go now. Thank you so much for all the
great comments in the chat, great
questions. As always, we'll see you next
time. Bye.
Thank you all for joining and thanks
again to our speakers.
This session is part of a series. To register for future shows and watch past episodes on demand, you can follow the link on the screen or in the chat.
We're always looking to improve our
sessions and your experience. If you
have any feedback for us, we would love
to hear what you have to say. You can
find that link on the screen or in the
chat. And we'll see you at the next one.
For the penultimate session of our Python + AI series, we're building AI agents! We'll use Python AI agent frameworks like the new agent-framework from Microsoft, and the popular Langchain framework. Our agents will start simple and then ramp up in complexity, demonstrating different architectures like hand-offs, round-robin, supervisor, graphs, and ReAct. If you'd like to follow along with the live examples, make sure you've got a GitHub account. 📌 This session is a part of a series. Learn more here: https://aka.ms/PythonAI/2 Chapters: 00:06 - Welcome and Housekeeping with Anna 01:12 - Series Recap and What’s Ahead 02:08 - What Are AI Agents? 04:47 - Defining Agents: Tools, Goals, and Loops 06:42 - Overview of Python Agent Frameworks 09:26 - Demo: Building a Simple Agent with One Tool 15:04 - Demo: Multi-Tool Agent and Tool Selection Logic 19:19 - Supervisor Agent Architecture Explained 22:35 - Demo: Supervisor Agent with Sub-Agents 26:31 - Comparing Agent Frameworks: Microsoft, LangChain, Pydantic 33:04 - Building Agentic Workflows and Graphs 36:01 - Demo: Workflow with Writer, Reviewer, Editor, and Publisher Agents 42:46 - Human-in-the-Loop with LangChain and GitHub Issues 51:28 - Planning and Memory in Agent Design 59:02 - Final Thoughts and What’s Next: MCP #MicrosoftReactor #learnconnectbuild [eventID:26299]