Hey everyone, thank you for joining us
for the December session of the AI apps
and agent dev days series. My name is
Anna. I'll be your producer for this
session. I'm an event planner for
Reactor joining you from Redmond,
Washington.
Before we start, I do have some quick
housekeeping. Please take a moment to
read our code of conduct.
We seek to provide a respectful
environment for both our audience and
presenters.
While we absolutely encourage engagement
in the chat, we ask that you please be
mindful of your commentary, remain
professional and on topic.
Keep an eye on that chat. We'll be
dropping helpful links and checking for
questions for our presenters to answer
live.
Our session is being recorded. It will
be available to view on demand right
here on the Reactor channel.
With that, I'd love to turn it over to
our speakers for today. Thanks so much
for joining.
>> All right, thank you very much, Anna. Welcome, everyone, to the third episode of the AI Apps and Agents Dev Days series. We've got lots of really cool stuff to look at today, focused on the scaling and orchestration of AI agents. Before we get started: my name is Stephen McCulla, and I'm an AI solutions architect with NVIDIA, which means I get to work with Microsoft on implementing all of the latest and greatest AI technology in Azure. I'm joined today by Anthony Shaw.
>> Hi, everybody. My name is Anthony. It's great to be here again and present some stuff with you this morning, Stephen. I'm calling in from Sydney, Australia, and I lead the Python advocacy team at Microsoft.
>> Thank you, Anthony. So, a bit about this program: Microsoft and NVIDIA are close partners, and they work together to power the next generation of AI through deep integration of our technology into Azure. This webinar series is part of that integration. We want to show users like you how to best leverage all of this amazing technology that's coming out, so definitely tune in to the series and explore all of the great integrations we have to show. Without further ado, let's jump into it.
Today we'll be covering a few different things. The first is a recap of exactly what AI agents are and how they work; that's the core topic of this episode, so we want to go in with a robust, refreshed understanding of it. Next we'll go into Microsoft Agent Framework, which is essentially a tool you can use to build, tune, and orchestrate your agentic workflows on Microsoft tooling and on Azure. After that we'll look at NVIDIA AI Blueprints, which are essentially reference recipes and architectures that you can deploy yourself, or customize and fine-tune for your own purposes; I'll show you a custom example of the AI model distillation blueprint from NVIDIA. And then Anthony is going to take us through some real-world integration examples, showing how you can integrate your agentic workflows into your customer-facing apps as well as background batch processing.
All right. So, first things first: AI agents. Let's have a quick recap. What are they? How do they work? I imagine that if you're attending this webinar, you're at least a little bit familiar with large language models, or LLMs, like ChatGPT, Nemotron, or DeepSeek. But what exactly is an agent? What's the difference between LLMs and agents? You can think of an agent as an LLM plus memory plus reasoning and tools. Many LLMs nowadays have native reasoning and tool-calling support built in, so once we add a memory layer on top, managed by agentic software like Microsoft Agent Framework, we can turn that LLM into an agent. In other words, an agent is essentially an LLM with tool calling, reasoning, and memory. Those three functional capabilities open up a whole world of possibilities and allow us to do amazing things with our LLMs and our agents.
From a high level, what can an agent do? With tool calling, it's essentially calling external code: you can pass in, say, a Python function that updates your LinkedIn, or a JavaScript function that creates a web page for you. With tool calling inside agents, your imagination is really the limit; there are lots of amazing things you can do. And when we think about reasoning, that's essentially the recursive prompting happening inside the LLM that allows it to produce much more intelligent, reasoned outputs. Agents are capable of incredible things once we add in this reasoning and tool-calling capability.
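To make the tool-calling loop concrete, here is a minimal sketch against an OpenAI-compatible chat endpoint. The endpoint URL, the model name, and the `get_weather` function are all stand-ins for illustration, not anything from this session:

```python
import json
from openai import OpenAI

# Hypothetical local endpoint; any OpenAI-compatible server works the same way.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

def get_weather(city: str) -> str:
    """A stand-in tool; a real one would call an external API."""
    return f"Forecast for {city}: 18C, light rain."

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "Do I need an umbrella in Sydney?"}]
while True:
    reply = client.chat.completions.create(
        model="my-model", messages=messages, tools=TOOLS
    ).choices[0].message
    if not reply.tool_calls:       # no more tool requests: final answer
        print(reply.content)
        break
    messages.append(reply)         # keep the assistant turn in the history
    for call in reply.tool_calls:  # run each requested tool, return the result
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
```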
When we think about agentic workflows, that's essentially putting multiple agents together to accomplish some specific task that we assign. A common question we get is: why build an agentic workflow and introduce this complexity, instead of having some uber-intelligent single agent, a single model, do all of the work? The reason is specialization. There's really no single AI model out there that is the best at everything; we haven't hit artificial superintelligence just yet. So we need specialized, fine-tuned models to accomplish the individual tasks they were really trained for. For example, some AI models, like Nemotron Parse, are specialized for document analysis. Some models, like GPT-5 or Nemotron, are more like chat agents with coding capabilities. Some models are great for audio processing. Most of us probably have experience with the chat models, but there's a whole world of different LLMs out there meant to accomplish particular tasks. When we use them together, we can get the best of all worlds: the best document processing, the best audio-visual capabilities, and the best chat capabilities, all together in one system.
For example, think about an AI system that might be used by doctors during a medical diagnosis. If the doctor brings in, say, their phone connected to a backend AI system, that system needs the capability to transcribe and understand whatever the doctor and the patient are saying in the diagnosis room. The workflow also needs the ability to look up the patient's past medical history, and to do deep research into the symptoms the patient is listing and the diseases they might be associated with. All of those capabilities are really important for this hypothetical medical agentic workflow, and rather than having one super-intelligent model try to do all of these different things at the same time, we see much better results with an agentic workflow comprising multiple models all working together on the task. So now that we know about agents and agentic workflows, the big question is: how do we run these agents? How do we build these workflows in an effective, efficient way? That's where Anthony is going to jump in and tell us about Microsoft Agent Framework. Take it away.
>> Thanks, Stephen. So, Microsoft Agent Framework is something we released in preview a couple of months ago, and at the Ignite conference just last month there were several workshops showcasing it. Agent Framework supersedes two other projects, one called AutoGen and one called Semantic Kernel. It's designed as an open agent framework that works on the Microsoft stack and on others as well, and I'll cover that in a bit. The main thing is that Agent Framework is a code-based agent builder: you use either Python or C#/.NET to develop these agents. The obvious question is, there are drag-and-drop editors for agents, so why would you write them in code? The main reason is that you get absolute control and the ability to customize everything in the agent. Like Stephen said, an agent is basically the combination of an LLM, a model, with memory, tool calling, and reasoning. But a lot of the time you also want to integrate other things: you've got MCP calls, you've got specialized APIs you want to call into, and you want a lot of control over what comes in and what comes out of each individual agent in your workflow. Agent Framework gives you that ability, because you're writing Python, so you have full control. It's an open-source product as well, MIT licensed, so for any companies concerned about proprietary code or anything like that, it's openly licensed. Another great thing about it, which I'll demonstrate at the end of the slides, is that you can run it and experiment with it locally, even with local models. So if you're running some models locally on your machine, or you want to experiment with something like GitHub Models, which is free, you can do that, and then when you're happy with how the agent works, you can get it running on Microsoft Foundry. The other thing is that it's an open framework: it obviously supports Azure and Foundry, but it actually supports other clouds as well, and I'll demonstrate that later. Next slide.
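For reference, the "hello world" for Agent Framework in Python looks roughly like the sketch below. This follows the preview quick-start samples, so treat the exact import paths and method names as assumptions that may shift before general availability:

```python
import asyncio
from agent_framework.azure import AzureOpenAIChatClient  # pip install agent-framework
from azure.identity import AzureCliCredential

async def main() -> None:
    # create_agent() bundles the model, instructions, and (optionally) tools
    # into a single runnable agent; names here follow the preview samples.
    agent = AzureOpenAIChatClient(credential=AzureCliCredential()).create_agent(
        name="Joker",
        instructions="You are good at telling jokes.",
    )
    result = await agent.run("Tell me a pirate joke.")
    print(result.text)

asyncio.run(main())
```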
So, Agent Framework is one option, but we have several different ways of building agents on our platform. If you have your own framework, or another one you prefer to use, and you just want to run those applications on Azure, we have infrastructure as a service. You can run agents as containers, which is the modern way of doing it and the way I'd recommend, deploying them onto something like Azure Kubernetes Service or Azure Container Apps. So you can basically take any open-source agent orchestration framework and run it on Azure. We have full support for LangChain, for example: if you go to the LangChain documentation, there's a Microsoft section that lists how to integrate LangChain with every Azure service you'd use for agents. Any other tool or framework that supports the models on Azure can be run on Azure too, and the team I work on does a lot of work with those open-source projects to make sure we're well supported in the different frameworks. If you prefer something where the platform does more of the management for you, that's platform as a service: AI Foundry and Agent Service do that. You can do a click-ops-based agent, where you just give it a prompt and a model, create an API endpoint, and call it from there; you don't need to write any code. The PaaS option is Foundry and Agent Service, so you can build agents fully in Foundry without writing any code. You can plug in knowledge sources, implement basic memory patterns, and integrate tools with MCP servers as well. I'd recommend checking that out if you don't have a dev team. And then our fully SaaS option is Copilot Studio. With Copilot Studio, you're mostly building agents for users within your company, or, if you're a vendor that sells that sort of thing to other Office 365 users, you can use Copilot Studio to do that. So you're building an agent used by employees and internal users for specific tools and capabilities. Next slide.
So, Agent Framework is our open-source engine for building and orchestrating intelligent AI agents. Like I mentioned, it's built on open standards. It's designed to be interoperable with different models, different storage for memory, and different backends. We're doing a lot of work to make sure Agent Framework doesn't have any lock-in: you're really building on an open base, and you can integrate it with any model, any cloud, and any backend. We also have a team at Microsoft Research looking into cutting-edge ways of running agents and feeding those insights into Agent Framework, so anyone interested in AI and agentic research will see a lot of papers coming out of Microsoft Research, with some of those ideas and technologies being built into Agent Framework. It's also ready for production, and I'll explain what that means later. Because of the scale of Microsoft, the types of customers we have hold very high expectations for what a production application looks like, so we put a lot of thought into things like security, telemetry, deployment models, and how to make it resilient. Next slide.
Cool. So, for a single agent: an agent has capabilities. It has, I think... sorry, we skipped a slide.
>> Yeah, sorry about that. That was my bad, Anthony. There you go.
>> Okay, cool. So a single agent has a model, memory, and tool capabilities, and often, like Stephen said, you want it to have a single purpose. You're basically drawing boundaries: this agent does this task, and it specializes in that specifically. With Agent Framework, you can build different types of workflows. Where I'd recommend everybody start is a sequential workflow, where one agent hands off to the next. You have a basic agent with a purpose; you give it that purpose in its prompt, you give it only the tools that help it do that one thing, nothing else, and once it completes its task, it gives its output to the next agent (I'll show a minimal sketch of this pattern just below). When you're designing these things, if you think you need a multi-agent workflow, often you actually don't: modern models can handle multiple things at once. You can give them 20 different tools and a fairly complicated list of instructions, and they can call those tools and complete the job as a single agent. If you feel you definitely want more control over what each agent can and can't do, you can put them in a sequence as a sequential workflow. Because agents can take a few seconds, sometimes a few minutes, to complete depending on how big the task is, you often also want that to be concurrent: instead of running as a sequence, you can initialize a workflow, send it to multiple agents that all work on the same thing at the same time, and then consolidate those results back. We call that a fan-out/fan-in pattern, and I'll give you an example of that later. We also have handoff, where you fan out to different agents but they can also send information to each other. The ones after that, the bottom three in this graph, are really more advanced, and I'd recommend only exploring those once you've done the top three.
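Stripped of any framework, the sequential pattern is just this: each single-purpose agent is an instruction-scoped model call, and its output becomes the next agent's input. A minimal sketch, assuming an OpenAI-compatible endpoint and an illustrative model name:

```python
from openai import OpenAI

client = OpenAI()  # or any OpenAI-compatible endpoint via base_url=...

def run_agent(instructions: str, task: str) -> str:
    """One single-purpose 'agent': a model call scoped by its instructions."""
    return client.chat.completions.create(
        model="gpt-5-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": task},
        ],
    ).choices[0].message.content

report_text = "...raw weekly sales report..."  # placeholder input

# Sequential workflow: each agent's output is the next agent's input.
facts = run_agent("Extract the key facts as bullet points.", report_text)
summary = run_agent("Write a three-sentence executive summary.", facts)
print(summary)
```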
I've met a lot of people who got really excited about the idea of multiple agents all talking to each other and coordinating; in theory it sounds like a brilliant idea, and then when you actually test it in practice, they can spend several minutes running, having conversations with each other, and getting confused, or you can burn a huge amount of tokens in the conversation. So I strongly recommend starting with a sequential workflow. If you think it could be sped up a bit, look at a concurrent one, fan-out/fan-in (sketched below). And really, only if your problem needs something like a group chat or a magentic pattern should you start to explore those, and only once you've explored the other options first. I wouldn't suggest going straight into them, because you really need a lot of knowledge about developing prompts, and about refining the prompts and tools, so the agents don't get stuck in an infinite loop. Next slide.
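The fan-out/fan-in pattern is equally small when sketched with plain asyncio: the same input goes to several specialist agents concurrently, and a final call consolidates their answers. The model name and tasks are illustrative:

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # or any OpenAI-compatible endpoint

async def run_agent(instructions: str, task: str) -> str:
    resp = await client.chat.completions.create(
        model="gpt-5-mini",  # illustrative
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": task},
        ],
    )
    return resp.choices[0].message.content

async def briefing(document: str) -> str:
    # Fan out: three specialists read the same input concurrently.
    risks, numbers, actions = await asyncio.gather(
        run_agent("List the risks mentioned.", document),
        run_agent("Extract every figure and date.", document),
        run_agent("List the action items.", document),
    )
    # Fan in: one consolidator merges the three results.
    return await run_agent(
        "Merge these notes into one short briefing.",
        f"Risks: {risks}\nNumbers: {numbers}\nActions: {actions}",
    )

print(asyncio.run(briefing("...meeting transcript...")))
```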
So, an agent without tools is not really an agent; it's just another chat interface, and you're just talking with a model. The whole point of Agent Framework is that you give the agent capabilities to actually go and do things, or retrieve data to make decisions. What we've done with Agent Framework is enable it to talk to anything that uses one of the open standards: A2A, agent-to-agent, and MCP. And whilst it's not a formal standard, the OpenAI API spec has become the de facto standard interface for most models these days, so we support anything that has an OpenAI-compatible interface. We also support a huge number of different backends for storage and retrieval, for function execution, and even different clouds. Amazon Bedrock is probably one name that jumps out there: yes, it's a competitive platform, but it's something we can integrate with too. Like I said before, we're not designing for lock-in; we want an open framework that works really well on Azure but can also be used on other platforms. Next slide.
Yeah. So, with agents, there's a lot of information you can provide in the context or in the prompt. But ideally, especially with a chat agent, if a user has had a conversation with it and then starts another thread, they don't want to have to remind the agent about things they previously talked about. So what we've built in is the idea of a memory, or conversation state: a process that summarizes a conversation and stores it in a memory pool, and that memory pool can then be retrieved and injected back into the agent for another thread. So if you have a conversation thread with a chat agent where you talk about your plans and what you're working on, and then you start another thread with that agent, it basically has longer-term memory: it has remembered specific facts and summarized previous conversations, so it's aware of decisions made before and you don't have to explain everything all over again. That pattern is an open implementation. The code example we're going to use just stores the memory in memory, in RAM, but you could drop it into Redis, for example, or into a structured SQL database; it's a standard API, so it depends on where you want to put that information. Obviously security is a massive concern there, because if it holds information the user has given it, about their plans, what they're thinking about, whatever it is, you want to store that somewhere secure and only available to that specific user. We offer this as an open spec that you can implement, so if you've got your own proprietary database you can plug it into that, and we have example implementations with SQL databases. Next slide.
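The pattern is easy to sketch: a memory pool is anything with a "remember" and a "recall" operation, keyed by user. This toy in-RAM version is not the framework's actual API, just the shape of the pattern; Redis or SQL would implement the same two methods:

```python
from dataclasses import dataclass, field

@dataclass
class InMemoryMemoryPool:
    """Summarize a finished thread, store it per user, inject it
    into the next thread. Swap the dict for Redis or SQL behind
    the same two methods."""
    _store: dict[str, list[str]] = field(default_factory=dict)

    def remember(self, user_id: str, summary: str) -> None:
        self._store.setdefault(user_id, []).append(summary)

    def recall(self, user_id: str) -> str:
        return "\n".join(self._store.get(user_id, []))

memory = InMemoryMemoryPool()
memory.remember("anthony", "Planning a Q1 launch; prefers JSON output.")

# At the start of a new thread, prepend what we remember about the user.
system_prompt = (
    "You are a helpful assistant.\n"
    "What you remember about this user:\n" + memory.recall("anthony")
)
```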
Cool. And then lastly on Agent Framework: it's designed to be enterprise-ready. When you run a workflow in Agent Framework, you can enable tracing, and all of that tracing is then available either locally, where you can see the full traces in DevUI, or, if you're connected to Application Insights, you can see the full trace for every conversation and every agent. You can see exactly what decisions it made and why, which tools it called with which parameters, how long it took, and whether there were any errors. So you've basically got a full audit log of everything it was doing. That, again, is built on open standards, on OpenTelemetry, so if you have a different tracing provider that supports OpenTelemetry, you can export to it as well. Cool.
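Because it's standard OpenTelemetry, the wiring looks like any other OTel setup. A minimal sketch that prints spans to the console; pointing an OTLP exporter at Application Insights or another backend slots in the same way. The tracer and attribute names are illustrative:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Send spans to the console for local debugging; swap in an OTLP
# exporter to ship the same spans to Application Insights.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("store-agents")

with tracer.start_as_current_span("weekly-insights-workflow") as span:
    span.set_attribute("agent.name", "insight_synthesizer")  # illustrative
    # ... run the agent workflow inside this span ...
```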
So, before we jump in further, I'm going to give you a quick demo of how to build an agent. To do this, I'm going to use an extension for VS Code called AI Toolkit. If you go to the extension marketplace in VS Code and search for 'AI Toolkit', you'll see this one here, AI Toolkit for Visual Studio Code; that's the extension you want to install.
Once you've got that extension, you'll see there's an AI Toolkit menu, and once you open that up, you go to Agent Builder. Agent Builder is a way of getting you started building your first agent. What you do is connect in the different models. In my case, I've connected my GitHub account, so I can run any model that's on GitHub. I've also got it connected to Foundry, so I've got GPT-5 mini running there, and I've got Phi-4 reasoning running on my laptop. You can actually run the models directly on your machine: if you're running Foundry Local, you can run Foundry Local models and just develop the agent locally; you don't need a cloud-provided model. Once you've done that, you give your agent a name and pick a model. It also works with Ollama, so you can connect it to any local model. Then you give it your instructions.
Then you set up any tools. Tools are things the agent connects into; they could be custom tools, a piece of code you've connected it to, or an MCP server. In my case, I've got two MCP servers related to my application, and I've already got them running on my machine. This is my architecture: I've got an API and two MCP servers, one the finance MCP and the other the supplier MCP, and they're connected to proprietary databases. This is all custom code, but these are MCP servers running locally on my machine, and I can design my agent just by giving it some instructions. Then, in the playground, I can give it a prompt, and it will run the agent locally, and if it does decide to call MCP tools, I can see which tool it called, with what parameters, and what the output was. For example, in this one we're analyzing the performance of products in this fictional store: I wanted to understand the top-selling products for a specific store over the last few weeks, and I wanted it to use the MCP tools to do that. Basically, it's going to retrieve some information. In my conversation I said, look up store number one for the last three months, and I want you to use the tools to look at revenue, units sold, and SKU. It's going to use GPT-5 mini to do that, and I can see that the agent has actually called out to my MCP server, fetched some information about the products, and given me back the result as JSON. I could ask for a text summary instead, but in my particular case I wanted this back as structured data. So my agent is actually retrieving information about which are the highest-performing products that I want to use.
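For a sense of what those MCP calls look like from Python, here's a minimal client sketch using the official `mcp` SDK. The `finance_mcp.py` server, the tool name, and its arguments are hypothetical stand-ins for the demo's finance MCP:

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch/attach to a local MCP server over stdio; the script name
    # is hypothetical, point this at any MCP server you have.
    params = StdioServerParameters(command="python", args=["finance_mcp.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # what an agent could call
            result = await session.call_tool(
                "top_selling_products",            # hypothetical tool name
                arguments={"store_id": 1, "months": 3},
            )
            print(result.content)

asyncio.run(main())
```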
When I'm happy with the agent in Agent Builder, I can go down to the bottom and click View Code, then pick Microsoft Agent Framework and Python, and it will generate the Python code to run my agent. If I want to turn this into an Agent Framework script, I can use that as a starting point: it has basically generated all the code I'd need, and I can look at it, customize it, and start integrating it into my app. So that's how you'd get started with Agent Framework. You can either jump straight into the code, or, as I'd recommend, get AI Toolkit, connect it to some models, give it some instructions and some tools, and just experiment with the prompts until you're happy with it.
>> Yeah. Back to you, Stephen.
>> All right. Thank you, Anthony. That was awesome. Anthony did a great job showing how we can create these agentic workflows using Microsoft Agent Framework. But if you're anything like me, the question is: how do I get started? Where do I look, and what can I reference if I want to build an agentic workflow, either for personal experiments or for my job? The answer is NVIDIA AI Blueprints. These blueprints are essentially reference architectures and reference workflows that you can deploy yourself. They're all open source, so you can go in, tweak and edit them, and deploy them however you'd like. For example, if I wanted to take the RAG blueprint and plug in my own models, it really wouldn't be difficult to do. I'll show you the GitHub repo in a second, but since it's all open source, it's very easy to customize for your own deployments. The way these blueprints work is that there are three foundational blueprints that NVIDIA has created: the first is the agentic AI-Q blueprint, the second is the RAG blueprint, and the third is the data flywheel blueprint. On top of these, we've created other blueprints that are more specific to particular industries: for example, a healthcare blueprint focused on RAG, or a data flywheel blueprint focused on the financial services industry. If you go to build.nvidia.com/blueprints (I don't know if my screen is too small to see here; I'll zoom in a bit), you'll be able to see all of the different blueprints we have available: for example, AI model distillation for financial data, and the AI observability data flywheel. One that we'll be showing next January, maybe as a bit of a teaser, is the agentic AI-Q blueprint that I mentioned. It's a very powerful blueprint, and I recommend you check out that episode when it comes in January. We have lots of blueprints for all different industries and use cases, so if you're looking to develop and deploy your own agentic workflow for whatever industry you're in, there's a pretty good chance NVIDIA has a blueprint that's either just what you're looking for or pretty close, and since it's open source and really made to be configurable, you can configure it and deploy it yourself. It's very easy to get started. So if I go back to, say, this model distillation for financial data blueprint and I want to view the source code on GitHub, I can just go here, and it takes me to the Jupyter notebook that contains all of the source code associated with this workflow.
This workflow has the entire orchestration that you see here. I'll zoom in a bit, but the whole thing lives in this Jupyter notebook: all of the databases, the models, the NIMs, and the surrounding orchestration software are all built into it. So it's really easy to get set up and going quickly. This particular blueprint is focused on model distillation, which is essentially taking a larger, more intelligent model and using it to fine-tune a smaller model for a particular purpose. Say we have a 50-billion-parameter model and we want to train a 1-billion or 3-billion-parameter model for a very specific task, like summarizing stock data on the public stock market: that's a great example of where you might use this blueprint. The notebook takes you through the entire process of getting up and running; you really just click through to complete all of the code sections, and it does the entire thing for you. It also tells you the requirements for any individual blueprint. For example, for this one you'd need two NVIDIA GPUs, A100, H100, or, if you somehow have a B200, you're a very lucky person. With two of those GPUs, you can go ahead and deploy this workflow. So it's really easy to get up and running with these blueprints, and it's a great way to get your hands dirty and get your foot in the door with agentic orchestration.
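The core of the distillation recipe can be sketched in a few lines: the large teacher model labels a pile of task-specific prompts, and the prompt/answer pairs are written out as fine-tuning data for the small student. The model name, prompts, and JSONL shape here are illustrative; the blueprint's notebook does the real version end to end:

```python
import json
from openai import OpenAI

client = OpenAI()  # endpoint serving the large "teacher" model

prompts = [
    "Summarize AAPL's price movement this week.",  # hypothetical tasks
    "Summarize TSLA's price movement this week.",
]

# Step 1: the teacher labels the task-specific data ...
with open("distill_train.jsonl", "w") as f:
    for prompt in prompts:
        answer = client.chat.completions.create(
            model="llama-3.3-70b-instruct",  # teacher; name illustrative
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        # ... step 2: write prompt/answer pairs in a common fine-tuning
        # JSONL shape, then fine-tune the small student model on the file.
        f.write(json.dumps({"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": answer},
        ]}) + "\n")
```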
So, let's take a look at how these different blueprints work. For example, in the model distillation workflow I mentioned, we're using Llama 3.3 70B, a 70-billion-parameter model, to fine-tune smaller models, everything from a 1-billion-parameter model up to 49 billion parameters. They're being fine-tuned on financial data: the kind of financial news you might find in, say, the Financial Times or Yahoo Finance is what's used to fine-tune the model.
Let's look a bit more under the hood at what's going on inside these blueprints. What you'll see a lot of in them is NVIDIA's own open-source, homegrown LLMs. The latest one, which actually just came out yesterday, is Nemotron 3 Nano, and these all fall under the Nemotron family. Nemotron 3 Nano is the smaller member of the Nemotron 3 family, and it's a very powerful LLM; we've seen really great results in our internal benchmarks, and great results in the two days since its release. We also have Llama Nemotron Super and Llama Nemotron Ultra, going up that chain of size and intelligence, plus models for vision-language tasks, information retrieval, and content safety. Like I mentioned earlier, there's no one uber-intelligent model that can do all of this very well, so Nemotron aims to provide all of the models you'd need, in a fully open-source way, so you can run the agentic workflows you need.
There's still the question: now we have the model, but how do we run it? That's where NVIDIA NIM comes in. A NIM is essentially a Docker container that packages your model together with observability microservices and inference optimization microservices, all built into a single container, and it makes it really easy to deploy and run an AI model. If I go back to my browser and go to catalog.ngc.nvidia.com, that's essentially the home base for a lot of NVIDIA software. If you go to Containers and scroll down, you'll see NVIDIA NIM, and from there you'll see all of the different NIMs we have. For example, there's the Meta Llama 2 70B chat model: a Docker container with the Llama 2 chat model baked inside, optimized, and bundled with those observability microservices. I'll also show you the Nemotron 3 one that just came out. If you want to try these models out for yourself, this is a really great way to get started: you just run docker run with the container address here, and it spins it up for you. The only thing you'll need is an NGC account, so come back to this website, create an account, get an API key, include it in your docker run command, and you'll be able to spin it up very easily.
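Once a NIM container is up, it exposes an OpenAI-compatible API, so calling it from Python is just a base-URL swap. The port and model id below are illustrative; the exact values are listed on each NIM's catalog page:

```python
from openai import OpenAI

# A running NIM serves an OpenAI-compatible API on the port you mapped;
# 8000 and the model id are illustrative, check the NIM's documentation.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used-locally")

resp = client.chat.completions.create(
    model="nvidia/nemotron-3-nano",  # hypothetical model id
    messages=[{"role": "user", "content": "Summarize what a NIM is in one sentence."}],
)
print(resp.choices[0].message.content)
```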
So, lots of great stuff going on. NVIDIA really makes it easy to get started and get up and running with your AI models and agentic workflows. Now that we know how to get set up with our agents, models, and workflows, Anthony is going to show us how to integrate these workflows into real-world, user-facing applications.
>> Yeah, great. Thanks, Stephen. I'm going to lead on from where we finished before: I showed you how to use Agent Builder to experiment with building an agent, and then you can take the code it generated. In my case that's Python, but you can also use C#/.NET. I basically developed a few different agents, designed to provide weekly insights to store managers in this fictional store app that we have. I'll share the code for this application at the end of the presentation as well, if you want to download it and look at how it works. Before, I was building one agent, but in this particular case I've actually built several agents and given them names: one that looks at the weather forecast for the region, one that looks at events, and one, the one we were building before, that looks at top-selling products. So we've got those three different agents. Then, in Agent Framework, we have a workflow builder: we give it a name and a description, and we say what the starting point is, the data collector, which is the one you initially give the instructions to. Then we add the fan-out edges: the data collector goes to weather, events, and top-selling products. And then we have the fan-in, where those results come back: the three agents consolidate into an insight synthesizer (which is fun to say). The synthesizer agent is the one that takes the results from all the other agents and turns them into a single response.
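In code, that workflow looks roughly like the sketch below. The builder and edge methods follow the Agent Framework preview samples, so treat the exact names as assumptions; the five agents are the ones just described, built beforehand:

```python
from agent_framework import WorkflowBuilder  # preview API; names may shift

# data_collector, weather, events, products, and synthesizer are agents
# created earlier (e.g. via create_agent); shown here as placeholders.
workflow = (
    WorkflowBuilder()
    .set_start_executor(data_collector)
    .add_fan_out_edges(data_collector, [weather, events, products])
    .add_fan_in_edges([weather, events, products], synthesizer)
    .build()
)

result = await workflow.run("Weekly insights for store #1")  # in an async context
```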
Once I've built this workflow, I just wrap it in a Python function. Something you can do in Agent Framework that's pretty cool is run this thing called DevUI, which lets you visualize the workflows you've made. The one I was just showing you the code for, you can actually visualize directly in a browser, and you can also run it directly there. So instead of having to write scripts to test it, you can just iterate on it and run it in the browser, and any events that happen while it's running pop up on the right-hand side: you get all the traces, the tool calls, and anything like that shown on the right. So what we basically have with this weekly insights workflow is the ability to run it, see what's happening, and get the results back directly in here as well.
Something else we can do: in the Foundry extension there's a local visualizer, so if you wanted to see this information inside VS Code instead, you can use that feature in the extension as well. What I've done then is taken this code, this workflow object in Python, and we can actually run that workflow from code directly. Sure, we can run it in DevUI and experiment, but I actually want to integrate this into the application, because I think that's a lot more realistic as an example. When we're building this application, and I gave a glimpse of this before and someone in the chat asked for more information: this is Aspire, Aspire 13, which shipped on October 31st and now actually supports Python as well. What I've done is define my application architecture in Aspire, and I'm running it all locally: I've got an API, the API talks to the different MCP servers, and I've got a web front end as well. So my shop is running as an app: I can click on the link to the front end and see the shop, and it's actually running a Python API. I've got the agents and the workflows I defined before in DevUI imported into my application, and if I log in as a manager, I can see it and it can run the insights. So this workflow is now the one that was running. That's not working today; we were migrating some stuff yesterday, which is why it's broken. But I can see all of that information.
And then in Aspire, I can go and get the logs and so on, see what's happening, look at that trace, and understand exactly why it broke. I use Aspire because it's a really nice way of defining an application that has multiple components: I've got the MCP servers, and I've got the API that hosts all my Agent Framework workflows. We've also deployed a production version of this, which is up here. In our app, I've logged in as a store manager, and it has actually run the workflow I was talking about before and given me a nicely integrated output. This is an example of an agent that's not a typical chat agent, where you have to start a conversation and say, please, can you tell me about important things happening this week. The way we designed this one, the weather, the events, and the top-selling products are run for a particular store. In this case we've got the pop-up New York Times Square store, and we've run the workflow and asked (it's fiction, this store doesn't exist, but in the fictional scenario): are there any major events happening in Times Square this week that a store manager might need to be aware of? We'll search the internet and see if there are any parades or anything happening that the manager should know about. Is it going to be really cold? Is it going to be raining? Should we stock up on umbrellas? What else might be happening at that particular store? And then also, what are the top-selling products in that store that the manager should be aware of? So they can see that over the last few weeks, here are the top-selling products: we've sold 63 pullover fleeces.
And the interesting thing is that when I run the workflow, the output is not just text. We can get it to output markdown, that's fine, but then the user actually has to read and understand it, and we'd just be outputting long paragraphs into our interface. What we do instead, and what you can do with Agent Framework when you're building these workflows, is define structured output. This synthesizer agent is something we've defined, and in code we've said: it takes the weather, events, and product analysis, all this information, and it actually outputs a class we've defined called the weekly insights class. That's a model, and for a particular store it holds a summary, the weather, the events, and the stock items, and those are actually lists of objects. So where previously you might build an agent that gives you text output that the user has to read, with Agent Framework we often define these classes, these types, so you get structured information back. When the agent runs, it gives the API back a list of stock items, specific actions you can take, and insights as a list that you can render in the portal. So in the UI here, if I wanted to make this clickable, so you could click on it and it would take you to the product, that's really easy for me to do, and I don't need to write weird code that parses the text and guesses where the links are and things like that.
With Agent Framework, my strong recommendation when you're building these agents is to think about whether you actually need structured output. Where AI agents are really capable is in taking something unstructured, paragraphs of text or inputs from users, and making it structured: they can make decisions on it, make suggestions, summarize information, and then give you that information back in a structured form as well.
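The structured-output idea, sketched with Pydantic; the class and field names are modeled on the demo's description, not copied from its code:

```python
from pydantic import BaseModel

class StockItem(BaseModel):
    sku: str
    name: str
    units_sold: int

class WeeklyInsights(BaseModel):
    """Illustrative shape, modeled on the weekly-insights demo."""
    summary: str
    weather: str
    events: list[str]
    stock_items: list[StockItem]
    suggested_actions: list[str]

# Handing a schema like this to the agent (many chat APIs accept a
# response-format/schema parameter) means the workflow returns typed
# objects the UI can render as links and lists, instead of prose.
```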
The other option we have is this inventory agent, something we've built. I'll show you what that looks like in here. It's this one, a nice simple one: it's basically a single agent that's orchestrated by a second one. Our scenario here is that we've got a number of items in our store catalog that are low on stock, and we wanted to help the store managers figure out how to restock them. So this is one where we can launch this particular agent... let me find it. This particular agent has been given a set of instructions: we've got a summarizer, but we also have one that will go and fetch as much information as it can about the stock using MCP tools. Once it's got that, it will fetch the information (I'm not having much luck today, am I?) and give you back some recommendations on stock priorities and things like that.
So yeah, that's another one we built, and the code for it, again, is in the repo. Like I said before, my suggestion is to start by building a simple agent in Agent Builder. Once you're happy with that, turn it into code by going to the bottom and clicking View Code. Then, once you've got the code snippet in Agent Framework, you can convert it into an actual workflow by creating a workflow builder and deciding which pattern you want from the slides we had before: either fanning out and fanning in, or a sequential workflow. For each of the agents, you then define a set of instructions and the tools it has. Once you've done that, you can visualize it in DevUI (that's the insights one): you can visualize it in here, run it in here, and experiment with it until you're happy, seeing the events, traces, and tool calls on the right-hand side. And once you're happy with all of these things, you can compose it into a container that can be run either in Foundry or on any container-compatible platform. That's a feature of AI Toolkit and the Foundry extension: once you've built a workflow for Agent Framework, you can turn it into a deployable application container and have it running on production infrastructure.
Cool. Back to you, Stephen.
Back to the slides.
>> Yeah, I mean, I think that was it. That was the goal.
>> I think... yeah, there's amazing stuff we've covered, and it was all really interesting, highly technical material. I think the next step for everyone in the audience is to go and try it yourself. Right here I have a QR code that takes you to a Microsoft blog showing how to run an agent yourself using Microsoft Agent Framework. This is probably the best way to get started; it's the "hello world" of Microsoft Agent Framework, and it's really great. It's what I used to get familiar with Agent Framework, so I'll leave it up for a bit so you can go check it out for yourself. There's lots to explore in this AI world. There's been a banner running under the screen to join the NVIDIA Developer Program; yeah, there it is. Definitely join that as well, as there are lots of cool resources through that program. And our next episode, coming up in January, is focused on another of our NVIDIA AI Blueprints: the deep research agentic blueprint. It's really a fully automated RAG system that can handle all different kinds of data, and you can deploy it super easily, so it's definitely something you should check out, whether it's applicable to your job or something you're interested in personally. There's lots of value in the upcoming episode, so stay tuned; lots more is coming through this series in the future. So yeah, thank you. Anything else, Anthony?
>> No, that's all, thanks. Yeah, thanks very much, everyone. I'll drop a link to the code I was demonstrating as well, if you're interested in how it was put together.
>> Awesome. Tony Bologna. Love it.
>> That's my username.
>> Such a good username.
>> Too late to change it.
>> All right, everyone. Thank you so much
and take care.
>> Thanks, everybody.
Thank you all for joining and thanks
again to our speakers.
This session is part of a series. To
register for future shows and watch past
episodes on demand, you can follow the
link on the screen or in the chat.
We're always looking to improve our
sessions and your experience. If you
have any feedback for us, we would love
to hear what you have to say. You can
find that link on the screen or in the
chat and we'll see you at the next one.