AI agents are popping up everywhere, from smart assistants to autonomous research tools to
self-driving cars. But what actually makes these things tick? In this video, we're gonna break
down the anatomy of an AI agent. We'll peel back the layers, sensing, thinking and acting, to show
how data from the real world turns into decisions and then is translated back out as actions. So now
let's take a look at how all of these things work together.

So we'll start with the sensing part of
the agent. The agent has to get some information in. This is basically its perception, just like a
person has eyes and ears and perceives through those senses. If we're talking about an AI
agent, how does it get information in? Well, one of the ways that it gets information in is through
text, using natural language processing. If we're talking about a chatbot, that information
just gets typed in and the agent takes it into its processing. It could also be some sort of sensor.
So, in this case, there could be a vision sensor, a camera. It could be a microphone,
something along those lines that brings in information from the outside world. It could be
APIs or other types of events that are triggered and fed into this system
as well. So these are just a few examples of the inputs that go into the system. So then it moves
over to a thinking stage. How do I process all of this? Well, it turns out that in doing that I need some
more context. So, one of the things I'm gonna add to this system is a knowledge base of some
sort. In that knowledge base, I'm gonna store things like facts, things that are important to
this system that it needs to know, and rules that it needs to operate with. It could also have
other information that gives it context, so these kinds of things are gonna go into the thinking
process. And by the way, these could come from a
database, or from a RAG (retrieval-augmented generation) source. So there
could be a lot of different sources for this knowledge coming into the system.

Another source
that we need to consider here is some sort of policy information. I may have goals that I need
the system to consider: what is it I want the system to be able to
do? Particular objectives and things of that sort. I may have priorities that I need it to
consider. All of those things go into the thought process as well. We don't want it to make
decisions without considering the facts, the rules, the goals, objectives, the priorities, all of that
kind of stuff.

Then we get into the reasoning part of all of this. Here is where we're gonna start
dealing with things like "if-then-else" kind of logic. So, here we're
processing all of that information, thinking about it, doing planning. So we're looking at what do I
need to do and how am I gonna go about doing it. And in these cases, I'm also going to need
to decompose tasks. So, if I have a big, high-level goal that I want to accomplish,
one of the things I have to do is break that down into smaller components. So I need to do task
decomposition. I'm also going to leverage things like machine learning, where the system is
learning through reinforcement. Or it could be learning by being shown lots of different
examples and developing a pattern. We keep showing it more and more of the
same type of data, more and more of certain types of objects if we're using a sensor, and then it
starts to develop an idea as to what those things are and what characteristics go with
them. Then we use something like large language model technology and leverage that for some
of these things, like the text inputs, chain-of-thought reasoning, all of
this. So here's the thinking part, the reasoning part of this system.

Next, we're gonna move to
the generative part, the action. Here we're gonna generate certain types of output. We could
generate text as an output. We could generate speech. We could generate alerts. We could generate
video, all kinds of things like that. The action might also be to read or write to a
database, so this is a possibility as well. We may also execute some
level of control. So maybe we have actuators because we want to operate on the real
world. Maybe I have some sort of robotic system or a self-driving car where, once these
decisions have been made, I'm gonna operate on the system. So, in the case of a
self-driving car, I've taken in information from sensors. I've considered the
facts. I've considered the goals. I've run it through my reasoning logic, and then I interface
through an actuator in order to affect the way that the system actually works.

And another really
important part of all of this is a feedback loop. So I wanna make sure that it's constantly
evaluating its own performance. We want to evaluate the outputs of the system
and make sure that they're achieving, in fact, what we intended. Do they match the goals that we had
in mind? One term that we use here is reinforcement learning from human feedback. So this
is basically where we give it a thumbs up or thumbs down. Sometimes the system is getting
its own feedback by trying an action and then seeing: did that get us closer to the goal, or did
it take us further away from the goal? So it can do some of these kinds of course corrections on
its own, or we can override and create the course corrections ourselves. This is the basic anatomy of
an AI agent.

Okay. Now let's take this anatomy that we just described in the abstract and take a real
example. So let's say I want to book travel reservations for an upcoming trip. So what
do I need to put into the system? What does it need to start with? Well, it needs to know the
dates of travel: when am I going and when am I coming back? It needs to know the destination:
where am I going? There might be other things, but we'll take those as inputs. So that's the
sensory perception part of this. And I would probably enter that into a chatbot. Or maybe it's
smart enough to read it off of my calendar. That goes into the reasoning portion. But the
reasoning portion, again, needs more context. So we're gonna go up here to the knowledge base.
In this knowledge base I would have already stored some preferences that I have. Maybe there are
certain airlines that I prefer, maybe certain hotels that I like to stay in. It
could also be based upon the location: maybe I like a particular hotel chain, but
which one of those hotels I'm gonna actually have it choose depends on the particular city.
So where is the event that I'm going to speak at? For instance, if I'm
speaking at a conference, then I'd like the hotel to be close to that. But I also
run every day, so I want a hotel that's in a place where I can go for a decent run. So that's
another kind of preference that would be built into the system and would be very personal to me.

Now, some other information that wouldn't be personal to me would be things like maps. The system
should know where all these different things are located. It needs to know prices for the different
things that it might book. It needs to know availability for flights and hotels and things of
that sort. So all of this it could look up in the knowledge base, and that's going to be important to feed
into the decision-making logic. Another thing we have to consider: hey look, I'm traveling on
business, so I have to follow IBM's business guidelines and do what goes along with the policy.
So, they're gonna have some limits or some caps: in a particular city, you can spend this much
on a hotel, but not more. Or here is a particular preferred travel partner, and we want you
to book with them, if that's what we're considering. So these would be policy issues that
are also added in to the decision-making process.

Once I've gone through all of that, then the
reasoning goes through and looks at all of the things that are here, and it figures out what
is the best way to satisfy this request. And ultimately it's gonna go out to the action
portion, which is going to book the reservation. And it's going to book this by going off and
talking to the airline reservation system, the hotel reservation system and a number of other
different things. And after it's done all those things, it's gonna give me the output. I'm
gonna have the electronic ticket and the reservation and all of this kind of stuff. So this is it
acting in the external world and ultimately accomplishing the task.

Okay. So all of
this is great, I think. But we need to go back and talk about the feedback loop. So, it's gonna give
me some sort of survey, for instance, after it's all done. How did I do? Did I meet
your needs or not? And I'm gonna give it a thumbs up or a thumbs down. So here's the
reinforcement learning from human feedback type of thing. It could also
go back and evaluate some of these things on its own and say, well, I came up with this answer, but
I'm gonna double-check myself and see how well I matched these things, maybe even try a couple
of other scenarios on my own as hypotheticals, and it could operate on that. And again, keep tuning
itself, keep getting better, keep getting smarter, keep getting more personalized and more effective
at doing what is otherwise a relatively complicated task.

Now, I hope you not
only understand how AI agents work, but also how powerful they can be and the amazing potential
they possess to improve speed and efficiency, freeing you up to do the things you do best and
leaving the gorpy details to your AI agent.
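
The sense, think, act and feedback loop from the travel example can be sketched in a few lines of Python. This is only an illustrative sketch and was not shown in the video: the names `TripRequest`, `Hotel` and `TravelAgent`, the hotel data, the rate cap and the scoring weights are all assumptions made up for the example. In particular, the thumbs up/down update is a toy weight adjustment standing in for the feedback loop, not actual reinforcement learning from human feedback.

```python
from dataclasses import dataclass


@dataclass
class TripRequest:
    """Sensing: the input typed into a chatbot or read from a calendar."""
    destination: str
    nights: int


@dataclass
class Hotel:
    """One entry of the (made-up) price and availability data."""
    name: str
    city: str
    chain: str
    nightly_rate: float
    km_to_venue: float
    good_for_running: bool


class TravelAgent:
    def __init__(self, preferred_chains, rate_caps):
        self.preferred_chains = preferred_chains  # knowledge base: personal preferences
        self.rate_caps = rate_caps                # policy: per-city cap on the nightly rate
        self.chain_weight = 1.0                   # hypothetical weight, tuned by feedback

    def score(self, hotel):
        """Rank a hotel: closer to the venue is better, preferences add points."""
        points = -hotel.km_to_venue
        if hotel.chain in self.preferred_chains:
            points += self.chain_weight
        if hotel.good_for_running:
            points += 1.0
        return points

    def plan(self, request, inventory):
        """Thinking: apply the policy cap first, then pick the best-scoring hotel."""
        cap = self.rate_caps.get(request.destination, 0.0)
        allowed = [
            h for h in inventory
            if h.city == request.destination and h.nightly_rate <= cap
        ]
        return max(allowed, key=self.score) if allowed else None

    def act(self, request, hotel):
        """Acting: a real agent would call a reservation system here."""
        return f"Booked {hotel.name} in {hotel.city} for {request.nights} nights"

    def feedback(self, thumbs_up):
        """Feedback: nudge the preference weight up or down (toy update)."""
        self.chain_weight *= 1.1 if thumbs_up else 0.9


inventory = [
    Hotel("Harbor Inn", "Boston", "Acme", 180.0, 0.5, True),
    Hotel("Grand Plaza", "Boston", "Luxe", 420.0, 0.1, True),    # over the cap
    Hotel("Budget Stay", "Boston", "Other", 120.0, 3.0, False),
]
agent = TravelAgent(preferred_chains={"Acme"}, rate_caps={"Boston": 250.0})
request = TripRequest(destination="Boston", nights=2)

choice = agent.plan(request, inventory)
confirmation = agent.act(request, choice)
agent.feedback(thumbs_up=True)
print(confirmation)
```

The policy check runs before the preference scoring, so a hotel over the cap is never chosen no matter how well it matches the traveler's preferences, which mirrors the point that decisions should not be made without considering the rules.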
Learn more about AI Agents here → https://ibm.biz/BdbbXt What makes AI agents think, plan, and act? 🤖 Jeff Crume breaks down how LLMs, RAG, and generative AI form the brain of intelligent systems. 🔍 Explore how these agents learn, reason, and evolve. Discover the real anatomy of AI innovation!