So decision agents are an essential component in agentic AI if you're going to solve large,
complex problems. The challenge is that if you're building agentic AI, you're going to have
complex decisions that need to be made autonomously, and those decisions are not a great fit
for large language models, even though large language models are the key technology in
agentic AI. So you need to build decision agents in your agentic framework, but using a
technology other than large language models. So why aren't large language models a good
fit? Well, let's think about some of the things they are famous for. They're famous for
inconsistency. They might do the same thing every day and then suddenly one day do something
different. Well, that's not great. If you're trying to make a decision, you really need people to be
treated the same and not vary day to day, minute by minute, just because the LLM feels like doing
something different. They are notoriously black boxes, very bad at describing why they did
something. And it turns out often you need to explain to people why you made a certain decision,
why they didn't get the job, why they didn't get the loan. And so you need some transparency in all
of this. And large language models are not good at that, even when you ask them to explain themselves.
They have a bit of a reputation for lying about how they decided what they decided. And then
the final one is that in many business decisions, there's history. You have a
database of information that tells you what you should do, what might be fraudulent, what might be
problematic. You need to be able to process that data and turn it into analytic insight. And large
language models are not very good at that either. So for all of these reasons, you're just not going
to use a large language model to build a decision agent. So what you can do is you're going to use a
decision platform or a business rules management system for this. And let's reiterate some of
the key value propositions for this kind of technology. This is well-established automation
technology that's been in use for a long time, so we know what the benefits are. What are the
requirements for a decision that we get out of using one of these platforms? First and foremost, we're
going to get consistency. So if I use one of these platforms, it's going to make the same decision
the same way every time. It's going to give me complete control over exactly how the decision is
made, and once I've defined it, every customer who gets that decision made against them is going to get
it made the same way. So I get ruthless consistency. The second thing they're really good
at is transparency. Not only do I have a formal definition of how this works, what the
steps and rules are that I'm following; I can explain that to someone. I can show that to
someone, and I can log it. So I can have a complete transparent log of exactly how this decision was
made. I have this transparency that I need for a decision. The third thing they give me is
agility. Now agility is important because the way I make a decision is subject to change without
notice. Competitors change their behavior. The market changes. The regulations change. There's a
court case. There's all sorts of drivers changing the behavior of the way you make a decision. And
if you can't do that quickly, if you have to wait for there to be new data or new documents, or
have to retrain something, that's going to take too long. So you have to be able to respond
more quickly. The other thing about decisions is that there's a tremendous amount of
domain knowledge in them. So programmers often find it really, really hard to correctly build
decision agents. And so, you really want to be able to engage people who have the domain knowledge.
That means you're going to need some kind of low-code environment, some way to engage experts
in managing the behavior of a decision agent while still being able to manage it as a
programmatic component in your agentic AI framework. And then lastly, you need this way of
embedding analytics that we were talking about, where I can take analytic insight, I can turn
historical data into analytic insight. And I can embed that analytic insight in my decision so
that I can make the decision more precise and more accurate. Now, all of
these are classic benefits of using a decision platform. But let's just reiterate why an LLM
is a tough call for these things. If I have an LLM, well, it's not really consistent. It's hard
to make an LLM do the same thing every time. This is a feature, not a bug: that variation, that
randomness, is part of what makes them so powerful, but it makes it very hard for them to be
consistent. They're definitely not
transparent, right? They're very opaque about how they did things. Even attempts to get them to
explain themselves are problematic. And if I go to a customer or a regulator and say, hey, I have this
black box that's been explained by this other black box, that doesn't really induce confidence.
They can also be hard to change. Their behavior is easy to get set up; you don't have to code
it, you just provide information to it. But it's then hard to change without presenting new
data to it and retraining it. You can't just tell it to stop doing X or stop doing Y. If you've
watched some of the news around attempts to block particular agents or make agents behave in a
certain way, you'll have seen that if you try to just code something in quickly, you get very
strange behavior. They're also quite complex, requiring specialized AI skills to build and
manage. And as I said, they're just no good at
structured data. They're not good at building predictive models out of historical data, and
using that historical data to improve the precision of your decision-making. They're good at
reading documents and text. They're not good at structured data. So we're not going to use a large
language model. We're going to have to use something else. So what are we going to use
instead? What technology can we use to do it? So let's go back to the scenario I talked about in
the previous video, where I talked about a bank that needed to lend money to a person. So I
want to lend you money. And to do that, I have an agentic AI framework
that manages that whole complexity. And as part of that, I have two decision agents. I
identified one as an eligibility agent, to say: are you eligible for a loan? And another one to
say: can I actually lend you the money? Which is what banks call origination. You want to
borrow this amount of money for this actual thing. Can I lend it to you? If so, what's the
rate? What's the price of this? So I have these two decision agents. Now, we've been
building this kind of autonomous agent using decision technologies for a really long time. So
there's a couple of things that need to be true of a decision agent. First of all, they need
to be stateless and side effect-free. So what does stateless mean? It means that you want them just
to respond to whatever data they're given at the moment they're given the data. Here's the data,
here's the decision. Here's the data, here's the decision. They don't remember state. That's why we
had, if you remember, a workflow agent whose job it was to remember the state. So the workflow tracks
the state, and it gathers the data for us that we need, and it passes that back and forth to
these agents. So it says, okay, at this point in the process, I've got this set of data about
this person, about this application, about this loan. Are they eligible? Yes or no. And you get an
answer back. And similarly with the origination decision. So they're managing the state. They're
managing all of that. And that scales better. It keeps the decision agents simpler. It makes it
much easier to check that you're not using things like personal information or health information
inside the decision when you don't need to. So it's just a much cleaner interface. But why side
effect-free? Why is it important that your decision agents don't do anything, they just make
decisions? Well, you want to be able to reuse them. Let's think about eligibility. Well, I might be
using it in the context of a workflow for originating a loan. That's one use case for it. But
I might have other processes, other workflows that do other things that send you letters or that, you
know, tell a call center rep, or put you into a marketing campaign, and so on. So I still need to
know if you're eligible, but I'm going to do something completely different if you are
eligible. And so by separating that, by not having the action be part of the decision agent, I get to
reuse it in lots of different circumstances. So I have these stateless, side effect-free agents. Okay.
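To make that concrete, a stateless, side effect-free decision agent is essentially a pure function: data in, decision out, no stored state and no actions. Here's a minimal sketch; the field names and thresholds are illustrative assumptions, not a real eligibility policy:

```python
# Minimal sketch of a stateless, side effect-free eligibility decision.
# All field names and thresholds are illustrative assumptions.

def decide_eligibility(application: dict) -> dict:
    """Pure function: the same input always yields the same decision.
    It takes no actions and stores no state; the workflow agent
    owns state and acts on the result."""
    reasons = []
    if application["age"] < 18:
        reasons.append("applicant under 18")
    if application["annual_income"] < 20_000:
        reasons.append("income below minimum")
    if application["credit_score"] < 580:
        reasons.append("credit score below minimum")
    return {
        "eligible": not reasons,
        "reasons": reasons,  # transparency: why the decision came out this way
    }

applicant = {"age": 34, "annual_income": 55_000, "credit_score": 612}
print(decide_eligibility(applicant))
# {'eligible': True, 'reasons': []}
```

Because it's a pure function, any workflow can call it for any purpose, and the reasons list doubles as the transparent decision log.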
So how do I build one of these? What does that look like? What technology do I need to build a
stateless, side effect-free decision agent that has these characteristics? Well, we use what's
called a business rules management system or decision platform. So decision platforms are
software stacks designed to build, you know, historically speaking, decision services that can
then be wrapped into decision agents. So what does a decision platform have? Well, it has a number of
software components. First and foremost, it's got a couple of editors. It's got typically like an IDE
or a technical editor and a low-code editor in which you can write logic, business rules, decision
logic. So you can lay out the actual rules, the logic that has to be followed to make a
particular decision. And those two editors generally are then linked to a single repository.
Now, this might be something like Git, but it might also be a more managed repository,
specialized for business rules and decision technology and available to your low-code editor,
so that you can have version control and branching and all those things. This varies by
platform, but they all have the concept of a repository in which you can do branching,
versioning, and all the kinds of repository things you need to do to make sure you have a
current version of the rules. And
you can do development work and have multiple people working and all that good stuff. Now, once
it's in this repository, and because it's a decision platform focused only on decision-making
logic that is stateless and side effect-free, you can do a lot more testing and validation of
the logic, so you can validate that the logic is correct. So you often have a set of tools that
look at the rules in the repository and validate them. Is the logic complete? Are you missing a
criterion that you're not checking? Do you have overlapping ranges? All that kind of stuff. And
it's much easier to check that in the context of a decision platform, because the
logic is written in a more declarative, less programmatic way, and it's managed as a set of
assets that can be checked. So you typically have a set of validation tools so that the logic you
write is more robust. And then obviously you're going to need to test it. Now, testing tools
can be as simple as passing in a JSON object and seeing if you get the result you're expecting,
using a UI like Swagger. But they can also be a lot more sophisticated. Some of the decision
platforms have very robust test suites, where you can load up very large numbers of test
transactions, run them through, check the results against expected results, confirm you've
passed all the tests, and do all of this in a low-code way so that the non-programmers who are
providing their domain expertise can also test it to make sure they haven't broken anything.
Now, when it comes to decisions, testing is
necessary but not sufficient. Because within those decisions, within those business rules,
there are going to be thresholds, places where you make choices as a business or an
organization as to what that threshold should be. There's no hard rule that this is a good
threshold and that's a bad one.
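A threshold choice like this can be explored by running historical data through the rules under both candidate settings and comparing the outcomes. A minimal sketch, with made-up loan amounts and a hypothetical maximum-loan threshold:

```python
# Sketch of threshold impact simulation: compare decision outcomes
# under a current and a proposed threshold. All data is illustrative.

historical_loans = [
    {"amount": 15_000}, {"amount": 32_000},
    {"amount": 48_000}, {"amount": 75_000},
]

def approve(loan: dict, max_amount: int) -> bool:
    # Stand-in for the full rule set; only the threshold varies here.
    return loan["amount"] <= max_amount

def simulate(max_amount: int) -> int:
    """Count approvals across the historical book for one threshold setting."""
    return sum(approve(loan, max_amount) for loan in historical_loans)

current, proposed = 40_000, 60_000
print(f"approvals at {current}: {simulate(current)}")    # approvals at 40000: 2
print(f"approvals at {proposed}: {simulate(proposed)}")  # approvals at 60000: 3
```

There's no pass/fail here; the point is seeing the difference between the two settings before committing to one.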
But the choice is going to make a difference. So take loans. How much am I willing to lend you
to buy a boat? Well, there isn't a right or wrong answer that I can write a test case for. But
the business can change that threshold, and it has an impact. I need to be
able to track what that impact is. And so generally, we have some kind of impact tool that
takes a bunch of historical data and loads it in and then runs a set of simulations. So very
similar to a test engine, but with a different perspective. Instead of saying this broke, this
didn't break, it says, here's the difference. If you make that change to that rule, the results look
like this. And if you make this change to this rule, the results look like that. So you can see
what the impact of a change is going to be before you make it. With a lot of these tools, you
have to deploy a test version before you can do these things, but several actually allow you to
do testing and simulation on rules you haven't deployed yet, that are just in your repository,
and manage all of that essentially under the covers so that you can do it inside your
development environment. So they provide a lot of tools to make sure you have the rules correct
before you deploy them. Now, once you have them correct, you obviously need
to deploy them. So you've got a deployment engine that deploys them as a service. So now I've got my
rules service deployed, my decision service deployed. And it's going to execute those rules.
It's got the code and the engine that it needs to execute those rules. So when you pass in data, it's
going to give you an answer. Now in this case, obviously, I'm going to expose it as an agent.
So I've probably got some kind of MCP (Model Context Protocol) server that exposes these
decision services as tools, and then those can be wrapped into agents and exposed in my agent
framework. So what are these agents going to do?
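Conceptually, that tool wrapping has a simple shape: the server advertises each decision service as a named tool with a description and input schema, and dispatches tool calls to the deployed service. A stdlib-only sketch of that shape (this is not a real MCP SDK; the decision service is stubbed in-process, and all names are illustrative):

```python
import json

# Stub for a deployed decision service; in reality this would be
# an HTTP call to the decision platform's REST endpoint.
def eligibility_service(payload: dict) -> dict:
    return {"eligible": payload.get("credit_score", 0) >= 580}

# The tool description an MCP-style server would advertise to agents.
TOOLS = {
    "check_eligibility": {
        "description": "Decide whether an applicant is eligible for a loan.",
        "input_schema": {
            "type": "object",
            "properties": {"credit_score": {"type": "integer"}},
        },
        "handler": eligibility_service,
    }
}

def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a tool call to the wrapped decision service."""
    result = TOOLS[name]["handler"](arguments)
    return json.dumps(result)

print(call_tool("check_eligibility", {"credit_score": 640}))
# {"eligible": true}
```

A real MCP server adds the protocol transport on top, but the tool-listing and dispatch idea is the same.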
The origination agent is going to say, here's my data packet. It's going to come in to my decision
service, and I'm going to get a response back, which then goes back to my agent. So I can
quickly package up my rules as decision services. I can reuse rules and logic, package them up
in multiple services, deploy those services, and wrap them as agents using MCP. And now I've
got a whole series of decision agents that I'm managing from this repository. The technology
is really good at things like propagating a rule change to the engine, and handling in-flight
transactions so that an in-flight transaction doesn't get broken if you change the rules. All
of that kind of constant update is handled very effectively. So what this lets me do is it
lets me build these rules, build these decision agents in a very robust way, and then deploy them
as a service that I can then use to support my agentic framework by exposing them as agents. So
yeah, this handles, if you like, most of what's going on in an agent. If you think about these agents,
this is all very prescriptive. So this is really describing how I write rules. How do I
describe the rules, the logic that prescribes how this decision is made. But many decisions
have a probabilistic component too. If it's likely that this is James, we'll do one thing; if
it's likely that someone is impersonating James, we'll do something different. If it's likely
that this is a legitimate transaction, we'll do certain
things. So these are probabilistic elements that are typically built using predictive analytics,
machine learning from my historical data. So in an agentic framework, what does that look
like? Well, typically what I'm going to do is also deploy these machine learning components.
So I might have a prediction, for instance, of fraud. How likely is it that this person is the
person who's applying for this loan? I might have another one around credit risk. How likely are
they to pay us back? And I might have a third one, which is payoff risk. How likely are they to pay
us off early? And all of these agents are used by my origination agent as part of the
origination decision. So I need to be able to consume this. So how do I build those agents? Well,
I'm gonna use a machine learning platform to do that. I'm gonna use machine learning technology to do
it. And generally, with machine learning, you're going to do some kind of analysis. This might
be supervised, in the sense that there is a human user directing it, or it might be
unsupervised, where you're really just using the algorithm and letting it see what it finds in
your data. Which, of course, means you've got to have data. Generally for machine learning you
have a lot of data, so you have multiple databases that have to be combined, merged, and
managed. And you're going to do
something called feature engineering. So you're going to engineer a set of features. And features
are predictive characteristics of one kind or another, things that seem interesting. They can
be very simple: if you have a date of birth, you can come up with an age. They might classify
something: which age range is this customer in, less than 20, 20 to 30, 30 to 40, and so on,
because the range seems important. But they can get quite sophisticated. They can say
things like, how often have you been more than 30 days late in the last 180 days on a payment for a
bill? Well, that has to be calculated from all this data. So there's a lot of work to not just merge
this data, but calculate these features from it. And then I'm going to feed that data and my
features that I've created into my analysis, run these machine learning algorithms, neural networks,
regression models, decision tree analytics, all sorts of different analytic techniques to see if
I can find patterns or classifications or make predictions based on the historical data that
I've got. If it's supervised, I'm telling it what I'm looking for: can you tell me which
features will predict that this person will pay off the loan early? And if it's unsupervised,
I'm more looking for things like: is there anything unusual in here? What counts as an unusual
pattern of data? Because that might be indicative of a new kind of fraud, for instance. The
supervised analyses are generally driven by a data scientist or a machine learning engineer;
the unsupervised ones are generally kicked off and allowed to do their own thing. And then I'm
going to go
ahead and deploy these as endpoints that can be consumed by these agents. Now, we used to do a
lot of analytics in batch. So we would run these kinds of analyses and then update the database
with a bunch of scores. Today we're much more likely to deploy them as individual REST
endpoints that I can pass a JSON object to, to score and get a result back. And obviously once I
do that, once I have an endpoint, I can use MCP again, and I can deploy those as tools that I can
make available to my analytic agent. I now have analytic agents talking to deployed endpoints. And
those endpoints run essentially an algorithm that's been built from my historical data. So
they're not analyzing the historical data at runtime. What they're doing is they're using the
results of that analysis to say, okay, here's a formula that takes this data and calculates a
payoff risk for this customer. So I can see how likely it is this customer is going to pay it off
early and use that as part of my pricing. So I have all these analytic agents deployed into my
agentic AI framework. And then my decision service is going
to consume the results of those, those predictions, those probabilistic models as part of how it
makes the business decision to originate you or not. Now, these two types of technologies,
decision platforms and machine learning platforms, are quite separate from large language
models. But that doesn't mean they can't be enhanced with large language models. And there are
two areas in particular where we see a lot of work. One of them is this idea of a large
language model for ingestion. If I've got documents, if I've got brochures, if I've got
a recorded conversation, however I've recorded a bunch of data, large language models are
really good at extracting the data I need from it. So if I've got an origination
decision and it needs to know, for instance, details of the boat you want to borrow money for,
and I've got a brochure about that boat, then I can ingest that using a large language
model, feed it directly into my origination agent as input data. So this gives me tremendous
opportunity for making it much, much easier to supply the data I need. Often these decision
agents need a lot of data, and so being able to consume documents and turn them into data is
very effective. The other place we've seen really good uses is in explaining results. If you
think about it: I invoke this decision agent, and one of the things it's going to do is log
how it made the decision. It's going to create essentially a detailed log of how it made the
decision. How much detail goes in there is up to you, but you can look quite precisely at how
the decision was made. Which rules fired? How was the decision made? Now, that log is great for
you. It's great for long-term improvement, great for understanding how your engine worked, how
your decision agent behaved. It's not necessarily great for explaining it to a
human being, a call center rep or a customer. So one of the other use cases for LLMs is to take this
log data and turn it into an explanation. So now I can explain how that decision was
made. And I can ingest textual data that you give me. So I can use these LLMs to make it
easier to interact with my decision agent, both in and out. Now, there's one last step I wanted
to add, which is: how do I make these things learn, if I want them to get better over time?
What does that look like? How do I get my results onto an upward trajectory? Well, there's a
couple of things to say about that. It really varies
depending on the kind of agent you have. Many of the analytic agents will learn on their own
behalf. The unsupervised ones in particular will take new data and continually update
themselves as it comes in. So as you run them, they make predictions, they make scores, and new
data results in new scores. And so they constantly update their algorithms. Typically you have
some guardrails on that, so they can't change too much without telling someone. But you allow
them to essentially run experiments on their own data and experiment internally so that their
predictions evolve. So those kinds of agents, agents built on unsupervised analytic
techniques, are inherently learning. But other kinds of analytic
agents don't learn quite the same way. So you typically then have some data scientist doing
new analysis with new data and proposing a new model. They might do this every month, every
quarter, every week: they review the data up until yesterday, see what data has changed since
the last time they built the model, rerun the model, and see if the algorithm is noticeably
different. If it is, they typically will deploy a new endpoint. And that gets version
controlled just like any other code. So any analytic agent can
learn. It might learn automatically, but it might also learn because the data scientists are
responsible for keeping it up to date over time. But what about decision agents? Decision
agents don't really learn, right? The whole point of a decision agent is that it's concrete,
that it's got this hardened definition of how it behaves. And so you don't really want it
randomly changing its behavior. So there's a couple of things you can do. First of all, you can,
in the rule repository, you can code multiple versions. So you can put in: here's the old
version of the rules, here's the new version of the rules. And then put a rule in that says some
people get one, some people get the other. And I get to experiment to see which one works better.
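That routing can itself be written as a rule. A sketch, assuming we hash a customer ID so each customer consistently lands in the same group; the traffic share, names, and thresholds here are illustrative:

```python
import hashlib

CHALLENGER_SHARE = 10  # percent of traffic routed to the new rule version

def route_version(customer_id: str) -> str:
    """Deterministically assign a customer to champion or challenger.
    Hashing (rather than a random choice) keeps each customer in the
    same group on every decision, preserving consistency."""
    bucket = int(hashlib.sha256(customer_id.encode()).hexdigest(), 16) % 100
    return "challenger" if bucket < CHALLENGER_SHARE else "champion"

def decide(customer_id: str, application: dict) -> dict:
    version = route_version(customer_id)
    # Both rule versions live in the same repository; logging the
    # version is what lets someone compare outcomes later.
    threshold = 580 if version == "champion" else 600
    return {"version": version,
            "eligible": application["credit_score"] >= threshold}

print(decide("cust-001", {"credit_score": 590}))
```

From the outside this still looks like one agent; the version field in the log is what closes the loop.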
So I can run what's called A/B or champion challenger testing by writing rules in my
decision agent so that it looks like one agent to the outside world, but it's got these two
versions that it's running comparisons for so I can learn. And then I can have somebody look at
the log, see which one works better, and close the loop by adding more rules back into the rule
repository. The other thing I can do is start to think about the
overall agent and how the overall agent works. And this starts to get more involved. Because if
you think about it: if I want to improve my origination agent, what does it mean to make better
origination decisions? It means I lend money to people who pay me back. But they don't pay me
back all at once, right? The whole point of a loan is that you might pay me back over many
years. And so I can't really tell how good you're going to be at paying it back until some time
passes. So I
can't do a real-time feedback loop because it's nonsense, right? The idea that I'm going to find
out in real time whether this was a good loan decision is just silly, right? So I have to get a
log of how I made the decision, which scores and predictions I used, and what version
everything was, and store all that in my log. And then I need to wait some period of time, and
then somebody needs to come back, look at all this data, given this log data, the versions of
the analytics I used, and the results I got out of this origination decision as processed
through my workflow, and actually do that analysis work. So that requires a process
and structure that you can follow. You can do it with agentic AI, but you have to be a little bit
more thoughtful about how you would do it. It's not enough just to rely on the individual agents
to learn about their bit of the problem. Someone has to own the framework as a whole, the solution
as a whole, and systematically learn from how well that works.
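That closing-the-loop analysis is, at its core, a join: decision logs on one side, outcomes observed much later on the other, grouped by the rule or model version that was in effect. A minimal sketch with made-up records:

```python
# Sketch of the delayed feedback loop: join decision logs with
# outcomes observed much later, grouped by rule/model version.
# All records are illustrative.

decision_log = [
    {"app_id": 1, "version": "v1", "approved": True},
    {"app_id": 2, "version": "v1", "approved": True},
    {"app_id": 3, "version": "v2", "approved": True},
]

# Outcomes only known months or years after the decision was made.
outcomes = {1: "repaid", 2: "defaulted", 3: "repaid"}

def repayment_rate_by_version(log, outcomes):
    stats = {}
    for entry in log:
        if not entry["approved"]:
            continue  # no repayment outcome to learn from if we never lent
        good = outcomes[entry["app_id"]] == "repaid"
        totals = stats.setdefault(entry["version"], [0, 0])  # [repaid, total]
        totals[0] += good
        totals[1] += 1
    return {v: repaid / total for v, (repaid, total) in stats.items()}

print(repayment_rate_by_version(decision_log, outcomes))
# {'v1': 0.5, 'v2': 1.0}
```

The point is that someone has to own this analysis for the solution as a whole; no individual agent sees enough to do it on its own.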
How do decision agents complement LLMs in automation? Blue Polaris Executive Partner, James Taylor, explores designing decision agents with DMN, LLMs, and machine learning models to create transparent, efficient systems.