Loading video player...
Why does this driverless car that I'm in keep driving around this parking lot in circles?
Is this a bug in the software or somebody playing a joke on me?
Because I can't figure out how to make this thing stop. You can probably
tell I'm not really in an autonomous vehicle right now, but this scenario has actually happened
to people, and it's a little bit scary. AI ... AI agents, like driverless
cars, need regular evaluation to make sure they're safe and effective. AI agents are
goal-based systems that use LLMs to act autonomously and carry out tasks. Users
set high-level goals, but they don't need to give explicit instructions every step of the way. The
agent decides how it accomplishes these goals. So what can organizations do to rein in
AI agents and make sure they're aligned with their intentions? If we're going to depend on AI
agents, we need them to be reliable. This is where a governance framework comes in.
How can we build a framework to make sure our agents act in ways that uphold our values?
I'll walk you through a framework design process focusing on five pillars of agentic AI
governance considering policies, processes and controls for each pillar.
Our first governance pillar is alignment.
An alignment strategy establishes trust that our agents behave consistently with our values and
Intentions.
Things that we can do to create agentic alignment are: create a code of ethics.
This states the organization's values, ethics and standards of conduct. This should be embedded
into every agent development project. Define metrics and tests
for detecting goal drift. These tests can be run before deployment and then regularly afterwards
to make sure agents stay aligned with intentions. Assemble a governance review
board to make sure agents comply with regulations like
the EU AI act and to review test results and approve deployments.
Automate audits that check agent outputs against specifications.
We can also create risk profiles based on
organizational risk preferences and then encode these into agent parameters during development.
Our second pillar is control.
A control strategy will make sure our agents operate with predefined boundaries.
Make an action authorization policy.
Delineate which actions agents can take autonomously and which require a human in the
loop. Build a tool catalog to make sure
only approved tools are used by agents. And these tools might include things like databases,
APIs and plug-ins. A tool catalog can also capture tool lineage,
helping us know which agents are using which tools. We can also conduct shutdown and
rollback drills to test intervention speed and rollback procedures with
simulations of agent misbehavior. Design a kill switch mechanism,
including soft stops for orderly shutdowns and hard stops for emergency termination at the
orchestration layer. Keep activity logs that record
every agent action as well as inputs and outputs so you can reverse or modify these if
needed. Our third pillar is visibility.
Visibility strategies make AI agents' actions observable and understandable.
Assign unique agent IDs to every agent so
we can trace behavior across environments. Define an incident investigation protocol
with clear steps when unexpected actions happen, from log retrieval to root cause analysis.
Evaluate cooperation capabilities between agents by
automating continuous testing for multi-agent interactions. Assess how agents are
cooperating to detect coordination failures before they impact users.
Our fourth governance pillar is security.
Security strategies protect data, keeps us secure from external threats
and ensure reliable performance.
Create a threat modeling framework
to help identify and mitigate potential security threats like prompt injections, adversarial inputs,
and vulnerabilities. Build a sandboxed environment
so agents can run an isolated, monitored environment that prevents unauthorized access and
data transmission. Do regular adversarial testing.
Challenge agents with adversarial inputs to evaluate resilience and make sure they perform
well if they're attacked. We can also build in access controls
so only authorized users can access agents to provide instructions. The fifth and
last pillar of our governance framework is societal integration.
Societal integration addresses issues like agent accountability,
inequality and concentration of power while supporting harmonious
integration.
Define an accountability strategy that outlines legal
responsibility among developers, business owners, auditors and users. Create
a plan for regulatory engagement
to maintain active dialogs with industries and regulators, to shape standards.
Build a legal rules engine that enacts legislation checks
so agents automatically vet proposed actions against laws. Interestingly, we could build
specialized governance agents to automate some of these governance tasks and enforce policies for
us. This framework is not a one-size-fits-all solution. It's adaptable, so it can be
modified to fit a wide variety of organizational goals and strategies. One thing I want to
highlight about agentic AI governance is that it's a continuous, evolving process and not just a
one-time checklist. So your framework should be iterated upon as agents and regulations continue
to change because AI agents will continue to grow in capabilities.
Ready to become a certified Architect - Cloud Pak for Data? Register now and use code IBMTechYT20 for 20% off of your exam → https://ibm.biz/Bdenya Learn more about AI agent Governance here → https://ibm.biz/BdenyG Are AI agents trustworthy? 🤔 Amanda Winkles explores how to build an AI agent governance framework focused on alignment, control, visibility, security, and societal integration. Discover actionable strategies to ensure AI agents are ethical, reliable, and secure in evolving environments. 🌐 AI news moves fast. Sign up for a monthly newsletter for AI updates from IBM → https://ibm.biz/BdeVet #ai #aiagents #aiframeworks