I think by now most of us are familiar
with the term prompt engineering.
It's the process of crafting the input text used to prompt a large language model,
including instructions and examples and formatting cues. It's
what steers the LLM's behavior and output. Now,
context engineering, on the other hand,
is the broader discipline of programmatically
assembling everything the LLM sees during inference.
Now that includes prompts, but also retrieved documents, memory, and tools—everything
needed to deliver accurate responses. So,
to demonstrate the difference,
let me introduce you to an agentic
AI model that I like to call Graeme.
Secret Agent Graeme.
Agent Graeme specializes in travel booking. So, if I send Graeme
this prompt, "Book me a hotel in Paris
for the DevOps conference next month," well,
the agent responds with "Sure thing.
The Best Western Paris Inn has great
wifi and free parking.
It's booked." Cool.
But the only trouble is the Best
Western is located in Paris, Kentucky, and
that DevOps conference is in Paris, France.
Now, you could argue that that's a failing of prompt engineering.
I wasn't specific on the location. But
it could also be seen
as a failing of context engineering
because if Agent Graeme here was just a little smarter, well,
they could have used a tool to check my calendar
or look up the conference online to find the right location.
So ... so let's try again with a follow-up prompt:
"My conference is in Paris, France."
And Graeme responds, "€900 a night. Ritz booked.
Champagne. Breakfast included."
Well, uh, wish me luck getting that one
approved through my company expense reimbursement system.
But Graeme here can't really be blamed for that one
because I didn't provide sufficient context.
I should have made my company's travel policy available to the agent.
Perhaps there's a JSON file specifying things
like maximum permissible hotel rates for the area.
So, prompt engineering—that's
the craft of wording the instruction itself.
And context engineering is the system-level
discipline of providing the model
with what it needs to plausibly accomplish the task.
So let's take a look at these two terms a bit closer.
And we'll start with the key techniques
that make prompt engineering effective.
Now this is part art, part science.
But there are several prompt engineering techniques
that are now widely adopted. So,
take for example, role assignment.
This tells the LLM who it should be.
So, "You are a senior Python developer
reviewing code for security vulnerabilities." Well,
that produces vastly different outputs
than a more generic code review request.
The model adopts the expertise, the vocabulary,
and the concerns of the persona we asked for.
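Just to make that concrete, here's a rough sketch of what role assignment can look like when you build the prompt in code. The chat-message structure and the snippet being reviewed are illustrative assumptions, not any particular library's API.

```python
# Role assignment sketch: the system message tells the LLM who it should be.
# The message format and the code under review are made up for illustration.
code_snippet = 'query = f"SELECT * FROM users WHERE name = \'{user_input}\'"'

messages = [
    {"role": "system",
     "content": "You are a senior Python developer reviewing code for security vulnerabilities."},
    {"role": "user",
     "content": f"Review this line and flag any issues:\n{code_snippet}"},
]
```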
Another good technique
comes down to few-shot examples. So,
this is show, don't
just tell. So,
providing 2 or 3 examples of input/output pairs ...
that helps the model understand your exact format and style requirements.
So, if you want JSON output with specific field names, well,
show it. Show it in the examples.
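Here's a small sketch of what a few-shot prompt for strict JSON output might look like. The fields and example requests are invented purely for illustration.

```python
# Few-shot prompting sketch: show the exact output format instead of only describing it.
few_shot_prompt = """Extract the city and month from each request and answer as JSON.

Request: Book me a hotel in Tokyo for the March sales kickoff.
Output: {"city": "Tokyo", "month": "March"}

Request: I need a room in Berlin during the July offsite.
Output: {"city": "Berlin", "month": "July"}

Request: Book me a hotel in Paris for the DevOps conference next month.
Output:"""
```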
Now, before we had reasoning models trained with reinforcement
learning, a pretty popular prompt engineering
technique was called CoT,
or chain-of-thought prompting.
Now this forces the model to show its work. Adding
"let's think step by step" or "explain your reasoning"—that
prevents the LLM from jumping to conclusions.
And it's particularly powerful for complex reasoning tasks.
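A minimal sketch of chain-of-thought prompting can be as simple as this; the travel question is just a made-up example.

```python
# Chain-of-thought sketch: append an instruction that forces intermediate reasoning.
question = ("My flight lands at 14:35 and the transfer to the venue takes 50 minutes. "
            "Can I make a 15:15 session?")
cot_prompt = question + "\nLet's think step by step, then give a final yes/no answer."
```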
And then another technique is called
constraint setting.
Here you define boundaries explicitly. So,
"limit your response to only 100 words"
or "only use information from the provided context".
And that helps prevent the model
from going off on tangents.
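And a sketch of constraint setting is just as simple; the policy excerpt here is invented.

```python
# Constraint-setting sketch: define explicit boundaries for the response.
policy_excerpt = "Hotel stays are capped at EUR 250 per night in major EU cities."

constrained_prompt = (
    "Summarize the travel policy excerpt below.\n"
    "Limit your response to 100 words.\n"
    "Only use information from the provided context; if something is not covered there, say so.\n\n"
    f"Context:\n{policy_excerpt}"
)
```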
Now, context engineering helps build dynamic agentic
systems by orchestrating the entire agentic environment.
And let's take a look at some of the components of that.
Well, first of all,
agentic AI needs memory.
And memory management can be thought of in two
forms. So there's short-term memory ... that
might involve summarizing long conversations
to stay within context windows so that past conversations are not forgotten.
And then there's also long-term memory, and
that uses vector databases to retrieve things
like user preferences and past trips and learned patterns.
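As a rough sketch, those two memory forms might be handled along these lines. The summarize and vector_search helpers stand in for an LLM summarization call and a vector database query; they are assumptions, not a specific product's API.

```python
# Memory-management sketch. summarize() and vector_search() are hypothetical stand-ins
# for an LLM summarization call and a vector-database query.
conversation = []        # short-term memory: the running chat history
long_term_store = []     # long-term memory: stands in for a vector database

def remember_turn(role, text, summarize, max_turns=20):
    """Append a turn; compress older turns so the history stays within the context window."""
    conversation.append({"role": role, "content": text})
    if len(conversation) > max_turns:
        summary = summarize(conversation[:-5])   # condense everything but the recent turns
        del conversation[:-5]
        conversation.insert(0, {"role": "system", "content": f"Conversation so far: {summary}"})

def recall(query, vector_search, top_k=3):
    """Retrieve user preferences, past trips, or learned patterns from long-term memory."""
    return vector_search(long_term_store, query, top_k)
```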
Then there is state management. Now,
this answers the question: where are we in a multi-step process? So,
if an agent is booking a complete trip—the flight,
the hotel, the ground transportation, all of it—
well, the agent needs to maintain state across these operations.
Did the flight booking succeed?
What's the arrival time for scheduling
the airport transfer? Stuff like that.
So, state ensures that the agent doesn't lose context mid-task.
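Here's a minimal sketch of that kind of state for the trip example; the field names and values are invented.

```python
# State-management sketch: explicit state for a multi-step booking.
trip_state = {
    "flight":   {"status": "confirmed", "arrival_time": "14:35"},
    "hotel":    {"status": "pending"},
    "transfer": {"status": "not_started"},
}

def next_step(state):
    """Pick the next action based on what has already succeeded."""
    if state["flight"]["status"] != "confirmed":
        return "book_flight"
    if state["hotel"]["status"] != "confirmed":
        return "book_hotel"
    # Only schedule the airport transfer once the arrival time is known.
    return f"schedule_transfer_after_{state['flight']['arrival_time']}"

print(next_step(trip_state))  # -> book_hotel
```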
Now, another important component is retrieval-augmented
generation, or RAG,
which connects an agent to dynamic knowledge sources.
So, RAG uses hybrid search which combines
semantic and keyword matching based on context.
So, when retrieving your company's travel policy,
RAG isn't returning the entire travel policy document.
There's a lot of stuff that's just kind of irrelevant to the context in there.
So instead, it's picking out the relevant sections and the relevant exceptions
and returning only those contextually relevant parts
back to the agent.
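A toy sketch of that hybrid retrieval might look like this. The scoring is deliberately simplified, and embed_similarity stands in for a real embedding model; the policy sections are invented.

```python
# RAG hybrid-search sketch: combine keyword overlap with a (hypothetical) semantic score
# and return only the most relevant policy sections.
policy_sections = [
    "Hotel rate cap: EUR 250 per night in major EU cities, including Paris.",
    "Exceptions to the rate cap require written manager approval before booking.",
    "Minibar and pay-per-view charges are never reimbursable.",
]

def hybrid_search(query, sections, embed_similarity=None, top_k=2):
    scored = []
    for text in sections:
        keyword_score = len(set(query.lower().split()) & set(text.lower().split()))
        semantic_score = embed_similarity(query, text) if embed_similarity else 0.0
        scored.append((keyword_score + semantic_score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

print(hybrid_search("maximum hotel rate for Paris", policy_sections))
```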
And agents also need access to tools
so they can actually go out and do stuff.
So LLMs by themselves,
they can't check real databases or call APIs or execute code.
It's tools that bridge that gap, and a tool
might query a SQL database, or it might fetch live
pricing data, it might deploy infrastructure.
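To sketch what that looks like, a tool is typically a plain function plus a machine-readable description the model can read. Everything below, the function, the schema, and the field names, is illustrative rather than any particular framework's format.

```python
# Tool sketch: the function does the work; the description, usage guidance,
# and constraints are what steer the LLM toward calling it correctly.
def get_hotel_rates(city: str, max_price_eur: int) -> list[dict]:
    """Query live hotel pricing (stubbed here with static data for illustration)."""
    return [{"hotel": "Example Inn", "city": city, "price_eur": min(max_price_eur, 240)}]

get_hotel_rates_spec = {
    "name": "get_hotel_rates",
    "description": "Fetch current hotel rates for a city.",
    "when_to_use": "Only when the user asks to book or price a hotel stay.",
    "constraints": "max_price_eur must not exceed the company travel policy cap.",
    "parameters": {
        "city": {"type": "string", "description": "Destination city, e.g. 'Paris, France'."},
        "max_price_eur": {"type": "integer", "description": "Upper price bound per night."},
    },
}
```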
And where context engineering comes in
is in defining the interfaces that guide
the LLM toward the correct usage. And tool descriptions—
they specify what the tool does,
when to use it, and what constraints
apply. And prompt engineering? Well,
actually we should include that as well
because that is also part of
context engineering.
You can take a base
written prompt like "analyze security logs for anomalies"
and then, at runtime,
inject it with current context,
like recent alerts and known false positives.
And all of those variables in the prompt,
they get populated from the states
and the memory and the RAG
retrievals. So, that final prompt might be 80% dynamic content from there
and 20% static instructions.
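Here's a minimal sketch of that runtime assembly. The alerts and false positives are invented; in a real agent they would be filled in from state, memory, and RAG retrievals.

```python
# Prompt-assembly sketch: a small static instruction plus mostly dynamic, injected context.
STATIC_INSTRUCTIONS = "Analyze the security logs below for anomalies. Ignore known false positives."

# In a real agent these would come from state, memory, and RAG retrievals.
recent_alerts = [
    "03:12 42 failed SSH logins from 10.0.0.7",
    "03:41 new admin account created outside change window",
]
known_false_positives = ["Nightly backup job logs in as admin between 03:30 and 04:00."]

final_prompt = (
    STATIC_INSTRUCTIONS
    + "\n\nRecent alerts:\n" + "\n".join(f"- {a}" for a in recent_alerts)
    + "\n\nKnown false positives:\n" + "\n".join(f"- {fp}" for fp in known_false_positives)
)
print(final_prompt)
```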
So, prompt engineering gives you better questions.
Context engineering—that gives you better systems
when you combine them properly.
Hotel booked.
Paris, France.
Under budget. Near the venue.
Excellent. Thank you, Agent Graeme.
Pending approval from your manager, HR and finance.
Estimated approval time:
6 to 8 weeks.
Uh, the conference is in two weeks.
Have you tried prompt engineering your manager?