Whether you're building an experimental prototype for your own personal use or creating an
application to power an entire organization, there are key components of the AI technology stack
that you must get right to build AI systems that do more than just generate answers and actually
solve real, meaningful problems. Say, for instance, I'm building an AI-powered application to help drug
discovery researchers understand and analyze the latest scientific papers in their domain. Maybe it
starts with a model that I recently heard about that is supposed to be better
at highly complex tasks, like those of a PhD researcher. The model is an important layer of the
stack, but it's just one piece of the puzzle. There's also the infrastructure that that model
will run on, because not all LLMs (large language models) can run on
standard enterprise CPU-based servers, and not all are small enough to run on a laptop. So it
matters what infrastructure you have access to and how you choose to deploy it. Next is data
because in this example, the whole point is to help scientists understand the latest papers in
their field. And models typically have a knowledge cutoff date. So if we want to talk about papers
from, say, the past three months, that means we have to provide the AI system with extra data.
That will be the data layer. Next would be the orchestration layer, because a complex task like
this is probably going to require more than simply providing a large prompt to the AI system and
getting a single output out. Instead, we'll want to break that user query up into different parts:
plan how the AI solution is going to actually tackle this problem and what data it needs, then do
the summarization and create an answer, and maybe even review that answer. Finally is the
application layer, because at the
end of the day, there's a user using this tool. So there will have to be an interface that defines
what the inputs will be and what the outputs will be. It might not be as simple as text in and text
out. And there's also the issue of integrations. So, will the actual results of this be something
that's integrated into other tools that this user uses? It's important to understand
all the layers of the AI stack, whether you're building a solution from scratch or using
solutions which might manage several of these layers for you as a service. This is because
across the stack, from the hardware, all the way up to the user interface level, the choices you make
will have important implications for your solution's quality, its speed, its cost and its
safety. When it comes to infrastructure, LLMs generally require AI-specific hardware,
specifically GPUs, and these can be deployed in one of three ways. The first would be on premise,
that is, assuming you have the means and resources to buy this kind of infrastructure yourself.
Second option would be cloud, and that would allow you to rent this capacity and be able to scale it
up or down as needed. Finally would be local, which usually means on your
laptop. Not all laptops can support LLMs of different sizes, but there are certainly LLMs on
the smaller end of the range that can be run on the kind of GPUs available in a standard laptop.
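As a rough way to reason about what hardware a model needs, you can estimate the memory the weights alone will occupy from the parameter count and the numeric precision. This is a back-of-the-envelope sketch, not a formula from the video; the 7B example is illustrative and real deployments also need headroom for activations and the KV cache.

```python
# Rough rule of thumb: memory for weights ≈ parameter count × bytes per weight.
# Real systems need extra headroom (activations, KV cache), so treat this as a floor.

def estimated_memory_gb(num_params_billions: float, bytes_per_weight: float) -> float:
    """Approximate memory footprint of a model's weights, in gigabytes."""
    return num_params_billions * 1e9 * bytes_per_weight / 1e9

# A hypothetical 7B-parameter model:
print(estimated_memory_gb(7, 2))    # 16-bit weights: ~14 GB, beyond most laptop GPUs
print(estimated_memory_gb(7, 0.5))  # 4-bit quantized: ~3.5 GB, feasible on many laptops
```

This is why quantization is often what makes "local" deployment possible: shrinking the bytes per weight can move a model from data-center hardware down to a consumer GPU.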
The next layer is models. So AI builders have plenty of choice when it comes to what model they
can use. One dimension to consider is whether the model is open
or proprietary. Another dimension is the model
size. So we have large language models; we also have small language models that might
be lighter weight and able to fit on more modest hardware, but might not have exactly
the same thinking capacity as a large language model and instead be specialized for more
specific things. Finally is specialization,
which sometimes goes hand in hand with size. Some models might perform better on things like
reasoning or tool calling or generating code. Others might have different language strengths
than others. There are plenty of models to choose from, with over 2 million already in model
catalogs like Hugging Face, that can serve any mix of these different needs that an AI builder might
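To make those three selection dimensions concrete, here is a minimal sketch of filtering a model catalog by openness, size, and specialization. The catalog entries and field names are invented placeholders, not real model listings or a real catalog API.

```python
# Toy model catalog; "params_b" is parameter count in billions,
# "skills" is the model's specialization. All entries are invented.
catalog = [
    {"name": "model-a", "open": True,  "params_b": 7,   "skills": {"code"}},
    {"name": "model-b", "open": False, "params_b": 175, "skills": {"reasoning"}},
    {"name": "model-c", "open": True,  "params_b": 3,   "skills": {"reasoning", "code"}},
]

def pick_models(catalog, open_only=False, max_params_b=None, required_skill=None):
    """Filter candidate models along the three dimensions: openness, size, specialization."""
    results = []
    for m in catalog:
        if open_only and not m["open"]:
            continue
        if max_params_b is not None and m["params_b"] > max_params_b:
            continue
        if required_skill is not None and required_skill not in m["skills"]:
            continue
        results.append(m["name"])
    return results

# An open model under 10B parameters that is specialized for reasoning:
print(pick_models(catalog, open_only=True, max_params_b=10, required_skill="reasoning"))
# → ['model-c']
```

In practice the same filtering happens through a catalog's search interface, but the trade-off logic is the same: constraints on openness and size narrow the field, and specialization picks the winner.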
have. The next layer of the stack is data. This breaks up into a few different components,
so the first would be data sources themselves to supplement the model's knowledge.
This could also include the pipelines to do any processing,
pre-processing, post-processing of that data, as well as vector
databases you may use, or retrieval systems for retrieval-augmented generation,
also known as RAG. The vector database is where that external data is actually vectorized
into embeddings and saved, so your model can retrieve that context more quickly and augment its
answers with additional knowledge that the base model does not have. That's important because base
models are usually trained on publicly available information, which might not always be complete to
accomplish the task that you have. You might need to supplement with additional data. The next layer
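The retrieval step at the heart of RAG can be sketched in a few lines: documents live as embedding vectors, the query is embedded the same way, and the nearest documents are pulled in as extra context. In a real system an embedding model produces the vectors and a vector database does the search; here the tiny hand-made vectors and paper names are purely illustrative.

```python
import math

# Pretend embeddings for three recent papers (invented; real embeddings
# come from an embedding model and have hundreds of dimensions).
docs = {
    "paper-1": [0.9, 0.1, 0.0],
    "paper-2": [0.1, 0.8, 0.1],
    "paper-3": [0.8, 0.2, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, docs, k=2):
    """Return the k document ids most similar to the query embedding."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return ranked[:k]

# A query whose embedding points mostly along the first dimension:
print(retrieve([1.0, 0.0, 0.0], docs))  # → ['paper-1', 'paper-3']
```

The retrieved passages are then prepended to the prompt, which is how the system answers questions about papers published after the model's knowledge cutoff.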
is orchestration, because building an AI system that does something more complex than
just generating text or answering questions requires breaking the initial user input down
into smaller tasks. Those can start with things like thinking,
using the model's reasoning ability to plan out how it will tackle the problem. That
can also include things like execution, where the model does tool calling
or function calling, as well as steps like
reviewing, where an LLM can actually provide its own critique of the initial generated
responses and initiate feedback loops to even improve those responses. This layer
is very quickly evolving, with new protocols like MCP and new architectures for how to best
orchestrate increasingly complex tasks.
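The plan, execute, and review steps described above can be sketched as a skeletal orchestration loop. The `llm_plan`, `llm_summarize`, and `llm_review` functions are stand-ins for real model calls, and the tool registry is an invented example of function calling; a production orchestrator would be far more dynamic.

```python
def search_papers(topic):
    """Stand-in for a real retrieval tool the model can call."""
    return f"3 recent papers on {topic}"

TOOLS = {"search_papers": search_papers}  # the function-calling registry

def llm_plan(user_query):
    # A real planner would be a model call; here the plan is hard-coded.
    return [("search_papers", user_query), ("summarize", None)]

def llm_summarize(context):
    return f"Summary of: {context}"

def llm_review(draft):
    # A reviewer model could critique the draft and trigger a revision loop;
    # here it simply accepts drafts that already look like summaries.
    return draft if draft.startswith("Summary") else llm_summarize(draft)

def orchestrate(user_query):
    context = ""
    for step, arg in llm_plan(user_query):        # thinking / planning
        if step in TOOLS:
            context = TOOLS[step](arg)            # execution / tool calling
        elif step == "summarize":
            context = llm_summarize(context)      # generation
    return llm_review(context)                    # review / feedback

print(orchestrate("protein folding"))
# → Summary of: 3 recent papers on protein folding
```

Even this toy loop shows the key idea: the user's query is decomposed into steps, and each step can route to a tool, a model call, or a critique pass rather than one monolithic prompt.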
Next is the application layer. The most widely used AI systems do follow a pretty simple design of text in and text
out. But as we use these tools in our work and lives, there are important features that become critical
for the actual usability of AI, and these factors make up the application layer.
First factor is interfaces.
The most classic interface is text in and text out,
but there are other modalities that can be very valuable for certain tasks too, like image, audio,
numerical data sets, and plenty of other custom data formats.
Also, in the interface,
it's really important to keep in mind the ability to do things like revisions or
citations so that when the user sees what the model comes up with, they have the ability to edit
that or inquire on it further. The second consideration is integrations, and that
comes in two forms: allowing other tools that the user uses to
actually send inputs to the AI system, and taking the
model outputs and automating how they get integrated into some of the tools that they use
in their day-to-day work. Altogether, these layers of the AI stack, from the
hardware to the models, the data you use, how you orchestrate it, and the application and the
usability of it, matter because when we have a clear understanding of how they fit together, we
can see what's truly possible and make practical choices to design AI systems that are reliable,
effective and aligned to our real-world needs.
Lauren McHugh Olende explains how LLMs, vector databases, and orchestration layers integrate with AI hardware to power real-world systems.