Let's say you've been spending countless
nights in your basement researching,
experimenting, and training. And after
months of effort, you finally built a
super advanced LLM before anyone else.
Now, you want to figure out how to
deploy this LLM and make it available to
the public so people can actually start
using it. Except there's one major
problem: the model you built needs infrastructure to actually run it. So,
somehow you have to figure out how to
deploy this model, and you're not
exactly sure how. It's not like you can
just upload it to Google Drive and let
people access it that way. So the
question comes down to this. How do I
deploy this model so that people can
actually start using it? Since your
computer probably can't run this model,
the first thing you might do is to look
up cloud systems like AWS, Google Cloud,
or Azure to host this model in the
cloud. And once you have the proper infrastructure set up and the model serving inference, you can now sit
back and enjoy as people start using
this model for the first time. And
luckily, you got millions and millions
of people excited to use this model. But here's the problem: you haven't actually set up a proper way to manage the load, and now requests are slowing down and timing out, and people are complaining that the model is not very stable. Now you realize that you need to
implement a system. A system that helps you manage the infrastructure that is running the model. And luckily, you find out there's a system called Kubernetes that helps with this very problem. With Kubernetes, you can now set up load balancing, build a resilient serving setup, and scale up and down depending on demand. And thankfully, with the help of Kubernetes, you can sit back and allow the machines to run smoothly on their own. Now the model is live. Users are flooding in to try out the model, and life is good. And people using your system are now requesting
more features. They want updates. They
want fine-tuned versions. And they want
custom APIs for enterprise use cases.
All of which has huge potential to bring in more users, as long as you can actually meet these demands. So even though you have the infrastructure layer that runs the model, and you have a system in place that manages the infrastructure, you realize that you don't really have a workflow that keeps the model flexible as business needs change. You want to
implement a machine learning workflow
that helps you develop and deploy this
model. So you divide the workflow into two large phases: development and
production. For the development phase,
you need to do some data preparation, where you take the raw data that's typically used to train AI models and do some feature engineering to extract only the meaningful data and prepare it as the training data that will be used to train the model. After the data
preparation, you want to actually start
doing some model development, where you can get creative building and modifying AI models that might be best suited for what you're trying to do.
Once the model is ready, you need to
have a workflow that can support
actually training the model with the data that you prepared. And since training is a computationally heavy step, this workflow also needs to allocate the proper GPU resources and spin them up and down as needed for training the model.
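To make this concrete, here is a toy, stdlib-only Python sketch of such a development workflow: data preparation, training, and hyperparameter optimization. Every name here, and the threshold "hyperparameter", is invented for illustration; a real pipeline would train an actual model on real features.

```python
# Hypothetical sketch of the development phase described above.
# Function names (prepare_data, train, tune) are illustrative, not a real API.

def prepare_data(raw):
    """Feature engineering: reduce raw text to one meaningful feature (length)."""
    return [(len(text), label) for text, label in raw if text]

def train(data, threshold):
    """Toy 'model': predict spam (1) when the feature exceeds a threshold.
    Returns accuracy as the training metric."""
    correct = sum(1 for feature, label in data if (feature > threshold) == bool(label))
    return correct / len(data)

def tune(data, thresholds):
    """Model optimization: try each hyperparameter value, keep the best."""
    return max(thresholds, key=lambda t: train(data, t))

raw = [
    ("long spammy message here", 1),
    ("hi", 0),
    ("ok", 0),
    ("buy now cheap meds", 1),
    ("a somewhat long normal note", 0),
]
data = prepare_data(raw)
best = tune(data, thresholds=[2, 5, 20])
```

In a real workflow each of these steps would be a separate, GPU-aware job; the point is only the shape of the pipeline, not the toy model.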
And finally, your workflow needs to include model optimization, where you can try different hyperparameters and optimize the model before it's finalized. And
the development phase then hands all of this over to the production phase, which serves the model in production for people and applications to use. Orchestrating this entire
workflow is not an easy endeavor, and you'll soon find out that Kubeflow is a system for deploying, scaling, and managing AI platforms, which in our case is exactly what we need. So now we have cloud providers serving the model on their infrastructure, Kubernetes as the system that manages containers, scaling, and networking, and now Kubeflow managing the ML lifecycle. You can now confidently serve this brand-new model for people to use, and stay agile as needs change. Now that we've covered the theory of where Kubeflow actually fits in, let's run some labs so you can learn how to use the system for managing ML workflows.
Welcome to this hands-on lab on Katib, Kubeflow's powerful hyperparameter optimization component. Finding the
perfect hyperparameters for machine
learning models can take weeks of manual
experimentation. In this lab, you'll
learn how to automate this entire
process using Kubernetes native
workflows that scale effortlessly. Let's
start by understanding where Katib fits in the Kubeflow ecosystem. Kubeflow has a three-layer architecture designed for production machine learning. The control plane manages ML workloads with specialized components: Katib handles hyperparameter tuning, Pipelines orchestrates complex workflows, and KServe deploys models with autoscaling. The data plane is where the actual training jobs execute on Kubernetes pods, leveraging the cluster's compute resources. The orchestration layer is Kubernetes itself, providing enterprise-grade capabilities like parallel experiments, resource isolation, automatic failure recovery, and multi-tenancy. This isn't just academic: companies like Spotify, PayPal, and Lyft run their production ML platforms on this architecture. After
reviewing this architecture, a quick
knowledge check confirms your
understanding. Now, let's dive into Katib's core concepts. Think of an experiment as a complete hyperparameter tuning job from start to finish. Each trial within that experiment tests one specific combination of parameter values. The objective is the metric you want to optimize, whether that's maximizing accuracy or minimizing loss.
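These three terms can be sketched in plain Python. This is an illustration of the terminology only, not the Katib API: the whole loop below is one "experiment", each iteration is a "trial", and the score being maximized is the "objective". The objective function itself is a made-up stand-in for a real training run.

```python
import itertools

def objective(lr, batch_size):
    # Stand-in for a real training run that returns validation accuracy;
    # peaks at lr=0.01, batch_size=32 by construction.
    return 0.9 - abs(lr - 0.01) * 10 - abs(batch_size - 32) / 1000

# The experiment's search space: valid values for each parameter.
search_space = {"lr": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}

# The experiment: run one trial per parameter combination.
trials = []
for lr, bs in itertools.product(search_space["lr"], search_space["batch_size"]):
    trials.append({"lr": lr, "batch_size": bs, "metric": objective(lr, bs)})

# The best trial is the one that maximized the objective.
best_trial = max(trials, key=lambda t: t["metric"])
```

Katib does the same bookkeeping for you, except each trial runs in its own Kubernetes pod.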
You'll see Katib's internal architecture diagram showing how the experiment controller coordinates the trial controllers and suggestion services.
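As a rough mental model of that coordination (purely illustrative, not Katib's actual code), the control loop looks like this: the suggestion service proposes the next parameters, a trial runs and reports its metric, and the controller records the result before asking for the next suggestion. Random search stands in for the suggestion logic here.

```python
import random

def suggestion_service(history):
    """Propose the next parameter to try; here, random search over [0, 1].
    A smarter service would use `history` to pick more promising values."""
    return random.uniform(0, 1)

def run_trial(x):
    """Stand-in for a trial pod: a metric that peaks at x = 0.7."""
    return 1 - (x - 0.7) ** 2

random.seed(0)
history = []  # (params, metric) pairs the controller has observed
for _ in range(20):  # the experiment's trial budget
    x = suggestion_service(history)
    history.append((x, run_trial(x)))

best_x, best_metric = max(history, key=lambda h: h[1])
```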
Understanding these relationships is
crucial. We also compare four popular search algorithms to help you choose the right one. Random search simply samples parameter combinations at random; it's simple but doesn't learn from previous results. Bayesian optimization is smarter: it builds a probabilistic model of the objective function and uses previous trials to intelligently select the next parameters to test. Grid search exhaustively tests every combination in your search space, which is thorough but computationally expensive. Hyperband takes a different approach, using adaptive resource allocation to quickly eliminate poor-performing configurations. With concepts clear,
it's time to install Katib. We're deploying version 0.17 using a Python setup script. The script handles everything: deploying the Katib controllers, the database manager, MySQL for persistence, and the web UI. The script waits patiently for all pods to reach a running state, which takes about 2 to 3 minutes, and finally, it configures the UI as a NodePort service so you can access it easily from your browser. Accessing the Katib web
interface is straightforward: simply click the Katib UI button at the top of your lab interface. The button automatically includes the correct /katib/ path. Visual guides show you exactly where to find this button and what the experiment dashboard looks like when it loads. Don't forget to select the kubeflow namespace from the drop-down menu; this is where all your experiments will appear.
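Before moving on, the contrast between the grid and random search strategies compared earlier can be previewed with a small stdlib-only sketch. The objective function and search ranges are invented for illustration:

```python
import itertools
import random

def objective(a, b):
    # Toy objective, maximized at a=3, b=1.
    return -((a - 3) ** 2 + (b - 1) ** 2)

# Grid search: exhaustively test every combination on a fixed grid
# (thorough but computationally expensive as dimensions grow).
grid_a = [0, 1, 2, 3, 4]
grid_b = [0, 1, 2]
grid_best = max(itertools.product(grid_a, grid_b), key=lambda p: objective(*p))

# Random search: same trial budget (15), sampled uniformly from the ranges;
# it doesn't learn from previous results, it just samples.
random.seed(42)
samples = [(random.uniform(0, 4), random.uniform(0, 2)) for _ in range(15)]
random_best = max(samples, key=lambda p: objective(*p))
```

Bayesian optimization and Hyperband would reuse the trial history instead of ignoring it, which is exactly why they tend to converge with fewer trials.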
Before running any experiments, we
verify your Python environment is ready.
The verification script checks for
essential packages: the Katib SDK for programmatic experiment submission, the Kubernetes client for cluster interaction, scikit-learn for machine learning, and pandas for data manipulation. Running this check prevents frustrating runtime errors later. Now let's understand the anatomy of a Katib experiment before we run
one. Every experiment needs four key elements: the objective function you
want to optimize, a search space
defining the valid range of each
parameter, the optimization algorithm to
use, and the total number of trials to
run. The beauty of Katib is that it runs
these trials in parallel across separate
Kubernetes pods, automatically logging
all parameters and metrics. No manual
tracking is required. Time for the
exciting part: running your first experiment. Experiment 1 optimizes a simple mathematical function, f(a, b) = 4a - b². This keeps things simple so you can focus on understanding Katib's workflow without the complexity of a real ML model. Run the provided script and wait 4 to 5 minutes for all trials to complete. The lab includes helpful troubleshooting tips in case you encounter errors, like how to delete existing experiments if you need to start fresh. Once your experiment
finishes, the real learning begins with
visualization. The Katib UI provides powerful insights into your experiment results. Following the step-by-step visualization guide, open the Katib UI, select the kubeflow namespace, locate your simple math experiment in the list, view the trials table showing all parameter combinations, examine the results graph plotting objective values across trials, click individual trials to see detailed execution logs, and identify the best-performing trial. The visualization makes it crystal clear how different parameter combinations performed for this function. The best parameters should be a = 20 and b ≈ 0.1, yielding a result near 80. Congratulations, you've successfully run and visualized your first Katib experiment. You now understand the core
workflow, but we've included two additional optional experiments for those who want to go deeper on their own time. Experiment 2 compares random search versus Bayesian optimization on the same objective function, letting you see firsthand how Bayesian optimization's intelligent exploration converges faster to better results. Experiment 3 tackles a real-world problem: optimizing a logistic regression classifier for SMS spam detection. It demonstrates Katib's integration with scikit-learn and optimizes practical hyperparameters like regularization strength and train-test split ratio. Both experiments include complete working examples and take four to five minutes to run. These are completely optional enrichment activities, not required to complete the
lab. The optional experiment section
provides all the details if you decide
to try them. Experiment 2 runs both
algorithms side by side so you can
directly compare their search strategies
and convergence patterns. Experiment 3
gives you hands-on experience with real ML workflows, showing how to structure your training code for Katib and which hyperparameters matter most for production models. Congratulations on completing this Katib lab. You've gained valuable skills in Kubeflow architecture, Katib experiment design, Kubernetes-based ML workflows, and results visualization. These aren't
just theoretical concepts. These are the
exact same tools and techniques used by
data science teams at major tech
companies. You can now automate hyperparameter search instead of tuning manually, optimize ML models for production deployment with confidence, and run distributed experiments that scale across large Kubernetes clusters.
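If you want to replay the idea behind Experiment 1 locally without a cluster, here is a stdlib-only sketch of random search over f(a, b) = 4a - b². The search ranges are assumptions chosen to match the lab's reported optimum (a = 20, b ≈ 0.1), and each sampled pair stands in for what Katib would run as a separate trial pod.

```python
import random

def objective(a, b):
    # Experiment 1's objective from the lab: f(a, b) = 4a - b^2.
    return 4 * a - b ** 2

# Assumed search ranges for illustration: a in [1, 20], b in [0.1, 1].
random.seed(1)
trials = [(random.uniform(1, 20), random.uniform(0.1, 1)) for _ in range(30)]

best_a, best_b = max(trials, key=lambda t: objective(*t))
best_value = objective(best_a, best_b)
# With a near its upper bound (20) and b near its lower bound (0.1),
# the best value approaches 4*20 - 0.1**2 = 79.99, i.e. "a result near 80".
```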
The two optional experiments are waiting
whenever you're ready to deepen your
expertise. Great work.
🧪 Kubeflow Labs for Free: https://kode.wiki/3LLSUj3

Learn how to deploy and manage machine learning models at scale using Kubeflow and Kubernetes. This complete beginner-friendly tutorial covers the entire ML workflow, from infrastructure setup to automated hyperparameter tuning with Katib.

🎯 What You'll Learn:
• Understanding ML model deployment challenges
• How Kubernetes manages ML infrastructure
• Kubeflow's role in ML lifecycle management
• Hands-on Katib hyperparameter optimization
• Running automated ML experiments at scale

⏰ Topics Covered:
00:00 - Introduction: ML Deployment Challenges
00:43 - Cloud Infrastructure Setup (AWS, GCP, Azure)
01:12 - Kubernetes for ML Infrastructure Management
02:12 - Kubeflow ML Workflow Architecture
03:40 - Development vs Production Phases
04:39 - Katib Hyperparameter Optimization
05:55 - Katib Architecture & Components
06:59 - Search Algorithms Comparison
07:44 - Installing Katib on Kubernetes
08:58 - Running Your First Experiment
09:52 - Visualizing Results & Best Practices

🔧 Technologies Covered:
• Kubeflow & Katib
• Kubernetes
• MLOps & ML Pipelines
• Hyperparameter Tuning

💼 Real-World Applications: Used by companies like Spotify, PayPal, and Lyft for production ML platforms.

🎓 Perfect for:
✓ ML Engineers starting with MLOps
✓ Data Scientists deploying models
✓ DevOps Engineers managing ML infrastructure
✓ Anyone learning Kubernetes-based ML workflows

#Kubeflow #MachineLearning #Kubernetes #MLOps #DataScience