Hey guys, welcome to another Kubernetes hands-on training. I've been looking into the Model Context Protocol by Anthropic for quite some time now, and I decided to make a video tutorial on it to solidify my knowledge and share it with the community. Today I'm going to build something that connects two cutting-edge technologies: the MCP protocol and Kubernetes operators. I hope you have enough time to follow along. Let's dive right in.
[Music]
In this tutorial, we're going to build a Kubernetes operator for managing MCP servers. Before I dive deep into the details, I need to cover some basics. First, we need to talk about the MCP protocol, its architecture, and the problem it solves: why do AI applications use MCP servers to connect to their data sources and tools? Then we're going to look at the MCP server operator, which aims to automate the life cycle of our MCP servers. By the way, I forgot to mention: this readme, all the instructions and text in it, and the resulting code that we will be building together will be put in a repository. I'll put the link to that repository in the description of the video.
So now let's talk about the MCP protocol. What is MCP? MCP, or the Model Context Protocol, is an open-source standard developed by Anthropic that enables AI applications to connect to external services, tools, and data sources that already exist and are running somewhere, in the cloud, on-prem, or on some server, and provide textual context, text generated out of those tools and services, to the AI application or the language model. You can think of it as USB for AI integrations.
So now let's talk about what problem MCP solves. Before MCP was introduced by Anthropic, when people wanted to connect their AI applications, and the language models embedded in those applications, to their existing tools or systems, they needed to build a custom integration between the AI application and the tool. For every pair of AI application and tool, there had to be a custom integration. On the left side, you can think of the GPT models from OpenAI, Gemini from Google, Claude from Anthropic, and so on. On the right side, you have your APIs, your custom tools that you want to connect to your language model; you want to implement some sort of automation for those APIs with the help of language models. If you had m AI applications and n tools on the right side, you needed to build m × n custom integrations. What MCP does is reduce the complexity of this problem to an m + n problem. You still have the applications or different language models on the left side, but this time you build one standardized, universally accessible MCP server for your tool, and then you can connect it to either Gemini or OpenAI.
Now let's talk about the MCP architecture. Similar to web applications, MCP follows a client-server architecture, but we have three main building blocks. We have the host, the AI application so to say; you can think of Claude Desktop, Copilot in VS Code, or Cursor chat. Then we have the MCP client, which is a software component inside your AI application that maintains a connection to an MCP server and receives context from it. And then we have the MCP server itself, which provides context, tools, and capabilities to MCP clients. Those capabilities, or types of context, provided by the MCP server are called resources, tools, and prompts. So what are those? Resources, as the name suggests, are static data or content that give the application or language model more useful, relevant information and help it generate more useful content. Then we have tools, probably the most important primitive of the three, which allow the AI application to perform actions by executing them. Then we have prompts, which, as you can guess, are sets of predefined templates for interacting with the AI. Let's say your application is about solving certain things in a specific scope, in finance, in law, in medicine; you can define templates so that whenever one is needed, the AI application knows what a good to-do list would be to get a certain task done. It helps the AI approach problems more systematically.
All right, let's now talk about why MCP matters for Kubernetes. Before we answer this, let's zoom out a bit and first answer why MCP matters at all.
As we talked about, MCP enables AI applications to call tools, which, as we saw, are basically wrappers around our APIs, and do things on our behalf, which is pretty amazing. But if we now want to apply this to Kubernetes, what does that mean? What problem does Kubernetes have that we want to apply MCP to? Kubernetes is also a ton of APIs. It's a pretty complex orchestration platform; it has a lot of moving parts and it evolves continuously. A lot of those APIs are built as abstractions to hide the complexity of the underlying infrastructure from the developers or users of those APIs. But in the end, this bag of APIs is pretty complex and imposes a huge cognitive load; it makes Kubernetes not easy to handle or digest, so to say. If we build the right primitives in our MCP server for the Kubernetes API, we can make MCP query our Kubernetes cluster state, deploy and manage workloads, and so on.
Think of workloads, pods for example. If you want to create a pod, you need to call an API, the Pod API from the core API group. If you want to create a deployment, or do anything with a deployment, you need to call the Deployment API from the apps API group. All of your operations have to go through the Kubernetes API; all of your operations are API calls. So we can make AI do these API calls for us. If we build the right MCP server, we can query the cluster state, we can troubleshoot issues within our cluster with the help of AI, and basically automate operations based on the events and what happens in our cluster. We're going to start by building a very simple MCP server for Kubernetes, which can operate on or troubleshoot pods. Then we're going to make that MCP server a first-class citizen in the Kubernetes world, which means it's going to turn into a Kubernetes API itself: we're going to build an operator for it and let Kubernetes manage its life cycle. So
now let's look at a sample resource that we're going to define for our Kubernetes MCP server. Here I'm using TypeScript and the MCP TypeScript SDK. The server construct is imported from the MCP TypeScript SDK; it has a method called registerResource that we can use to register our resources. A good example of a resource, when it comes to implementing an MCP server for Kubernetes, is the cluster information. We basically give it a name and designate a URI for our resource in our MCP server; whenever the AI application wants to query the cluster information, it requests this URI from our MCP server. Then we have the metadata block: the MCP server tells the AI application what this resource is about and what the output format is. We use the MIME type application/json to tell the application that the output is in JSON format. And whenever the application requests this resource from our MCP server, this content is returned: we return to the AI application the number of nodes, the number of pods, and the version of our cluster. It is important to note that resources are read-only. When the AI application calls them, they don't have any side effects on the state, in our case the cluster state, and they should also be idempotent, which means you can call them as many times as you want. The outcome in our case might change, because the number of pods or nodes can change, Kubernetes is a dynamic environment, but calling the resource does not itself cause a different result; that's what makes it idempotent.
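To make the shape of such a resource concrete, here is a minimal sketch in plain TypeScript. It only mirrors the structure described above: the k8s:// URI, the field names, and the sample numbers are illustrative assumptions, and a real server would register this via the SDK's registerResource and query the live Kubernetes API instead of returning fixed values.

```typescript
// Minimal sketch of a "cluster-info" MCP resource in plain TypeScript.
// The URI scheme (k8s://) and the returned fields are illustrative; a real
// server would register this via the MCP SDK and query the Kubernetes API.
interface ResourceContent {
  uri: string;
  mimeType: string;
  text: string;
}

const clusterInfoResource = {
  name: "cluster-info",
  uri: "k8s://cluster/info",
  metadata: {
    description: "Basic information about the Kubernetes cluster",
    mimeType: "application/json",
  },
  // Read callback: in a real server this would call the Kubernetes API;
  // here we return fixed sample numbers to show the response shape.
  read(): { contents: ResourceContent[] } {
    const info = { nodeCount: 3, podCount: 12, version: "v1.29.0" };
    return {
      contents: [
        {
          uri: this.uri,
          mimeType: this.metadata.mimeType,
          text: JSON.stringify(info),
        },
      ],
    };
  },
};

const result = clusterInfoResource.read();
console.log(result.contents[0].text);
```

Note that the read callback takes no input and mutates nothing, which is exactly the read-only, idempotent contract described above.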
So now let's look at the MCP tools. This is a sample tool. As you saw with resources, there is no input; we just query state. But for tools, we do have inputs. Again we have the server construct; we call the registerTool method, give it a name and some metadata, and this time we expect some input from the AI application. This input has a schema: the name, image, and namespace all need to be strings. Whenever we want to create a pod, the bare minimum input that we expect is the name of the pod, the image that the pod will be running, and the namespace in which the pod needs to be created. The rest will be assumed by our MCP server: our create-pod tool takes care of putting in the right security context, the right resource requests and limits, and so on.
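As a rough illustration of that input contract, here is a plain-TypeScript stand-in for the schema check (the real server would express this with Zod and hand it to registerTool; the field names follow the description above, the error messages are my own):

```typescript
// Plain-TypeScript stand-in for the create_pod tool's input schema
// (a real server would express this with Zod and pass it to registerTool).
interface CreatePodInput {
  name: string;
  image: string;
  namespace: string;
}

// Validate the raw input from the AI application: name and image are
// required strings; namespace falls back to "default" when omitted.
function parseCreatePodInput(raw: Record<string, unknown>): CreatePodInput {
  if (typeof raw.name !== "string" || raw.name.length === 0) {
    throw new Error("create_pod: 'name' must be a non-empty string");
  }
  if (typeof raw.image !== "string" || raw.image.length === 0) {
    throw new Error("create_pod: 'image' must be a non-empty string");
  }
  const namespace =
    typeof raw.namespace === "string" && raw.namespace.length > 0
      ? raw.namespace
      : "default";
  return { name: raw.name, image: raw.image, namespace };
}

console.log(JSON.stringify(parseCreatePodInput({ name: "web", image: "httpd" })));
```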
Whenever the tool is called, a Kubernetes API client, in TypeScript in our case, is used to do the actual job: to actually call the Kubernetes API and create the pod. As we see here, the Kubernetes client creates a namespaced pod with the right image and the right name, and in the end the tool returns content of type text, not application/json; it just tells the AI application that it created a pod with this name in this namespace. As we see, tools have side effects: they mutate the state of our cluster. And here's an example of a prompt, very similar. Prompts also have a name, metadata, and arguments that are fed into the prompt. For prompts, MCP does not call them inputs; it calls them args, or arguments. But very similar to inputs, for this troubleshoot-pod prompt we have a pod name and a namespace as the arguments that get substituted into a generic prompt. So, in the case of troubleshooting, whenever the AI application decides to troubleshoot a certain pod, it already has a predefined prompt, a to-do list for how to troubleshoot a pod and where to look. This prompt basically says: if you want to troubleshoot the pod with this name in this namespace, check the status and events, the container logs, the resource requests and limits, and so on. It gives a very nice predefined to-do to the AI application, so it can approach the problem systematically.
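A prompt like this boils down to a template with arguments substituted in. A minimal sketch, where the argument names (podName, namespace) and the checklist wording are illustrative assumptions:

```typescript
// Sketch of a "troubleshoot-pod" prompt template. The argument names
// and the checklist text are illustrative, not the repository's exact prompt.
interface PromptArgs {
  podName: string;
  namespace: string;
}

function troubleshootPodPrompt({ podName, namespace }: PromptArgs): string {
  return [
    `Troubleshoot the pod "${podName}" in namespace "${namespace}".`,
    "1. Check the pod status and recent events.",
    "2. Check the container logs.",
    "3. Check resource requests and limits.",
    "4. Suggest a fix based on what you find.",
  ].join("\n");
}

console.log(troubleshootPodPrompt({ podName: "web", namespace: "default" }));
```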
All the MCP messages exchanged between the AI application and our MCP server are sent over the JSON-RPC 2.0 protocol. So MCP is a protocol itself; under the hood it uses JSON-RPC 2.0, a stateless remote procedure call protocol that uses JSON as its data format. The transports supported by the MCP protocol as of now are, first, stdio, which means standard input/output: the AI application and the MCP server both run on the same machine, the MCP server writes to its standard output, and the AI application reads from that same standard I/O and receives the messages. Another option is HTTP with server-sent events (SSE), which was initially introduced by Anthropic together with the MCP protocol: you deploy your MCP server just like a typical HTTP web server somewhere, you provide a URL to the AI applications, the clients of that MCP server, and the AI applications configure themselves to call the remote MCP server. The third option, a more modern replacement for HTTP with server-sent events, is streamable HTTP, which allows streaming the generated text from the MCP server to the AI application, similar to what we see in modern AI chat environments like ChatGPT or Claude.
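Whatever the transport, the messages themselves use JSON-RPC 2.0 framing. A small sketch of what a request looks like on the wire (the method name "tools/call" comes from the MCP specification; the id and the tool arguments here are arbitrary examples):

```typescript
// Sketch of the JSON-RPC 2.0 framing MCP uses. The "tools/call" method
// name is from the MCP spec; the id and arguments are example values.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

function makeRequest(
  id: number,
  method: string,
  params?: Record<string, unknown>
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method, params };
}

// e.g. an AI application asking the server to run the create_pod tool:
const req = makeRequest(1, "tools/call", {
  name: "create_pod",
  arguments: { name: "web", image: "httpd", namespace: "default" },
});
console.log(JSON.stringify(req));
```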
Next, let's talk about the concept of a session in MCP. Whenever the AI application wants to connect to an MCP server, it calls the initialize method of the MCP server, and the MCP server negotiates, or communicates, its capabilities to the AI application; it basically says, for example, "I have these tools, resources, and prompts." The AI application then lists what the MCP server is capable of, and that becomes part of its knowledge. At the time of active usage, whenever the language model or the AI application decides to call the MCP server for tools, resources, or prompts, the call or read operations are performed. And in the end there is a close, or clean-up, operation that closes the session between the application and the MCP server. The MCP protocol is a stateful protocol, which means the client and server maintain a session, and the context exchanged between the AI application and the MCP server belongs to that session.
Here are some real-world MCP implementations that have already been built by the community: there are MCP servers for GitHub, for automated code operations, for Postgres, Slack, Kubernetes, and for file-system operations. In the case of Kubernetes, there are a lot of APIs; you can really think of it as a huge bag of APIs, and it doesn't make much sense to have one MCP server for all of Kubernetes, because then we would provide too much context to the language model, and research shows that too much context makes language models perform worse. So there are Kubernetes MCP servers for the vanilla Kubernetes APIs, and MCP servers for third parties: there's an MCP server for Argo CD, for Flux, and so on. Usually the third-party tools that extend the Kubernetes API build an MCP server to make it easier to manage their own API, so that you can, for example, create Argo Rollouts with the help of AI, or perform Argo CD operations with the help of the MCP server built for Argo CD. So now, enough with the theory; let's get into the hands-on part.
Great, now let's build our first MCP server. Before we do that, we need to make sure that our dev environment is properly set up for development: we need Node, npm, and a simple Kubernetes cluster like kind or Minikube; I'm using kind. You also need to make sure that your Node version is 18 or later. I have all the commands that we need to check or install things. To make my life easier, I have installed an extension on Cursor (you can also install it on VS Code) which helps me run the commands easily by clicking on this run button here. If I search for markdown, this is the extension that I have installed; you can install it on your editor too. Let's go back to the code. I'm going to run this block of bash and see the results. You see I have Node version 20 and I have access to a kind cluster, which was created five minutes ago. Everything looks good, so let's move on to the next part.
In the next part, we want to create the structure of our repository. We're going to create a directory called workspace, put an MCP lab directory in there, and initialize a Node module by running this. If I do that, I have my workspace, the MCP lab directory, and package.json. There's a shortcoming of this extension: if I run cd commands in it, it runs them in a temporary bash session and they do not persist, so I have to manually cd into that directory.
Cool. Now we need to install the dependencies. The dependencies we need are the MCP SDK, the Zod library for schema validation, and the Kubernetes client library in TypeScript. We also need Express, because we're going to build an HTTP-streamable version of our MCP server. And since we are using TypeScript, we need the type packages of those libraries for our application to compile properly. I'm going to click on this run button here as well and make sure I install all the dependencies properly.
Moving on, we're going to create a boilerplate package.json for our Node module, create some common handlers for our MCP server, and then build our MCP servers in two different transport modes: an MCP server that works over stdio, and an MCP server that is HTTP-streamable. For that, the most important part of this package.json is the scripts section; we have separated the compilation and start commands of the different MCP server types. So if we run npm run dev, we compile and run our stdio dev server, and with the corresponding dev script for HTTP, we compile the HTTP-streamable MCP server. I'm going to simply run this command again to make sure that I have the package.json that I need.
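The scripts section could look roughly like this; the script names and output file paths are assumptions pieced together from the transcript (dist as the output directory, one entry point per transport), not the exact repository contents:

```json
{
  "scripts": {
    "build": "tsc",
    "dev": "tsc && node dist/k8s-mcp-server.js",
    "dev:http": "tsc && node dist/k8s-mcp-http-server.js"
  }
}
```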
Then I also need a TypeScript config. I'm not going to focus on the different fields in the TypeScript configuration; that is beyond the scope of this tutorial. What's important for us is the root directory and the output directory: for compiling, the TypeScript compiler looks at whatever is in the source directory, and the outcome of the compilation is stored in the dist directory. So I'm going to run this command and create our tsconfig.
Awesome. Now we have all the files that we need for Node and TypeScript, and we need to create our repository, or project, structure. As I said, we need a source directory that contains basically all our sources, and we can also create directories for examples or for some configurations that we want to apply to our code. Let's simply run this command. Now we see we have the configuration, the examples, and the source directories: basically the skeleton of our project. To make sure that we have all the dependencies installed and everything looks good, we can run this block: we do npm install on this package.json and check the versions of the installed packages to verify the installation was successful. Also, to check the connectivity to the Kubernetes API and the basic permissions, we can run this block of code. I have simply created my kind cluster by doing kind create cluster, which means the credentials, the kubeconfig I get, have basic access to nodes, pods, and so on, and our MCP server is going to use the default kubeconfig of our kind cluster. So it makes sense to check that our credentials have access to the Pods and Nodes APIs. I'm going to run this, and you see that I have access to a cluster: I can get the pods, I can get the nodes, and so on. So we did set up everything in our environment. I'm not going to run this block of echo commands, but basically we installed Node.js, TypeScript, and the MCP SDK, and we made sure that our Kubernetes cluster is accessible and up and running. Perfect.
Now let's build the handlers, the common reusable code for our MCP server. We need the Server class from the MCP SDK and the Kubernetes client in TypeScript; we're going to import everything and use the k8s alias for it. We also need the schemas for the primitives that our MCP server is going to support. We need to be able to list resources, read resources, and list tools: the AI application first needs to know about all the possible tools, resources, and prompts, and then it also needs to be able to read the resources, call tools, or get prompts. That's why we import all the schemas from the types of our MCP SDK. Then we need to initialize the Kubernetes client: we create a new instance, and when we call loadFromDefault, it uses the default kubeconfig that exists on my machine, the config that has the API server address in it, the context set, and the credentials that are going to be used to talk to a certain API server. So we simply load it. Then we instantiate a Kubernetes client for the core group, to create pods and so on, and a client for the apps group, for deployments.
All right, now let's look into the resource handlers. As we said, the MCP server supports three different primitives: the resources, the tools, and the prompts. For our MCP server to support these, we need to register handlers for the different types of primitives and the different methods that can be run on them. The AI application might reach out to the MCP server and ask for the list of resources that the MCP server supports; whenever that's the case, the server needs to have a proper handler for it. So when the AI application asks the MCP server "what are the resources that you provide?", our MCP server replies: these are the resources, I have two resource types. It basically just returns some metadata about the resources that it supports: our MCP server can tell us about the nodes in the cluster, and it can tell us about the pods running in the default namespace. The MIME type is application/json, so the AI application knows that whenever it reads these resources it should expect JSON-formatted text that contains information about the cluster nodes or the pods in the default namespace. If the AI application wants to actually read those resources, not just list them, but actually read information from our cluster, we need a request handler for reading resources. One of the things that are specific to resources are the URIs that we specify to uniquely designate them. We use the k8s:// scheme and clusters/nodes, for example, for the nodes. If the URI requested by the AI application is this one, we ask the Kubernetes client to list the nodes, then use the map function to create a properly formatted nodes JSON object; we stringify this nodes object, put the number of nodes into the JSON string as well, and return it to the AI application. In case of error, we return the error message that occurred during the operation. For the resource that reads pods, the URI starts with /namespaces, then comes the name of the namespace, and then /pods.
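A plain-TypeScript sketch of how such a resource URI might be parsed; the k8s:// scheme and path layout follow the description above, but the exact parsing logic in the real handler may differ:

```typescript
// Parse resource URIs of the form k8s://namespaces/{namespace}/pods,
// falling back to the "default" namespace when none is given.
// The URI scheme and layout follow the transcript; details are illustrative.
function namespaceFromPodUri(uri: string): string | null {
  const match = uri.match(/^k8s:\/\/namespaces\/([^/]+)\/pods$/);
  if (match) return match[1];
  // A bare pods URI is treated as a request for the default namespace.
  if (uri === "k8s://pods") return "default";
  return null; // not a pods resource URI
}

console.log(namespaceFromPodUri("k8s://namespaces/kube-system/pods"));
console.log(namespaceFromPodUri("k8s://pods"));
```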
If there is no namespace provided by the application, the MCP server assumes that the AI application wants to know about the default namespace. Very similar: it just forwards this information to the Kubernetes client, lists the pods of the namespace that was extracted from the URI, again creates a nice map, stringifies the generated map, returns it to the AI application, and in case of error propagates the error back to the AI application. I'm just going to run this for now. Very simple: as you see, the core logic is done by the Kubernetes API client, and MCP is just a wrapper on an existing API. I'm going to run this now, and now we have the handlers file in our handlers directory. Now let's talk about the tool handlers.
Similar to resources, the application wants to list the tools that the MCP server provides. Whenever such a request comes in, the MCP server should have the proper handler for it and should return some metadata about the tools that it supports. Most important is the input schema: the MCP server should let the AI application know, if it wants to call a certain tool, what inputs it should provide, and this is the context that is returned. The MCP server tells the application: if you want to create a pod, you need to provide at least name, image, and namespace, and name and image are absolutely required; if you do not provide the namespace, I'm going to go ahead and create the pod in the default namespace. We have another tool, get pod logs, which simply looks at the logs of a certain pod in a certain namespace; again, the required input is the name of the pod. Then delete pod, list pods, and so on. Very simple, just some metadata about the tools that the MCP server supports.
Now, if the AI application wants to actually call the tools that the MCP server provides, it needs to send its request using the CallToolRequest schema, and the MCP server extracts the incoming input from the request. For example, if create pod is called, the server builds a default pod manifest; you see these resource requests, limits, and so on are hardcoded.
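A sketch of what building that default manifest might look like; the field layout follows the Kubernetes Pod spec, but the hardcoded request/limit values here are my own illustrative defaults, not necessarily the repository's:

```typescript
// Sketch of building a default pod manifest from the tool input.
// The hardcoded resource requests/limits are illustrative defaults;
// the field layout mirrors the Kubernetes Pod spec.
interface CreatePodInput {
  name: string;
  image: string;
  namespace: string;
}

function buildPodManifest({ name, image, namespace }: CreatePodInput) {
  return {
    apiVersion: "v1",
    kind: "Pod",
    metadata: { name, namespace },
    spec: {
      containers: [
        {
          name,
          image,
          // Sensible defaults so the caller only has to supply name/image:
          resources: {
            requests: { cpu: "100m", memory: "128Mi" },
            limits: { cpu: "500m", memory: "256Mi" },
          },
        },
      ],
    },
  };
}

const manifest = buildPodManifest({
  name: "web",
  image: "httpd",
  namespace: "default",
});
console.log(JSON.stringify(manifest.metadata));
```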
The server gives this generated manifest to the Kubernetes client, and the Kubernetes client creates a namespaced pod and returns the result of that operation as text to the AI application: either success or failure. Similar to creating pods, we have tools for getting pod logs, deleting pods, listing pods, and so on. Very simple; you can just go ahead and look at the logic of the handler for the tools. I'm going to simply run this block to add it to our handlers file.
So now let's talk about the prompt handlers. Very similar: the AI application should be able to list the existing prompts supported by the MCP server. The MCP server tells the AI application that it is capable of troubleshooting a pod or optimizing resources and returns the proper metadata; most importantly, it tells the application which arguments are needed for each prompt. If the AI application actually decides to use those prompts, the request comes in with the GetPromptRequest schema, and, for example, for troubleshooting a pod we have this well-defined prompt that can be used to troubleshoot pods: there are certain places to look into to troubleshoot a pod, and this nice prompt, or to-do list, is given to the application upon request. The optimize-resources prompt is very similar: to optimize resources, you need to look into certain places and then make some suggestions. And then we have this helper function for setting up all of these handlers: handlers for listing and reading the resources, listing and calling the tools, and listing and getting the prompts. I'm going to simply run this block of code to add all the handlers to our handlers file. If we scroll down to the bottom of this file, we should have everything we need for our MCP server: all the reusable blocks.
All right, we created the handlers for our different MCP server primitives, the resources, tools, and prompts, and we also have a helper function for setting them up on an MCP server: this function gets a server instance as an input and goes ahead and registers those handlers on that server. Now we want to create our stdio MCP server first and test it locally. We need to import the Server class from the MCP SDK, the stdio transport, and our helper function for setting up the handlers from our handlers file. We have a function here for creating the MCP server: we create a new instance of the Server, give it a name and a version, and for now the capabilities are empty. Then, once we call setup MCP handlers and pass our server instance to it, that function attaches the handlers that we created to our server instance. In our main function, we create this MCP server instance, create an instance of the stdio transport, connect the server to the stdio transport, and then log the successful creation of the MCP server; if there's an error, we write something to the console and exit the process. I'm going to create this server file now in the source directory; it's called k8s-mcp-server. And now we need to compile it. I can simply do npm run build, which calls the TypeScript compiler for me, and the outcome of this command generates the JavaScript files out of the TypeScript that we have. Most importantly, we have this k8s-mcp-server JavaScript file.
Now I want to test the stdio MCP server that I have created. I'm already using Cursor; Cursor is a well-known AI application, or AI host, which can be configured to work with custom MCP servers. If you want to let Cursor use your custom MCP server, you need to let Cursor know about it, what command is needed to run it, and so on. So you need a configuration similar to this for your MCP servers.
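Cursor reads such server definitions from an mcp.json file. A sketch of what the entry might look like; the compiled file path and the kubeconfig location are assumptions, not the exact values from the video:

```json
{
  "mcpServers": {
    "kubernetes": {
      "command": "node",
      "args": ["dist/k8s-mcp-server.js"],
      "env": {
        "KUBECONFIG": "/home/user/.kube/config"
      }
    }
  }
}
```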
If you want to manually navigate to this configuration in Cursor, you need to switch to the agents tab, create a new chat, go to the chat settings in the chat window, and under tools and integrations you can add a custom MCP. I'm just going to simply copy and paste this into mcp.json, and with that I let Cursor know that I have an MCP server called kubernetes, which can be run by doing node on this file, and which takes this kubeconfig file and puts it in the KUBECONFIG environment variable. Now, if I look at my Cursor settings, I see that it immediately identified my MCP server and realized that it is exposing four tools and two prompts.
Amazing. So let's just uh do some demo
for now. I'm going to make this terminal
a bit bigger and
open up K9S to look at my Kubernetes
cluster. I have created a kind cluster.
My cluster is empty. There's nothing in
it. If I go to the default name space,
no pods running. Uh I'm going to uh test
this by asking AI which is now equipped
with my MCP server. Um what are the pods
or list the pods?
And
when it tries to list the pods, it is
going to look at the uh MCP server. It
is going to call the list pods and it is
going to tell me zero pods found in the
default name space which is exactly
matching the set of our cluster. So now
I want to ask it um create a pod with
HTTPD image. And now um it tries to use
the MCP server. It realizes that there's
a create pod tool in the MCP exposed by
the MCP server. It calls it. It creates
the HTTPD pod and then calls the list
pods and uh tells me that the HTTP pod
is created. Awesome. So everything
pretty straightforward. AI is basically
um interacting with Kubernetes cluster
with the help of our MCP server. Now let
us do something interesting and create a
faulty pod. So create a pod with image
ngx2
which does not exist. So then it will
create a pod again with the help of the
create pod tool but then it faces the
like the pod fails with the error image
pool but the mcb server creates it. Now
if we want to ask it to troubleshoot the faulty pod for us, it is going to look at the state of our cluster: it gets and lists the pods and troubleshoots the pod for us. It realizes that it is running a wrong image, goes ahead and calls the delete pod tool, and then calls create pod. Within the tools it has at its disposal, it decided to go with delete pod first and then create pod; there was no tool for setting the image or patching things. So the AI application solved the issue using the existing tools available to it, and now we see that it fixed the issue for us. That is a demo of how an AI application can interact with a Kubernetes cluster with the help of an MCP server for the pod API of the Kubernetes core group.
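The troubleshooting flow above works because the model can only compose the tools the server advertises. As a rough sketch (the tool names here are modeled on the demo and are assumptions, not the exact code), a "set image" operation has to be expressed as a delete followed by a create when only those three tools exist:

```typescript
// Hypothetical tool names based on the demo; the real server may differ.
type Tool = "list_pods" | "create_pod" | "delete_pod" | "set_image";

// Given the tools the MCP server advertises, plan how to replace a pod's
// image. With a set_image tool the plan is one step; without it, the model
// must fall back to delete + recreate, as seen in the demo.
function planImageFix(available: Tool[]): Tool[] {
  if (available.includes("set_image")) return ["set_image"];
  if (available.includes("delete_pod") && available.includes("create_pod")) {
    return ["delete_pod", "create_pod"];
  }
  throw new Error("no tool combination can fix the image");
}

console.log(planImageFix(["list_pods", "create_pod", "delete_pod"]));
```

This is the same fallback the AI performed in the demo: no patch tool was available, so it composed delete and create.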
So now let's move on to the next part. All right, we created the stdio version of our MCP server, and now we want to create the HTTP streamable version of it. We need a web framework; we're going to use Express. We also need a random UUID generator from the Node crypto package. I'll get to the reason we need a random UUID generator in a second, but for now let's continue. We import the Server class (as we saw in the stdio version) from the MCP SDK, the streamable HTTP server transport from the SDK, our handler setup function from our handlers file, and the type for checking whether an incoming request from the application is an initialization request. We create our Express application and assign the port: we either read it from the environment variable or, if it's not set, just go with 3001. We enable JSON for the Express application and create an empty map of session IDs to transports. We're going to store all the transports in a map keyed by session ID, and this ID is exactly that unique identifier we talked about, randomly generated after a session is established. This is how we define our map, and this is how we create our MCP server: again, a server instance using the Server class from the SDK, with the name and the version. We declare the capabilities as empty for now and then call the setup MCP handlers function to attach those primitives to our MCP server instance.
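Concretely, the port fallback and the session map can be sketched like this. This is a minimal stdlib-only sketch: the real server stores StreamableHTTPServerTransport instances from the MCP SDK as the map values, and the interface below is just a stand-in.

```typescript
import { randomUUID } from "node:crypto";

// Stand-in for the SDK transport type; the real code stores
// StreamableHTTPServerTransport instances here.
interface SessionTransport {
  sessionId: string;
}

// Port comes from the environment, falling back to 3001 as in the video.
const port = Number(process.env.PORT ?? 3001);

// One transport per established session, keyed by the randomly
// generated session ID.
const transports = new Map<string, SessionTransport>();

function openSession(): SessionTransport {
  const t: SessionTransport = { sessionId: randomUUID() };
  transports.set(t.sessionId, t);
  return t;
}

const session = openSession();
console.log(port, transports.has(session.sessionId));
```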
Then we create a simple health check endpoint for the MCP server that returns some useful JSON to whoever calls it. This is something we can use, for example, as a health check endpoint in Kubernetes if we want to deploy our MCP server as a pod. Then comes the main, most important endpoint of our HTTP MCP server: the /mcp endpoint. If the client uses the POST method, it is about client-to-server communication, the initiation of the communication: a POST comes from the AI application, from the MCP client, to the server.
So here we need to read the headers of the request. If there is already a session ID coming along in the request header, it is extracted, and the server checks whether the session ID is empty or not, and also checks whether there is already a transport for that session in our map of transports. We look up that transport if it exists; otherwise, we create a brand-new transport for that session, give it an ID using the random UUID generator, and register event handlers. When the session is initialized, the transport that was created here is added to the map of transports that we maintain in our MCP server. We also register the onclose event handler on the transport: when the session is closed, like when the client wants to actually close the session, the transport is removed from our map of transports.
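The lookup-or-create logic for a POST request can be sketched as follows. It is a simplified stand-in (plain objects instead of the SDK's StreamableHTTPServerTransport, and the session ID assigned eagerly rather than via the SDK's onsessioninitialized callback), but the session-map bookkeeping mirrors what was just described:

```typescript
import { randomUUID } from "node:crypto";

interface FakeTransport {
  sessionId?: string;
  close(): void;
}

const transports = new Map<string, FakeTransport>();

// If the request carries a known mcp-session-id header, reuse that
// transport; otherwise create a new one and register it in the map.
// On close, drop it from the map, as the onclose handler does.
function getOrCreateTransport(sessionIdHeader?: string): FakeTransport {
  if (sessionIdHeader && transports.has(sessionIdHeader)) {
    return transports.get(sessionIdHeader)!;
  }
  const t: FakeTransport = {
    close() {
      if (t.sessionId) transports.delete(t.sessionId);
    },
  };
  t.sessionId = randomUUID();
  transports.set(t.sessionId, t);
  return t;
}
```

Calling getOrCreateTransport twice with the same ID returns the same object, and close() removes it from the map, matching the onclose handler described above.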
Here we create the server and connect it to the transport that we just looked up or created. If something goes wrong, we return an error message to the application. Otherwise, we have the transport and the application running, and we can let our transport handle the requests received by our Express web application. That was the AI-to-MCP-server communication path. For the other direction, server-to-client notifications, or server-sent events, we need a GET handler for our /mcp endpoint. How is that going to work?
Imagine the MCP server wants to notify the AI application of something that happened inside the server: in the case of a Kubernetes MCP server, for example, some events in the cluster or some logs showing up, something we are interested in. The AI application, or the MCP client, calls the GET endpoint; it will already have established a session, and the MCP server uses the existing transport to send or propagate messages from the server to the client. So this method is used to look up an existing session for server-sent events; if such a session exists, we look it up and let it handle the request. And when the session is about to be terminated, when the AI application wants to actually close the session, it calls the /mcp endpoint with the DELETE method, as you can guess. We extract the transport and let that transport handle the delete request; the transport then goes ahead and removes itself from the map of transports when the session deletion request is handled in the onclose event handler. Now here we have the main function. We start the application by listening to the port that we specified, print some logs, nothing special. I'm going to go ahead
and create the MCP server. Now we have the MCP server in here. If I want to test it, just a quick check whether I can compile it and run it with node, or import it with node, and see that I get green check marks.
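To recap the routing just described, the /mcp endpoint's behavior by HTTP method can be condensed into a small dispatch table. This is a sketch of the semantics, not the actual handler code:

```typescript
// POST: client-to-server messages (including session initialization).
// GET: server-to-client notifications over an existing session (SSE).
// DELETE: session termination.
function mcpAction(method: string, hasSession: boolean): string {
  switch (method) {
    case "POST":
      return hasSession ? "handle-message" : "initialize-session";
    case "GET":
      return hasSession ? "open-sse-stream" : "reject-no-session";
    case "DELETE":
      return hasSession ? "terminate-session" : "reject-no-session";
    default:
      return "method-not-allowed";
  }
}

console.log(mcpAction("POST", false)); // initialize-session
```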
If we want to test the endpoints that we created for our HTTP MCP server, we can also run this script. It simply runs the start HTTP script that we have for our Node application and uses curl to do health check testing. Very simple. Let's do this again. We see that the health check is working; we're receiving the expected JSON.
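The health endpoint's payload can be as simple as the following; the field names here are plausible assumptions for illustration, not necessarily the exact JSON the video's server returns:

```typescript
// Build the health-check payload; the fields are assumptions
// for illustration, not the exact JSON from the video.
function healthPayload(activeSessions: number) {
  return {
    status: "ok",
    transport: "streamable-http",
    activeSessions,
    timestamp: new Date().toISOString(),
  };
}

console.log(JSON.stringify(healthPayload(0)));
```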
And that was it for the manual testing of the MCP server. If you want to test our MCP server in a different way, Anthropic has actually built a very nice tool called Inspector, which you can use to troubleshoot or inspect your MCP servers. You can simply start Inspector by running this command and passing your MCP server as the argument; that is for stdio. For HTTP, you can run the inspector command and pass the URL of your MCP server, so you can inspect your MCP server simply by passing the URL. I'm going to go ahead and test my stdio and HTTP MCP servers using Inspector with
the script, just to see how it looks. We see that Inspector managed to hit our /mcp endpoint and get some results, and it also managed to hit the health check endpoint; everything seems to be working as expected. We can also configure our AI applications to use this HTTP-based MCP server, but I'm not going to go over that for now. Now let's look at the Inspector in a bit more detail. I have some scripts in my package.json, the inspector and inspector HTTP scripts, for starting the Inspector against the stdio or the HTTP MCP servers. Let me just do npm run inspector http, and the Inspector UI starts. It establishes a session.
I have to choose the transport type and the URL of my MCP server, and if I press connect, I see that the initialize method was called by the AI application, which in this case is the Inspector environment. Now I have an established session to my MCP server, and here I can, for example, list resources: I see I have cluster nodes, namespace pods, and so on. I can list the prompts, and I can list the tools. You can also click on these and see the metadata, get information about the input schema, and so on: basically all the details you need to work with your MCP server. It is a pretty cool tool by Anthropic. So if I now close this, we can also see that all my interactions with the Inspector have been logged to the console. If I now go back to the readme, we can continue with a different testing setup. If you want to test your MCP servers programmatically, you can definitely do that. Here's a sample test file. It
just creates a client. Instead of manually using Cursor, Claude, or Inspector to connect to your MCP server, you can do it programmatically and query the tools and the prompts, hit the endpoints, and see what kind of response you get from your MCP server. Let me just quickly create this test file. I have this test client; now, if I compile it and run it, I see that my test is actually successful in terms of calling the MCP server and figuring out what tools and what prompts it offers. So we talked about the MCP server, its integration with different AI applications, and simple ways of testing it. The next thing is to containerize this binary that we are building. We are going to build the container image for the HTTP MCP server.
We need a Dockerfile for that. In the end, it's a simple Express Node application in TypeScript, and I'm going to use this Dockerfile to containerize the service. So I'm just going to run this script, and now I have the Dockerfile. I can also create a .dockerignore to be on the safe side. These are the commands that I can run to build the image if I want. I can also test it: run the container in daemon mode, in the background, do the port forwarding, and hit the health check and MCP endpoints to see whether I get the expected results. So let's just do that: build the container, start it, wait for some time, then hit the endpoints and stop the session. And now we see that we get a green check mark; these tests were successful.
This part is interesting: you need to provide a valid session ID to call a certain method of your MCP server. The ID I provided was not valid; that's why you see in the response that no valid session ID was provided. For an actual test, if there's already a session established with the MCP server, you can use that session's ID to perform this testing. If you want, we can also push the container image that we built to some registry.
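For reference, a typical multi-stage Dockerfile for a TypeScript Express service looks roughly like this. It's a generic sketch (file names such as dist/http-server.js are assumptions), not necessarily the exact Dockerfile from the repository:

```dockerfile
# Build stage: compile TypeScript to JavaScript
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: production dependencies only
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production PORT=3001
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
EXPOSE 3001
# Entry point file name is an assumption; adjust to your build output.
CMD ["node", "dist/http-server.js"]
```

The two-stage split keeps the TypeScript toolchain out of the runtime image, which matters once the image is loaded into a cluster.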
Now we want to move on to the next part and make our MCP server a first-class Kubernetes citizen, which basically means we're going to create a custom resource definition and an operator that manages the life cycle of our MCP servers. We created the MCP server, and now we want to truly make it a first-class citizen of Kubernetes by extending the Kubernetes API: creating a custom resource definition for our MCP servers and a controller that watches that custom resource and handles the reconciliation events of this declarative API that we create for it.
For designing the API, we first need to focus on the desired state: the container image that we want to run for our MCP server, the transport type, some port and networking configuration, the number of replicas for our MCP server in case we want to scale it up or down, the resource requests and limits that we want to assign to the pods running the MCP servers, and some MCP-server-specific configuration, say environment variables that we want to pass to our MCP server. The status part, the observed state, has a field for the current phase of the MCP server resource (pending, ready, or failed, depending on what happens during the reconciliation), the number of ready replicas, the endpoint that is eventually reachable in the case of HTTP or streamable HTTP MCP servers, and some conditions, which are detailed information about the observed state, written to the status of the resource by the controller. I'm going to use the Kubebuilder project for creating this API for our MCP servers. If you are new to Kubebuilder or Kubernetes operators in general, please check out the Kubebuilder crash course that I've already created on my YouTube channel; I'll put the link in the description. Also, I'm not going to spend too much time on the details of this controller; I'll go through the most important parts related to MCP servers. So
I'm going to now run this block of code and create my MCP operator directory. Now we have an MCP operator directory next to our MCP lab, which contains all the MCP server code. Again, we need to cd into the right directory, and then we have the project. Now we want to create the API. The API group is going to be mcp, and the version is going to be v1, so the fully qualified domain name of our API is going to be mcp.example.com, and the kind is going to be MCPServer. We tell Kubebuilder to create a controller for us that watches our custom resource, and the custom resource definition that we're going to need for our MCP server. So if I run this now, I should see that there is an empty boilerplate MCPServer API created, which does not have the fields that we want; we're going to implement those next. Now
let's look at the API type that we want to implement. As I said, we want to have the image, the transport, the port, the replicas, the config, the resource configuration for the pods, some security context for those pods, the service account that those pods (those MCP servers) will be assigned, and a kubeconfig secret, which is interesting: if our MCP server is supposed to manage the state of a remote cluster, the kubeconfig of that remote cluster needs to be stored in this kubeconfig secret. And then the environment variables that our MCP server needs. The list type is just the list of MCPServer resources, used when we query, say with kubectl get mcpservers, to get the list of our MCP resources. The kubeconfig secret ref determines the information about the secret holding the kubeconfig our MCP server needs: the name of the secret, the namespace the secret lives in, and the key of the kubeconfig
in the secret. In the status, the observed state, we need the phase, which as we said is going to be an enum of pending, ready, and failed, plus terminating in case we delete the MCP custom resource; the conditions, some details around the observed state; the number of ready replicas and the total number of replicas; the endpoint to the MCP server; and the observed generation, which reflects the generation of the most recently observed MCP server. The last transition time is the last time the condition changed, like something happened to our custom resource for the MCP server.
You can read the comments; they explain what each line is about. We have the phase, the different phases that we define as constants, strings that can then be set on the status of the resource, and, most importantly, the condition types. Depending on how the reconciliation loop goes, our controller is going to create a deployment, a service, and a config map for our MCP server. If all of this goes well, the Ready condition is written to the status subresource, which basically means that our MCP server custom resource is ready and our MCP server is ready and usable. Then we have the finalizer; the finalizer suffix is /finalizer.
If you're new to finalizers: a finalizer is a piece of logic that handles the cleanup phase when a custom resource is being deleted. So if things need to be gracefully shut down, if sessions need to be closed, or if resources need to be freed, the finalizer takes care of that. The MCPServer construct will have the spec and status (the desired and observed state), the list type, and some helper methods that the controller uses if it wants to quickly read some fields from the desired state; a helper method for setting conditions, which we're going to use in our controller; and a helper method for setting the resources for the pods depending on the transport type. If the transport type is stdio, it is for testing purposes; it's not that important and it's not going to be deployed into production, so we go with a minimal configuration. For HTTP and streamable HTTP we have higher requirements. And that's pretty much it. So I'm going to run this and update our existing MCP server types.
And now we have the MCPServer type. Now let's look at the controller. As we know, the most important function in the controller is the reconcile function. We have all the necessary Kubebuilder markers on top of our reconcile function: our controller has full access to anything that has to do with our API. It can do get, list, watch, create, update, patch, and delete on mcp.example.com resources; it has these permissions on the status subresource; and on the finalizer it can add or remove finalizers. This controller can also manage some subresources: deployments, services, config maps, and the events of those resources. With these markers, we just tell Kubebuilder what role-based access needs to be assigned to our controller, or to the controller manager running our controller. And this
is the reconcile function. When there is a request for reconciliation, the API client reads the MCPServer resource from the API. If it's not found, we just ignore it and log that it was probably deleted. If it is not being deleted, that is, if the deletion timestamp is null, and our custom resource does not contain a finalizer, we add a finalizer to it and update the custom resource with the finalizer added. If the deletion timestamp is not null, it means that we asked the Kubernetes API to delete this MCP server; if our custom resource already contains a finalizer, we need to handle the deletion, for example gracefully shut down the pods, and then remove the finalizer after the deletion has been handled properly.
And here we have a function that sets the default resources for the MCP server if some fields from the spec are missing; a function that validates the MCP server configuration and requeues the reconciliation request with some sort of backoff; and some helper functions that we're going to add to our controller logic later. These basically reconcile a config map, a service, and a deployment: they create those resources or update them depending on what has been given to us via the desired state, and then update the status with the results of the reconciliations.
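Putting the pieces together, the reconcile flow described above follows the standard Kubebuilder shape. In pseudocode (a summary of the narration, not the literal Go code):

```
Reconcile(req):
  cr = get MCPServer(req.NamespacedName)
  if not found: return ok            # resource deleted, nothing to do
  if cr.deletionTimestamp == null:
      if finalizer missing: add finalizer, update cr
  else if finalizer present:
      handleDeletion(cr)             # graceful shutdown, scale to zero
      remove finalizer, update cr
      return ok
  setDefaults(cr)
  if not validate(cr): requeue with backoff
  reconcileConfigMap(cr)
  reconcileService(cr)
  reconcileDeployment(cr)
  updateStatus(cr)                   # phase, ready replicas, endpoint
  return ok
```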
And here's the handle deletion function of our controller, which is called during finalizer handling at the time of deleting the resource. It sets the phase of the status to terminating, sets a condition explaining that it is terminating because the MCP server is being deleted, updates the status, and then calls another function that actually performs the graceful shutdown. The graceful shutdown applies different configuration depending on the transport type and eventually scales the deployment down to zero to shut down the MCP servers for us. You can see the different grace periods we configure for the different transport types.
Then we have a bunch of helper functions that are used by our controller, either to detect misconfiguration and return from the reconciliation loop with an error, or to update the status depending on how the reconciliation goes. It is pretty self-explanatory; I'm not going to spend too much time on it. I'm going to scroll down to the most important function left in this code block, which sets up our controller with the controller manager. We tell the controller manager that this controller is going to watch MCPServer resources in our mcp v1 API group, and that this controller owns the deployments, the services, and the config maps that it manages. This is how we tell the manager what resources we are watching and what resources we can manage with our controller. I'm going to go ahead and
run this to create the core logic for the controller. If I now open my controller directory, I have the MCPServer controller, which has some errors because some helper functions for reconciling the managed resources (the config map, the service, and the deployment) are missing. I'm going to add them next. For managing resources, we need a reconciliation function for deployments, for services, and for config maps. I'm going to create those by scrolling down to the bottom of the file and adding them. Now we have the reconcile function for deployments, and the services and the config map each get their own reconcile function as well. And now we see that the controller does not have any syntax or compile errors. Now it's a good
time to try building and installing our API on our Kubernetes cluster. What I'm going to do is run this block and generate the manifests. All right, the build was successful. If I now look at config/crd/bases, we see our custom resource definition that was generated after I ran make generate. All the things that we defined for our spec and for our status are in here, with all the necessary validation that was implemented by those Kubebuilder markers. Now let us
move on to the next part. We're going to create some sample custom resources in the config/samples directory. We're going to add the samples for the MCPServer custom resource in here: a basic one and an advanced one. The basic one just has the image, the transport type, the port, the replicas, and some config, and the advanced one is more verbose: it has resource configuration for the pods, a security context, and some environment variables. I'm going to run that.
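A basic sample along the lines just described might look like this. The apiVersion matches the group and version created earlier; the exact field names in the spec are assumptions based on the walkthrough, not necessarily the repository's manifest:

```yaml
apiVersion: mcp.example.com/v1
kind: MCPServer
metadata:
  name: basic-mcp-server
spec:
  image: k8s-mcp-server:latest   # image loaded into the kind cluster later
  transport: http                # e.g. stdio | http | streamable-http
  port: 3001
  replicas: 1
  config:
    env:
      - name: LOG_LEVEL
        value: info
```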
And now we have the basic and advanced custom resources, which we can use to test our operator. To install the Kubernetes custom resource definition for our MCP server, we can do make install; once we do this, the CRD is installed on our Kubernetes cluster. We see it is created: if we do kubectl get crd and grep for mcp, we see mcpservers.mcp.example.com.
Now I want to run the operator and see whether it can reconcile the sample custom resources that we have. The controller is up and running and waiting for reconciliation events. If I open a new tab and create, for example, the basic MCP server, I see some logs in the controller. Let me open a new terminal, run k9s, and search for MCP servers: I see my basic MCP server is pending, and if I look at the reason, it is telling me that the deployment is not ready. If I look at the deployment, I see the basic MCP server deployment is not ready, and if I look at the replica set behind this deployment, still nothing valid. Let me look at the pods: I see the pod is failing with ErrImagePull. So it basically
means our kind cluster has not loaded our MCP server image yet. I have also put that in the documentation: if you're using kind or minikube, you need to make sure that you load the Docker image of your HTTP server; that's the command for it. The image we built for the MCP server, if you look at the basic sample, is k8s-mcp-server:latest. So I load the image into my kind cluster by running this command; k8s-mcp-server is the name of our image, and it is what is used in the basic custom resource. I've already deleted the custom resource. If I create it again,
okay, I first need to be in the MCP operator directory. If I create the MCPServer resource again and look at the custom resource, now I see that my basic MCP server is running; the pod is running. If I look at the MCPServer resources, I have the basic MCP server. If I look at the services, I see I have my basic MCP server service, which points to the pod, and the container shows me the logs of the MCP server that I have running. I can also look at the status of my custom resource: if I describe it and scroll down, I see, for example, the endpoint. This is the fully qualified domain name of my MCP server, and later, if I expose this endpoint via an ingress or a gateway, AI applications will be able to call the MCP server that is remote to them. All right, so that's it.
We covered everything. Once again, we made it to the end of the tutorial. I appreciate your dedication and your patience in following along with me. As always, if you liked the video, please give it a like or share it with your friends, subscribe to my YouTube channel, and leave your thoughts in the comment section. Let me know what you think about the MCP protocol, specifically in the context of Kubernetes. Until next time, take care.
[Music]
Model Context Protocol (MCP) is revolutionizing how AI applications integrate with external systems and data sources. In this comprehensive tutorial, you'll learn how to build production-ready MCP servers and create a Kubernetes operator to manage them at scale. MCP provides a standardized way for LLMs to access context from your systems - think of it as "USB for AI integrations". Instead of building M×N integrations between AI apps and tools, MCP transforms it into an M+N problem with a unified protocol.

#kubernetes #mcp #operator #ai #llm #cloudnative #platformengineering #kubebuilder #programming

➡ GitHub: https://github.com/aghilish/k8s-mcp
➡ Follow me on GitHub: https://github.com/aghilish
➡ Follow me on LinkedIn: https://linkedin.com/in/aghilish/
➡ Kubebuilder Crash Course: https://youtu.be/azJsyLjvHsI

(0:00) Intro
(01:31) What is Model Context Protocol?
(04:04) MCP Architecture & Core Primitives
(04:40) Resources, Tools, and Prompts
(09:03) Building Your First MCP Server
(14:40) Stdio vs HTTP Transport
(19:45) Kubernetes Integration
(57:10) MCP Server Operator Design
(58:40) Creating the Operator with Kubebuilder
(01:01:40) Custom Resource Definition
(01:05:16) Reconcile Loop Implementation
(01:13:18) Deploy and Test the Operator
(01:17:23) Outro

By the end of this tutorial, you'll have:
✅ Built MCP servers that integrate LLMs with Kubernetes
✅ Created a complete operator for managing MCP server deployments
✅ Understood production considerations for AI infrastructure

This tutorial bridges the gap between AI tooling and cloud-native infrastructure!