Hey everyone, thanks for joining us for
the next session of our Python and MCP
from local to production series. My name
is Anna. I'll be your producer for this
session. I'm an event planner for
Reactor joining you from Redmond,
Washington.
Before we start, I do have some quick
housekeeping.
Please take a moment to read our code of
conduct.
We seek to provide a respectful
environment for both our audience and
presenters.
While we absolutely encourage engagement
in the chat, we ask that you please be
mindful of your commentary, remain
professional, and on topic.
Keep an eye on that chat. We'll be
dropping helpful links and checking for
questions for our presenter to answer
live.
Our session is being recorded. It will
be available to view on demand right
here on the Reactor channel.
With that, I'd love to turn it over to
our speaker for today, Pamela. Thanks so
much for joining.
>> Hello. Hello everyone. Welcome to our
Python plus MCP series. This is a
three-part series. This is part two of
the series. If you're just joining today
and you missed yesterday, you can
totally go back and watch the recording
on YouTube of yesterday's talk and catch
up. Everything is recorded and available
for you. All the slides, all the code,
so that you can watch after, rewatch,
share, whatever you want to do. One sec.
So, I see we've got people joining from
all over. Someone said they're joining
from Japan, and I think it's 3:00 a.m.
there. So, kudos to people for joining
from so many different time zones. That
is really fun to just see everybody come
together in order to learn MCP, which is one of the most exciting new technologies of this year, and I think we're going to hear a lot about it in 2026 as well. So yesterday we talked about how to build a basic MCP server in Python using the FastMCP package, and how to run that server locally and use it locally, both from an agent like GitHub Copilot and from programmatic agents. Today we're going to talk about how to deploy Python MCP servers to the cloud. And of course we're going to
be using Azure as our cloud since we're
from Microsoft, but a lot of these
concepts can apply to other clouds as
well. And then tomorrow we'll talk about
authentication. So definitely join
tomorrow if you're interested in user
authentication.
So let's talk about deploying MCP
servers to the cloud. You can grab these
slides from this URL if you want to have
a copy of the slides and follow along
with them and we'll share that URL in
the chat here.
Uh for those of you who don't know me,
my name is Pamela Fox. I'm a Python
cloud advocate. That means my job is to
figure out how to use Python with
Microsoft technologies. And these days
it's a lot of generative AI and related technologies like agents, RAG, and of course now MCP.
So in our topic today about deploying,
we're going to talk about how we can take a FastMCP app, containerize it with Docker, deploy it to Azure Container Apps, add some observability using OpenTelemetry (exporting to observability platforms like Azure Application Insights and Logfire), and then do private networking with Azure virtual networks.
All of the code for today is in this
repo here, the aka.ms Python MCP demos repo. This is our main repo for this series. So if you haven't bookmarked this repo yet, please do. You can fork it, you can star it, you can clone it. You definitely want to keep track of it somehow, and everything that we're showing today is from this repo. Now, for today's session, because we are deploying to a cloud, you can't just follow along locally. You actually do need an Azure subscription in order to be able to deploy. So if you do have an Azure subscription, you can follow the readme instructions for deploying. And if you don't have an Azure subscription, this could be a good time to get an Azure free trial. We'll put a link to the free trial in the chat as well, and you can try deploying on the free trial. Now, sometimes a free trial does have some limitations. So if you have any issues deploying with a free trial, just let me know and hopefully I can figure them out for you.
All right. So, let's talk about cloud
deployment of our MCP servers.
So first let me recap the MCP architecture that we talked about yesterday. We're building an MCP server. Our MCP server exposes a bunch of tools. It might also expose prompts and resources and a few other parts of MCP, but the heart of the MCP server is really exposing a bunch of tools that can then be used by other applications. So that's our MCP server that we're building here. And then on this side we have MCP clients that live inside MCP hosts.
Right? So in this case, this MCP host could be something like Claude Desktop. It could be ChatGPT. It could be GitHub Copilot in VS Code. Those are all applications that act as an MCP host and use an MCP client in order to interact with MCP servers. So this
client can connect to the MCP server
using the MCP protocol in order to say,
"Hey, what tools do you have?" Okay,
great. Thank you. Hey, I want to call
this tool. Okay, great. Here's a
response. Right? So, it's just going
back and forth between the MCP client
and MC server in order to use everything
that that MCP server has. Now, our MCP
host could also be a programmatic agent
like we've got agents written in lang
chain and agent framework. I want to put
dantic, you know, whatever your favorite
agent framework is. All of those tend
to, you know, most of them have the
ability to act as MC clients as well so
that they can communicate with servers.
So that's our architecture.
Now when we're running an MCP server, we
have these two options for the
transport, meaning how are we going to
do that communication between the
clients and the servers. So in yesterday's session, we started off with stdio, standard input/output, and that's where we actually ran the server using a terminal command, something like uv run server.py. That would run the server, and then all the communication would happen over standard in and standard out. That can work pretty well when you're just running servers locally that are using a Python script or a JavaScript package to give you some functionality.
But the other option is HTTP, specifically streamable HTTP. With this option, we actually set up a server at a particular port. So yesterday we set up localhost:8000/mcp as our URL, and all the communication between the MCP client and the MCP server happened over HTTP to that locally running MCP server URL. That's generally going to be the more production-ready choice, because we get all the benefits of HTTP: we know how to set up HTTP servers, we know many ways of protecting HTTP servers, and we know how to expose HTTP servers in different ways to other services. So generally I'm going to be recommending HTTP as the transport to use when you're taking your server into production.
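(For reference, here's a minimal sketch of the two transports, assuming the FastMCP 2.x API; the server name and tool are illustrative, not the repo's exact code.)

```python
# Minimal sketch (assuming the FastMCP 2.x API): the same server can run
# over either transport depending on how you call run().
from fastmcp import FastMCP

mcp = FastMCP("Expenses")

@mcp.tool
def add_expense(amount: float, category: str) -> str:
    """Record an expense (illustrative tool)."""
    return f"Logged {amount} under {category}"

if __name__ == "__main__":
    # stdio (the default): the host launches this script and talks over stdin/stdout.
    # mcp.run()

    # Streamable HTTP: serve the MCP endpoint at http://localhost:8000/mcp
    mcp.run(transport="http", host="127.0.0.1", port=8000)
```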
So when we do that, how the protocol actually works is that the MCP client sends an HTTP request to the /mcp endpoint. In this case, it's doing a POST request and saying, hey, I want to call a tool, and this is using JSON-RPC. JSON-RPC basically just means you send a bunch of JSON that has all the information inside it about what method you're calling on the other server. So it says: okay, this is JSON-RPC, the method we're calling is tools/call, and here are the parameters that we're sending to that method. We POST this JSON to the MCP server, and then the MCP server responds and says: okay, great, I got your method request, I have called the tool, and here is the result. And now the client can process it. So this is what's going to actually happen when we expose this MCP server over HTTP.
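(To make the shapes concrete, here's an illustrative request/response pair as Python dicts. Note this is illustrative only: a real MCP client first performs the initialize handshake and passes session headers, and the tool name and arguments here are hypothetical.)

```python
# Illustrative JSON-RPC 2.0 shapes for an MCP tools/call exchange.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "add_expense",                       # hypothetical tool name
        "arguments": {"amount": 40, "category": "food"},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "Logged 40 under food"}],
        "isError": False,
    },
}
```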
So when we want to run a FastMCP server in production, we're going to make an app based off the FastMCP instance and then run it using a production-level server. Let me actually show the code for that. Going into this repo, we have this file here, deployed_mcp.py. I'll point out a few things. Here we import FastMCP; that's what we're using to make our server.
And then down here is where we create the FastMCP server instance. So we say, okay, we're making a FastMCP server. We've got some middleware, which we'll talk about later for OpenTelemetry, but really it's just an MCP server with a name. And then as you go down, you can see we decorate and say, okay, this is going to be a tool. We hang that off the MCP server with the mcp.tool decorator. We've got another tool here, get expenses data, and I've got a prompt here. So all of that is part of this FastMCP app. And for production, I've also added a custom route. This is my health endpoint. It's a general best practice, when you're putting servers into production, to have a health endpoint, so you have a common way of checking whether the server is alive and working. So we can add this custom route here. It's not part of the MCP protocol, but we can attach it and say: hey, when you go to /health, we're just going to return back that everything looks good. All right. So
then the important thing here is where we say: give me the HTTP app for this FastMCP instance. What this does is return a Starlette application. Now, if you haven't heard of Starlette (and most people actually haven't), it's incredibly popular, but most people don't realize they're using it. So Starlette is a Python framework for creating ASGI applications. [laughter] Now, ASGI apps are async applications, right? And generally our recommendation in this modern age of networking and generative AI is to use async web frameworks.
So Starlette makes it easy for you to make an async app. Now, many people have not heard of Starlette, but in fact many people are using Starlette, because FastAPI is the most popular Python web framework these days, and FastAPI is built on top of Starlette. So if you have used FastAPI, then you have actually used Starlette. FastAPI just adds a bit of additional functionality on top of Starlette, but all of the async routes, all that functionality, comes from Starlette. Similarly, FastMCP is built on top of Starlette. So Starlette is this lovely async framework that's at the core of many of these frameworks that sit on top of it. So when we're using FastMCP, we can specifically say: hey, we need to get that Starlette ASGI app back out. That's what this does here. It says, okay, give me the Starlette application, because I need something I can run.
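(Here's a condensed sketch of what deployed_mcp.py is doing, assuming the FastMCP 2.x API; the names are illustrative, not the repo's exact code.)

```python
# A condensed sketch of the pattern described above (FastMCP 2.x API assumed).
from fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse

mcp = FastMCP("Expenses")

@mcp.tool
def add_expense(amount: float, category: str) -> str:
    """Record an expense (illustrative tool)."""
    return f"Logged {amount} under {category}"

@mcp.custom_route("/health", methods=["GET"])
async def health(request: Request) -> PlainTextResponse:
    # Not part of the MCP protocol -- just a liveness probe for production.
    return PlainTextResponse("OK")

# Returns the underlying Starlette ASGI application, exposed as a
# module-level variable so an ASGI server like uvicorn can find it.
app = mcp.http_app()
```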
So we get that app, and here we're making it a global variable. Once we have that exposed, then we can run that ASGI app, and there we want to use a production-level server. Typically with async apps, we're going to use uvicorn to run the app. So here we say: okay, I'm just going to cd into the servers folder and run uvicorn. And this is the file, and this is the variable name. So it says: look inside this file, this Python module, and find this variable. And then we're going to just run this on localhost, just to show you, and we'll do port 8000 like we did yesterday. All right. So uvicorn is a server that can run ASGI apps, and it's what we're going to use for production. You can also use it locally as well, when you're testing out your servers locally.
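(For reference, uvicorn can be invoked from the command line or programmatically; the module and variable names here are assumed from the demo.)

```python
# Running the ASGI app with uvicorn, either from the command line --
#   uvicorn deployed_mcp:app --host 127.0.0.1 --port 8000
# -- or programmatically:
import uvicorn

if __name__ == "__main__":
    uvicorn.run("deployed_mcp:app", host="127.0.0.1", port=8000)
```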
So here we can see that now it is running on port 8000. We get a "not found", but that's intentional. If we go to /health, we can see, ta-da, it's healthy. It's up. It's working. So that is how we're preparing our app for production. We've got an ASGI app, and we can run that ASGI app using uvicorn, which is a production-ready server. The next step is figuring out where we're going to actually put this application. Where are we going to deploy it?
So we're going to show how we can deploy on Azure, because we're from Microsoft, but there are other options as well besides Azure, obviously, and hopefully a lot of these concepts apply across the clouds. For Azure, we've got this range of options for where to deploy Python applications. These are all possible places, all valid places that you could deploy an MCP server. And there's a spectrum here of how much control and flexibility you get versus how many managed features you get. It's really up to you as a developer to decide: do I need more control, or do I want to take advantage of some of these more managed features? You give up some flexibility, but hey, you get the managed features. So on the most managed side, we have Azure Functions, and they can be a great fit for MCP servers. They're very good at scaling up quickly and responding to variable load. So Azure Functions can be a really good fit; there are lots of example MCP servers written for Azure Functions, and that team has made it very easy to bring your MCP servers to Functions. With Functions, if you're going to deploy there, you need your actual server file, like we just showed, plus a requirements file.
We're using pyproject.toml. Let me show our pyproject.toml. So pyproject.toml is a standard way of defining all the requirements for a Python module. Here you can see all the dependencies that we have. The really important one is fastmcp, but we've got some other dependencies here as well. So as long as we have a pyproject.toml and our server Python file, that can often tell an environment everything it needs to know. For Functions, you also need a host.json, which describes how the function is going to work. Another option is App Service. App Service and Functions are similar; they actually share similar infrastructure. And App Service, if it sees a pyproject.toml, will automatically build your Python package for you just based off of it. So all you need for App Service is actually the Python file and the pyproject.toml.
Then there's Azure Container Apps, and that is a great way of deploying containerized applications using Dockerfiles. There we're just going to add on a Dockerfile, and the Dockerfile is going to describe how to set up that Python package. That's actually what we're going to be doing today, because that's my favorite platform for deploying. And on the most extreme end of flexibility, we have Kubernetes. With Kubernetes, you need a little more infrastructure setup. You're going to also add in a Docker Compose file, and you might need a little more infra besides that as well. So that's the range of options.
Today we are going to focus on container
apps and use that as our platform.
So now let's dig into deploying on
container apps.
The first thing we need is to containerize our FastMCP server.
For that we need a Dockerfile. So here on the slide I have a simplified version of a Dockerfile, and I'll step through this, and then I'll show you what our real one looks like (it's a little more complicated). This is a Dockerfile. It starts off with a base image. Here we tell it to start with Python 3.13; that's the Python version that we're using for our environment. Then we create a directory to hold everything, and then we install the requirements. We're using the pyproject.toml in order to install the requirements. We also have a lock file from uv. So we copy over both that pyproject.toml and the uv.lock file, make sure we have uv on the system, and then run uv sync. uv sync will check the lock file and make sure that everything in that lock file is available on this system.
Then we're going to copy the actual code into the container. We're going to expose port 8000, because that's what we're running the server on. And then we actually start up the app. We're using that uvicorn command again and saying: hey, uvicorn, run this app on port 8000. Then whatever is running on port 8000 will get exposed to anyone who's trying to access port 8000 from outside the container; they'll be able to go to port 8000, and it will map to the process running here. So that is a simplified version of the Dockerfile.
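(Here's a sketch of the simplified Dockerfile just described; the paths and module name are assumptions, not the repo's exact file.)

```dockerfile
# Sketch of the simplified Dockerfile described above (names assumed).
FROM python:3.13

WORKDIR /app

# Copy the dependency manifests first so Docker can cache the install layer.
COPY pyproject.toml uv.lock ./
RUN pip install uv && uv sync --frozen

# Copy the application code into the image.
COPY . .

# The MCP server listens on port 8000.
EXPOSE 8000

CMD ["uv", "run", "uvicorn", "deployed_mcp:app", "--host", "0.0.0.0", "--port", "8000"]
```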
Now, if we look at the real Dockerfile, it is a little more complicated. This might be more what yours looks like for actual production. First, we split it up into two stages. We've got a build stage, and that's what sets up all the Python package requirements, and then we have the final stage, and that's what copies the code over and runs it. And in this case I actually used a shell script, an entrypoint script, for running it, because I did want easy customization of exactly which module it's going to run, because tomorrow we change it to a different module. So you can get a little bit fancier with your Dockerfile. I mostly split it up into two stages both so I could reduce the image size and so that I could take advantage of Docker's caching abilities, to cache the dependencies separately from the code. So there you go. We have a Dockerfile. You can use the simple one or the more complex one, but the point is to have a consistent way of installing those packages, copying the code over, and then running that FastMCP server.
So now we need to create our
infrastructure on Azure for this
container to run. So we're using Azure Container Apps. When you use Azure Container Apps, you first create a Container Apps environment, and that's basically a virtual network that contains all your container apps. You can actually put multiple container apps inside that environment. Then that container app needs to run an image, and that image could come from any registry. It could come from the Docker registry, a GitHub registry, a public registry, a private registry. In this case I'm using Azure Container Registry. So I build the image, upload it to the container registry, and then the container app pulls from that container registry. But you could pull an image from anywhere.
Now, if you wanted to actually try this out with the repo, you have to check out the repo, and then there are instructions in the readme for deploying to Azure. We use the Azure Developer CLI in order to make it really easy to deploy. You just run azd auth login and then azd up, and it will create all this infrastructure and the container for you. And the reason it's able to do that is because we have an infra folder, and that infra folder has a bunch of Bicep files. Bicep is what's known as infrastructure as code. It's similar to Terraform, if you've worked with Terraform. These Bicep files just declare everything that we need. So it says: okay, we need Cosmos, we need Container Apps, we need OpenAI. Everything that we might need is all declared in these Bicep files, and all you have to do is run azd up and it will create all of that for you.
So I've already run that, so I can show you over here. When I ran that, it created a bunch of resources here in my Azure account. You can see the container registry. I've got some App Insights for storing the logs. I've got the Container Apps environment. If I click on this Container Apps environment, we can see it actually does have a couple of container apps. There's one just for an auth thing that we're showing tomorrow, so ignore that right now. Then there's the actual server; that's the one that has that Dockerfile that I just showed. And we also have a container app that has an agent in it, because I'm going to show how you could have an agent in one container that's calling a server in another container. That could be a really common architecture: putting your servers in one container, and then your agents in other containers, and then they're all communicating just inside that environment.
All right. So now all of that is
deployed. So let's actually use it,
right? So if we click on this server,
uh we can see the deployed URL.
And if we go to that URL, we'll see "not found". But if we go to /health, we see it's healthy. All right, so it is running there. So what I'm going to do is go to mcp.json, which currently just has all of our local MCP servers. Now I'm going to say "add server". So I click this little button down here: add server, HTTP. And for the URL, I need to put in the Container Apps URL and then /mcp, because the actual MCP server is at that /mcp endpoint, and we need to tell it exactly the endpoint of that MCP server. So we're doing /mcp, and give it a name, and then it just adds it here. It's actually quite simple. You could also just write this yourself; we just have the URL and the type.
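(For reference, a VS Code mcp.json entry along these lines; the server name and URL are placeholders.)

```json
{
  "servers": {
    "expenses-aca": {
      "type": "http",
      "url": "https://<your-container-app>.azurecontainerapps.io/mcp"
    }
  }
}
```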
So then I can start that server. Okay, it is running, and here when I go to my Copilot chat I can make sure it's enabled. I can see it's enabled here, and here you can see it has two tools. One of them is add expense and the other one is get expenses data. So this one is a tool; yesterday we had it as a resource, and this time I decided to make it a tool.
All right. So then we're going to say something like: okay, yesterday I bought a $40 avocado toast with fried egg on top using my Amex.
All right. So now GitHub Copilot is going to see that we have this MCP server, and hopefully it's going to decide to call a tool from that server. And it's working on a plan here. This is GPT-5 I'm using right now, so we get to see its whole thought process here. It looks like it did decide to use the MCP server. Okay, so you can see it says it's going to run add expense. It gives the name of the server that we just added, so that matches, and it says it's adding a new expense to Cosmos DB. So in production we are using Cosmos DB to store the data. Locally we were using a CSV, but obviously we don't want to use a CSV in production, so instead we use Cosmos DB. You can see here we get all the information, we add it to the Cosmos container, and boom, then it'll be in the container. So let's hit allow.
So it is running that tool now. And oh, that's very nice of it: it decided to then call the other tool in order to verify that it worked. This is the beauty of MCP servers: I gave it some tools, and it decided to both use add expense and get expenses data in order to double-check that it really, truly worked. That's actually the first time I've seen it do that. It just depends on which agent you're using.
All right, so it looked like it worked. So now, to verify it worked... well, it already verified, but what I am going to do is go to the Cosmos DB. Let me make sure this is the right one. Okay, so here's the Cosmos DB account. I can go to the Data Explorer, look at the items container, and then click on one of these. Here we can see the last one was food, and it says amount $40, avocado toast with fried eggs on top. So we can indeed see that it successfully added a new item to this Cosmos container using that production server.
>> [snorts]
>> Yes, yesterday I bought a lot of pizza.
Today I'm getting fancier with this
avocado toast.
I saw there were a couple of questions. One question was about this infra, all the infra that set all this up, like the Cosmos. The question was: did a human write this, or did I generate it? I wrote it. [laughter] Well, I will say one thing: I am using what's called AVM, Azure Verified Modules. The Azure Verified Modules try to use the best practices, the best security practices, by default. So I do recommend using Azure Verified Modules. If you know Terraform, they're similar to Terraform modules. They are officially maintained by Microsoft employees, they have the best practices baked in, and then you just have to override what's particular to your setup. So for this Cosmos one, I did add on which containers I wanted. I said: okay, I want to add this container, and I'm using category as the partition key.
So there are ways of generating Bicep, but my preference is to use the Azure Verified Modules and then just customize them with what I need. And I also recommend just taking other people's Bicep. If you're looking for Bicep, go to my GitHub repo. You'll see all of these things that have a check mark in the AZD column; that means they all have lovingly handwritten Bicep in them. And I recommend just taking Bicep that already works. That's what I do: I learn from my past self and I take from my past self.
The other question was, once again, why are we using HTTP instead of standard input/output? Because it's true: if you look at a lot of the MCP servers that are in the registries, a lot of them are standard input/output, and I think there are a few reasons for that. One is that at the beginning, standard input/output was the only way of writing MCP servers. So that's what everybody used, because that was the only way to do it; that was our first option. Then they added SSE, which was an HTTP way of doing it, but it was kind of hard to do. And then they added streamable HTTP. So streamable HTTP is actually the most recent transport option, and that's why people have maybe been slower to adopt it. The second reason is that if it's a local script, then it doesn't cost people any money to put out an MCP server that's just a local standard input/output one; you're just going to download the package, and that doesn't cost them any money. When you're doing a deployed endpoint, that does cost money, because people are actually sending requests to your server. So there is a cost involved there, and it has to be worth it to you to actually want to do a deployed server.
So there are reasons for it. There are also ways that you can basically convert: if you have a stdio server, you can turn it into an HTTP server really easily. There are packages that do that; I can't remember the name right now, but I could find it during office hours. So it should in theory be quite easy to switch between transports. If there's a server that's only available over a particular transport and you need it in a different way, you can basically put a wrapper in front of it. But it's a great point. You might decide to use stdio instead of HTTP, but I think that for production, HTTP is actually the nicer option.
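(One possible way to do that wrapping, assuming FastMCP's proxy support; the stdio server command here is hypothetical, and this isn't necessarily the package the speaker had in mind.)

```python
# Sketch: wrap a stdio-only MCP server so it's reachable over HTTP,
# assuming FastMCP's as_proxy() support. The backend command is hypothetical.
from fastmcp import FastMCP

proxy = FastMCP.as_proxy(
    {
        "mcpServers": {
            "local": {"command": "uv", "args": ["run", "some_stdio_server.py"]}
        }
    },
    name="stdio-to-http wrapper",
)

if __name__ == "__main__":
    proxy.run(transport="http", port=8000)
```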
And then the question was: how did I set the destination to Cosmos DB? You can look at deployed_mcp.py. There's quite a lot of setup in here that I kind of skipped over. I'm setting all these environment variables on the container. So if you look at my container variables, you'll see that there are quite a lot of settings on it, like the Cosmos DB account here. All these environment variables are set on the container, and I set those in the actual Bicep. The Bicep sets it up so that they're on the container once it deploys. And then I can make a connection to that Cosmos DB container, and inside the actual tools I can create items, and in this one I query the items from the container.
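(A rough sketch of that pattern, assuming the azure-cosmos and azure-identity SDKs; the environment variable, database, and container names are illustrative, not the repo's exact code.)

```python
# Sketch: an MCP tool body writing to Cosmos DB via environment-configured settings.
import os
import uuid

from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

# Endpoint comes from an env var set on the container by the Bicep (name assumed).
client = CosmosClient(
    os.environ["COSMOSDB_ENDPOINT"],
    credential=DefaultAzureCredential(),
)
container = client.get_database_client("expenses").get_container_client("items")

def add_expense(amount: float, category: str, description: str) -> dict:
    # "category" is the partition key, matching the Bicep container definition.
    item = {
        "id": str(uuid.uuid4()),
        "amount": amount,
        "category": category,
        "description": description,
    }
    return container.create_item(body=item)
```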
All right. Okay, so let's keep going. We covered a few questions there. So we saw how we could use the deployed server from VS Code. Now, how can we use that from our actual programmatic AI agents? Yesterday we showed that we've got some examples using agent framework packages. We have the Microsoft Agent Framework example here, which sets up an agent that has access to the MCP server based off a URL, and then we have a really similar LangChain example as well. Now, for both of these, we can customize them really simply to point at our deployed server by just changing the MCP server URL. And I've actually already changed that in my .env file as part of my deployment process. So if I just run this agent today, it should just work. So let me close this one. I'm going to run the Agent Framework HTTP example with uv.
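(For the LangChain side, here's a minimal sketch of pointing a programmatic agent at the deployed URL, assuming the langchain-mcp-adapters client; the URL is a placeholder and this is not the repo's exact example.)

```python
# Sketch: surface a deployed MCP server's tools to a LangChain agent.
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient

async def main():
    client = MultiServerMCPClient(
        {
            "expenses": {
                "transport": "streamable_http",
                "url": "https://<your-container-app>.azurecontainerapps.io/mcp",
            }
        }
    )
    # Each MCP tool is surfaced as a LangChain tool the agent can call.
    tools = await client.get_tools()
    print([t.name for t in tools])

asyncio.run(main())
```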
Uh oh, I'm in the folder. Let me just move up a folder. Okay, so: uv run the agents agent framework HTTP script. All right. So it should run against our deployed MCP server URL, because I've set that in my .env here, and it's loading in that .env. And then we could even... let me see if I can look at the logs right now, and we might be able to see the logs happening as it gets queried. Let's see. Listening. Okay. Here we go. Oh, we missed it in the logs. That's what happens if you look at historical logs, but we're going to show OpenTelemetry very soon.
Oh, here we go. Okay, so here we go. We can see a log there, and we ran the agent. And the interesting thing is that this agent actually messes up the first time it tries to call the tool. It picks the wrong value for some of it and gets an error, but then it corrects itself, because this is an agent, right? An agent is able to call tools in a loop. So it makes the first tool call, realizes that it didn't quite use the right arguments, and then corrects itself and makes the second tool call here. So it says that it logged this 1200 gadget. So now we could check Cosmos DB again. Let me do a refresh over here of Cosmos.
Where is... oh, there it is. Okay. Refresh. Refresh. I'll just go and click on that. Okay. Load more. Wait, I'm going to go here. Just want it to refresh. Okay. [laughter] All right, now I see it. Okay. So here is the new item: this is the laptop that it just bought. So there we've run it. This is running the agent locally,
interacting with that MCP server URL.
Now, we could also run the agent from the container. In our agents folder we also have a Dockerfile, and this Dockerfile, you can see, very simply just calls the Agent Framework file, so it should just immediately run it. That's actually what it already did when it deployed earlier: we can see that earlier, when it deployed, it also ran that purchase. So once you have a deployed server, you can use it from various places. You might be using it from internal agents, maybe ones that are running based off of webhooks or cron jobs or something like that, and they could be running inside other containers, like in this situation here. You could run it from a desktop application like VS Code or Claude Desktop. You could run it from a Python script. There are many different options for how you could interact with that deployed MCP server.
So now let's talk about observability, and how we can actually see what our MCP servers are doing. We are using OpenTelemetry for our observability. OpenTelemetry is a standard that says how applications should emit observability data. It says how they emit traces, metrics, and logs; those are the three main types of data that get exported with OpenTelemetry. A trace is composed of spans, and it basically shows the timeline of what's happening when a request is being processed by a service. So it says: okay, we got this tool call request, and then we queried Cosmos, and then we got this response back from Cosmos. All of that would be a trace of spans that show what's happening and all the dependencies, and that's what I often find the most useful. Then there are metrics, and those are numeric measurements that can be helpful to understand how the system is doing: things like, what's your CPU usage, what's your latency, that sort of thing.
And then finally, there are logs. When we're using Python and we're doing logger.info or logger.warning, those are all logs, and we can export those in a way that's compatible with OpenTelemetry as well. So those are the three main kinds of observability data that we want out of our applications.
So for an MCP server, what do we want to see traces from? Remember, our MCP server is in fact a Starlette ASGI application that's processing route requests, like to /mcp and /health. So we want to see every request that goes to that application. Then that app is using FastMCP to process tools, resources, and prompts. So we want to know: hey, did you get a tool call? Did you get a resource call? Did you get a prompt call? We want to know specifically what part of our server is being interacted with. And then our tools happen to call the Azure SDK, because we're using Cosmos DB here. So we'd also like to see, for Cosmos DB, what's going on with those calls. So that's the flow of what we want to see in our traces.
Now, for each of these kinds of traces, we need to have instrumentation that knows how to observe what's going on with that layer in the stack and then export OpenTelemetry traces. At the top level, for Starlette, there is a package called opentelemetry-instrumentation-starlette. We install that; it's in our pyproject.toml. And then inside the application we call that instrumentor and say: hey, wrap this app in this instrumentation. When we do that, it's going to output traces for every single route request, and these all follow a standard across OpenTelemetry that says when you get a route request, it's going to be tagged with the http.route attribute.
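(The wiring looks roughly like this; "app" is the ASGI app returned by mcp.http_app() in the earlier sketch.)

```python
# Wrap the Starlette ASGI app with the OpenTelemetry Starlette instrumentation.
from opentelemetry.instrumentation.starlette import StarletteInstrumentor

StarletteInstrumentor().instrument_app(app)
```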
Next, we want to get traces from FastMCP. For this, what we did is we actually wrote our own middleware, and we just attached it to our FastMCP instance and said: hey, every time you get a request, you're going to call this middleware. This middleware is actually in the repo, so you can check it out and see how we wrote it. Basically, every time it gets a request to do something, it outputs a span. And here you can see the kinds of things that are in the spans. It says: oh hey, here's the tool call I got, here are the arguments that were sent. And these are also all following standard conventions [clears throat] for how MCP servers should trace their tool calls.
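(A rough sketch of what such middleware might look like, assuming the FastMCP 2.x middleware interface; the repo's real middleware is more complete, and the span/attribute names here are illustrative.)

```python
# Sketch: tracing middleware that emits one span per MCP tool call.
from fastmcp.server.middleware import Middleware, MiddlewareContext
from opentelemetry import trace

tracer = trace.get_tracer("mcp-server")

class TracingMiddleware(Middleware):
    async def on_call_tool(self, context: MiddlewareContext, call_next):
        # Emit a span tagged with the tool name, then continue the chain.
        with tracer.start_as_current_span("mcp.tool_call") as span:
            span.set_attribute("mcp.tool.name", context.message.name)
            return await call_next(context)

# Attached at construction time, e.g.:
#   mcp = FastMCP("Expenses", middleware=[TracingMiddleware()])
```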
Finally, we want traces for the calls to Cosmos DB. Now, when we use the Azure SDK, by default it likes to export to OpenTelemetry. We do set this setting just to make sure that it's using OpenTelemetry, but as long as we do that, then we're going to see traces from all those Cosmos requests as well.
So now we're exporting these traces, but where are they going to go? We need some place to actually view all these traces once we have them exported. For Microsoft, we have Azure Application Insights, and that's a managed, hosted application; it's in the portal, and we're going to be using that. There are lots of other options for observability platforms. Another one that I really like is Logfire. It's very popular with the Python community, and I'm also going to show you that, because it's quite fun to set up as well. So you've got a ton of options, lots of things that are OpenTelemetry compliant. Once you've got everything instrumented, you can use the OpenTelemetry platform of your choice in order to see those traces.
So in order to export to Azure App Insights, we're going to install another package, azure-monitor-opentelemetry, and then use this function, configure_azure_monitor. That just looks at this Application Insights environment variable and sets up everything based off of it. This is the easiest way to export to App Insights. You can do a whole lot more manual work if you need to, but if this works, you should use it, because it's one line and it'll set everything up for you.
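(That one-liner, for reference; configure_azure_monitor reads the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable when no connection string is passed.)

```python
# Export all OpenTelemetry data to Azure Application Insights.
from azure.monitor.opentelemetry import configure_azure_monitor

configure_azure_monitor()
```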
So now, what does it actually look like? Let me show you the logs. Here in App Insights I've already opened up a trace so you can see, and this shows you: okay, we got a request, we got a POST to /mcp, and that POST to /mcp tried to call the add expense tool here. We can see all the properties, and we can see the first time it did actually fail, because our agent called it in the wrong way, so it got back this validation error. And that's interesting; that means maybe I need to give my agent more instructions to be better. But then it does recover, and then it calls add expense again, and this time it passes in the right arguments, and so it's able to successfully go through. It calls the Cosmos container, makes a bunch of requests to my Cosmos account, and puts the item in there. So that is a full trace that shows all these spans that come from the different telemetry exporters that we have.
Now, we also have a dashboard. I set up this dashboard based off of the logs we have. So then we can just kind of monitor this dashboard and say: oh okay, here are our failures, here's the performance. I made a bunch of custom dashboard charts here just using KQL, Kusto Query Language, which is a query language for your logs. So you can set all this up with App Insights, and this is all actually in the Bicep, so if you deploy from the repo, all of this will be set up for you automatically. So that is App Insights.
Now, I did also want to show Logfire, because I think it's a really cool platform. With Logfire, we're going to add the package for Logfire, and then it also has a configure function to call, and we're going to set this Logfire token in our environment variables, so it's able to use that in order to send the requests.
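(Again a one-liner; logfire.configure() picks up the LOGFIRE_TOKEN environment variable.)

```python
# Export the same OpenTelemetry data to Logfire instead.
import logfire

logfire.configure()
```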
And then when I do that, I can use the very pretty Logfire dashboard here. So here I am going to focus on earlier today, when I called my Logfire agent. And we can see here... let me make that a little bit bigger. Okay.
So here we get the request from Starlette, and then we see this add expense tool call, and that's the one that has the validation error, and then it was successful, and then it called the Cosmos DB stuff. So it's very similar to App Insights. All these observability platforms have their own way of approaching things, but what I wanted to show is: as long as we use OpenTelemetry, it becomes much easier to view our observability data in any platform that is compliant with OpenTelemetry. So that is what I would recommend: use OpenTelemetry.
All right. Okay. So that was
observability. We have 13 minutes left
and I do want to talk about private
networking. Uh, so I'm going to go into
private networking now. Uh, if there are
questions that I've missed, we do have a
full hour for office hours after this
live stream. So, we can talk about more
things in that office hours as well for
anything um, you know, any questions
that aren't aren't getting answered in
the chat. So, now let's talk about
private networking because here's the
thing. Everything I've been showing you
so far was with a publicly available
endpoint. So, in fact, any of you could
have taken my URL and logged expenses
with it, right? You know, with your $300
of pizza or whatever, right? Like, as
long as you had that URL, you actually
could have logged an expense. And that's
that's kind of sketchy, right? And sometimes people rely on security through obscurity: basically, oh, you don't know my URL, so it's fine, it's not going to happen. But it's very risky to have a public endpoint if you don't mean for it to be a public endpoint. This is a stat we had where 84% of the attack paths and security issues are because of internet exposure. So if you have a publicly available endpoint, you really should think carefully. You really need to want to have that public endpoint, and otherwise you need to protect it. You can either make it
private, which is what we're going to
show today, or you can add an
authentication, which is what we're
going to show tomorrow. So today, I'm
going to show how can we make that
endpoint be a private endpoint and not
have the risk of this public URL that
anybody could use.
So, does a private endpoint make sense for you? A private MCP endpoint? There are two situations where you might consider having your MCP server at a private endpoint. The first one is if you're making an MCP server for internal employees at your company, and those employees are already working on a virtual network, and they have a VPN or some sort of way of working inside that network. Maybe they're actually on-premises, and that's how they're inside the network, but there's some way of getting into that network that is particular to your employees. Then it could make a lot of sense to put that MCP server on, or accessible via, that same virtual network, so that all the internal employees can use it and nobody outside the company can access it.
That's the first situation. The second situation is maybe your MCP server isn't even designed for use by humans. Maybe your MCP server is only designed for use by other agents, and those agents are all internal to your systems, like agents that are just running in cron jobs doing background analytics. In that case, it doesn't make sense to expose your MCP server to the whole world. Instead, you set up a virtual network that has your MCP server and your agent, and they just communicate over internal networking with each other, and nobody needs to be able to actually go inside that network.
So the way that we're going to set up that situation on Azure is that we've already got this Container Apps environment, and that actually is a virtual network by default, but I'm setting up a new virtual network so that I can add some more things to it. What I have is this whole thing as a virtual network, and inside that virtual network I've got two subnets. The first subnet is the one that contains the Container Apps environment with its apps, and I've got what's called an NSG, a network security group. You can attach it to a subnet in order to set up rules and say: only allow traffic over HTTP, only allow traffic at these ports. So it's just a way of adding more security to your subnet in terms of what can go in and out of it. So we've got two subnets with NSGs that are all inside the same virtual network. And then everything communicates using what's called a private link. This MCP server uses a private link in order to communicate with the private endpoint of Cosmos DB. There's also something called a private DNS zone that resolves that private endpoint in the virtual network. So there's a lot of infrastructure that goes into this. Once again, I've already written that infrastructure for you. It's all in the Bicep, so you can just deploy it and it'll all get set up. Everything is communicating over these private links to private endpoints, using the private DNS zones. So that is the architecture that we're setting up.
If you do want to try it out in the repo, you can follow the instructions in the readme for deploying with private networking. You basically just need to set some azd environment variables and run azd up, and it'll set everything up. So I can show you what it looks like once we have everything set up in the portal. There's quite a lot of infrastructure involved, a lot of resources involved, with private networking, but it does make everything more secure once you have it all set up. So here, this is my private resource group. In this one you can see we've got private endpoints. We've got... okay, a bunch of things; let me sort by type. So we've got network interfaces, network security groups, we've got all these private DNS zones, and then we've got all these private endpoints. And then we have the virtual network itself.
And sometimes the topology graph is cute to look at, and sometimes it's kind of crazy, but here is the topology of the virtual network. I think my slide made it a little clearer, but this is just to show there is a lot involved with a virtual network. It can be a little bit of a pain to set up at first, but hopefully I've made that easy for you by writing the Bicep.
So now we have that deployed. I'll go back and actually show you how I know that it's all private. I'm going to open the Container Apps environment and look at the networking. Here you can see that public network access is disabled, so it is going to block all the incoming traffic from the public internet. No one can get into this environment at all. And we can see it is associated with a virtual network, with this specific subnet, and that it itself has a private endpoint. So all of this indicates that it is completely private. So that's all private now.
And then we can look at our apps, and we see that we've got the agent app and we have the server app. If we go to the agent application, we can check the logs and see what it's doing. We can see it's working here. We can look at the historical logs, and somewhere in the historical logs we'll see a call to the server, where it actually made that request.
So that's the general idea. We get it all set up, and then the agent calls the server. Now, in this one, the agent only called the server once, when the server first started up. In reality, if you had this sort of setup, you would either have a cron job, which runs the agent on a schedule, doing something every hour or every day, or you would have some sort of hook-based system where every time there's a new GitHub release, or every time something is added to the blob, you're going to do this. So generally, it's either going to be a scheduled thing or a trigger-based thing.
Okay, so that's it. The demo for it isn't very exciting, because basically the point is: yay, we can't access it, yay, it's secure. But the exciting thing is the fact that it is actually all set up and can't be accessed.
Now, we have another example that sets up a kind of similar infrastructure. This is our AI travel agents sample repo, and this one actually sets up MCP servers in different languages. It sets up a JavaScript one, a Python one, a Java one, and that's kind of an interesting idea: having multiple languages for services. This is something you can do easily once you're using something like Container Apps; you just say, oh, we have a Python container, we have a Java container, we have a JavaScript container. So that might be an interesting sample to check out as well.
All right. So we focused a lot on Container Apps today. There are other deployment options, which I mentioned at the beginning, so just to talk briefly about them: Azure Functions does make it very easy to bring MCP servers, and they have two different ways of doing it. You can either use the Azure Functions MCP bindings, where you're actually writing your function code very specifically for Azure Functions using the Azure Functions package. That's good because it's very built into Azure Functions, so it's going to work really well. However, they also added support for being able to bring a FastMCP server and just use that with Azure Functions, and that might be more exciting to you if you want to be able to deploy to multiple places, like maybe you want it on Azure Functions and also Container Apps and also FastMCP Cloud. So Azure Functions does support both of those options.
I found this example from Anthony Chu on the Functions team, which is pretty fun. It deploys an Azure Functions MCP server for weather, but the thing that's really fun is that it also uses the OpenAI Apps SDK, so it displays a widget inside ChatGPT when you use it. So this is getting into MCP apps: the idea that MCP servers can produce not just data but also UI widgets, actually full interactive applications. So if you're interested in that, do check out Anthony's repo. It's pretty fun.
Now, I also mentioned Azure Kubernetes Service. With this, it's similar to Container Apps, except we also need to set up multiple services ourselves. We can use a docker-compose.yaml, so we'd say: okay, we have our MCP service here, we have our agent service here, here's the Dockerfile for each of them, this is how we're going to connect them together, the port they communicate over. And then put all of that on Kubernetes, and then you manage all the scaling for Kubernetes. We do have an example that uses Kubernetes. This is the Zava shop; it's a pretend retail shop, and it uses MCP servers for a lot of the functionality, and then has agents that do analysis based off those MCP servers. There's one that looks at finances and tries to get insights based off the finances and gives you a report back. So if you're interested in Kubernetes, check out that repo. It's got all the infrastructure for it.
Okay, and that brings me to the final minute. We have just made it, and I know there are lots of questions that we didn't get to, but that is why we have office hours right after. So basically, once we close down this stream, I'm going to hop into Discord and start up the stage in Discord, so if you have a question that didn't get answered here, just copy and paste it into Discord, and we'll try to answer it there. As a reminder, all of these sessions are being recorded, and all the slides and code are available. You can go to this aka.ms resources link here in order to get all the resources, and we'll keep that updated. And then after the series, if you want to keep learning, my colleagues have this MCP for Beginners repo that has a bunch of tutorials about how to write MCP servers. [snorts] So there we go.
We're going to close up the stream for today and go over to Discord. Thank you so much for joining today and for all the great questions and comments in the chat. Even though I didn't get to all the questions, hopefully we can get to them in the Discord. I hope you join us tomorrow for talking about authentication, which is super interesting with MCP: how we can make MCP servers where we're actually doing things on behalf of the user. Really some cool stuff there. So hope to see you in the Discord or tomorrow. Bye everyone.
Thank you all for joining and thanks
again to our speakers.
This session is part of a series. To
register for future shows and watch past
episodes on demand, you can follow the
link on the screen or in the chat.
We're always looking to improve our
sessions and your experience. If you
have any feedback for us, we would love
to hear what you have to say. You can
find that link on the screen or in the
chat. And we'll see you at the next one.
>> All right. [music]
In our second session of the Python + MCP series, we're deploying MCP servers to the cloud! We'll walk through the process of containerizing a FastMCP server with Docker and deploying to Azure Container Apps, and also demonstrate a FastMCP server running directly on Azure Functions. Then we'll explore private networking options for MCP servers, using virtual networks that restrict external access to internal MCP tools and agents.

Explore the repo: https://aka.ms/PythonMCP/repo

This session is part of a series; learn more here: https://aka.ms/Python/MCP/25

#MicrosoftReactor #learnconnectbuild