We've been covering a lot of OpenTelemetry content lately: how OpenTelemetry works, a look at log collection and the core concepts, and then a focus on tracing. In this video, we're going to take a look at metrics.

OpenTelemetry is a very powerful framework. It allows us to not only instrument our applications, but also generate, collect, and export telemetry data, and that telemetry data covers logs, metrics, and traces. If you've stumbled upon this video and you're totally new to OpenTelemetry, I highly recommend you check out my "how OpenTelemetry works" video, where we discuss the framework and what OpenTelemetry actually does. We'll touch on some of those key points here as well, but in this video I want to focus primarily on metrics.

A lot of people are confused and ask: why not just use Prometheus? Or: are we trying to replace Prometheus, and is that even the case? So in this video we're going to clarify things and take a look at what role OpenTelemetry plays in the metrics space. Prometheus is a powerful metric store. Although it facilitates the scraping and collection of metrics, its real strength is storing and querying those metrics. OpenTelemetry, on the other hand, is not really concerned with storing metrics, and it's not concerned with storing logs or traces either. OpenTelemetry is the framework that allows us to connect all the dots: go fetch the metrics, traces, or logs, process them, reduce the noise, enrich the telemetry, and then export it to a storage backend like Prometheus. If you're interested, I've also covered a lot of Prometheus content on this channel, including a video on how Prometheus works. So in this video, I'll also show you how you can deal with Prometheus metrics using OpenTelemetry. We've got a bunch to cover, so sit back, and without further ado, let's go.
So, the first thing we'll want to do is get an OpenTelemetry collector up and running. The easiest way to do this is with Docker, as you can run an OpenTelemetry collector in a Docker container, and something like Docker Compose is the easiest way to run containers. Here I have a compose file in which I run an otel-collector: I give my container a name, specify the Docker image I'm going to use for the OpenTelemetry collector, and define two volume mounts. The second volume mount is not that important; it's just data I'm collecting. I have a hidden folder called data which I like to use for testing, so I can check whether my collector is successfully receiving metrics, logs, and traces by dumping them to file and inspecting them, which helps with debugging and troubleshooting. The first volume is the most important one: that is the OpenTelemetry configuration.
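As a rough sketch, that part of the compose file looks something like this. Treat the image tag, ports, and paths as placeholders rather than the exact values from my repo:

```yaml
# docker-compose.yaml (sketch; image tag, ports and paths are illustrative)
services:
  otel-collector:
    container_name: otel-collector
    image: otel/opentelemetry-collector-contrib:latest   # pin a real version in practice
    command: ["--config=/etc/otel/config.yaml"]
    volumes:
      - ./otel-collector/config.yaml:/etc/otel/config.yaml  # the OpenTelemetry configuration
      - ./data:/data                                        # scratch folder for dumping telemetry to file
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
```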
When working with OpenTelemetry as a DevOps, SRE, or platform engineer, you'll want to learn the OpenTelemetry configuration, and that's something we've been learning throughout this series. Learning the structure of the OpenTelemetry configuration is important not only for running OpenTelemetry locally in Docker, but for running it in pods in Kubernetes as well. In Kubernetes, we use something called the OpenTelemetry Operator, which allows us to create collectors in Kubernetes clusters, and when we do so, we also need to write OpenTelemetry configuration files. For more about the operator, check out my video on the OpenTelemetry Operator for Kubernetes.
Now, in our "how OpenTelemetry works" video, we talked about the OpenTelemetry basics. The first step is setting up collection: we can collect traces, metrics, and logs, and for this we need to install a collector; in our case, we'll use Docker Compose. The second step is that we can do things with the metrics: we can process, parse, and enrich them, and we can also drop and filter out any noise. And the third step is to export the metrics; this is where we can send them off to a Prometheus instance. All three of these steps apply to logs, metrics, and traces, so it's not limited to Prometheus metrics.

The biggest hurdle for engineers is learning the OpenTelemetry configuration, but it's actually quite simple once you understand the terminology, because the documentation is quite rich and explains all of the different terms. To do step one, which is collecting metrics, logs, and traces, we need what's called a receiver. Receivers allow us to fetch things, so in our config file we'll learn how to set up a receiver. Receivers can use a pull or a push model: we can either set up a receiver to go out and scrape metrics, or we can set up a receiver that applications push metrics to. Once we've received our metrics, we have this thing called processors. Processors allow us to perform the second step, which is to do something with the metrics: we can drop them if they're noise, or we can enrich them. Think of when you're running in Kubernetes and you want to enrich the metrics with pod names, container names, node names, service names, IP addresses, that sort of thing. Processors also allow you to batch up metrics and filter out noise. Once you have step one done, which is the receiver, and step two, which is setting up the processor, you can finally do step three, which is to write an exporter: that is what to do with the metric. Do we want to write it to a file? Do we want to send it to a Prometheus instance? We'll go ahead and write an exporter. And then the last step is putting it all together, and that is the service pipeline. A service pipeline is basically the glue of the configuration: we create a metrics pipeline and say which receiver to enable, which processors we want to use, and which exporters we want to use. So the service pipeline glues everything together, and I'll show you a configuration file in action.
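At a high level, the collector configuration is just those four sections. Here is a bare minimal sketch of the shape (the real receiver, processor, and exporter names come later in this video; the debug exporter here is only a stand-in):

```yaml
# OpenTelemetry collector config: the overall shape (minimal sketch)
receivers:            # step 1: how telemetry gets in (pull or push)
  otlp:
    protocols:
      grpc:
      http:
processors:           # step 2: what we do with it (batch, filter, enrich)
  batch:
exporters:            # step 3: where it goes (file, Prometheus, Tempo, ...)
  debug:
service:
  pipelines:          # the glue: wire receivers -> processors -> exporters
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```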
Now, it is important to know that OpenTelemetry doesn't replace Prometheus. Prometheus is a very powerful metric store, but OpenTelemetry can help us fetch these metrics, process them, and forward them on to Prometheus. We know Prometheus uses a pull model quite heavily; OpenTelemetry supports the pull as well as the push model. So I can set up my OpenTelemetry collector to scrape Prometheus instances or Prometheus-enabled applications, or the applications can send metrics to me.

To demonstrate this, I have a videos API in my Docker Compose file. I have a bunch of microservices that make up a video catalog: a playlist API, a playlist database, a videos API, and a videos database as well. In the environment variable section, you can see I have a bunch of environment variables for OpenTelemetry. I use this microservice in my tracing demo as well, but the important part here is the OTEL exporter endpoint variable. This is the OpenTelemetry endpoint of our collector, and it tells my .NET application where to send metrics and traces. If we scroll down, I have the .NET auto-instrumentation metrics setting enabled (set to true), and I also have traces enabled (set to true). So in my example client, I'm using the OpenTelemetry zero-code auto-instrumentation; this is supported for Go, Node.js, Python, .NET, and more. In this example, my microservice will push its metrics and traces to the OpenTelemetry collector.
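Roughly, that environment section looks something like this. Only OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, and the exporter selectors shown here are standard SDK variables; the language-specific auto-instrumentation toggles I use in the video differ, so treat this as an illustration rather than the repo's exact values:

```yaml
# environment section of the videos-api service (sketch; values are illustrative)
environment:
  - OTEL_SERVICE_NAME=videos-api
  - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317   # OTLP gRPC endpoint of the collector
  - OTEL_METRICS_EXPORTER=otlp                               # push metrics over OTLP
  - OTEL_TRACES_EXPORTER=otlp                                # push traces over OTLP
```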
The first configuration option we're going to take a look at is called receivers. In our OpenTelemetry config, we want to enable a receiver and have an endpoint ready to go where traces and metrics can come in. And because Prometheus supports both pull and push, we can also have a receiver that goes and fetches metrics. In my volume mount for the OpenTelemetry collector, I have this config.yaml, and this is the OpenTelemetry config YAML we'll be taking a look at. Here you can see I have a section for receivers. This is very simple: we're using the standard OTLP receiver, the OpenTelemetry protocol for receiving metrics, logs, and traces, with a gRPC endpoint enabled as well as an HTTP endpoint. What this config does is simply define a receiver; we don't actually enable or use it yet. Remember, the service pipeline is where everything gets enabled. So firstly, we just go ahead and define our receiver.
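A minimal OTLP receiver definition looks roughly like this (the listen addresses shown are the common collector defaults and may differ from my repo's config):

```yaml
# receivers section of config.yaml (sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # OTLP over gRPC
      http:
        endpoint: 0.0.0.0:4318   # OTLP over HTTP
```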
Now that we have a receiver, once we enable it, metrics can come in over the gRPC endpoint. But what do we do with those metrics? This is where processors come in. So next up, we'll define processors. A processor basically defines what we want to do with the data. I just have a default one in here called memory_limiter, which ensures that our collector limits itself to around 90% memory usage, so it prevents my collector from running out of memory. And then I have a batch processor, which allows me to batch up telemetry before sending it out externally. You could also set up filters to drop things like health probes, as you don't want to generate metrics for health probe endpoints.
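Those processors look roughly like this; the limits are illustrative rather than copied from my config:

```yaml
# processors section of config.yaml (sketch; limits are illustrative)
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 90   # keep the collector below roughly 90% of available memory
  batch:                   # batch up telemetry before it is exported
  # a filter processor could also be added here to drop noise such as health-probe traffic
```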
Now that we've defined our receivers and processors, what do we do with the metrics? This is where exporters come in. So I'll collapse the receivers and processors and expand the exporters section, which basically defines what we want to do with the data coming in. I have a bunch of exporters; in my tracing video, I covered a few of these. I have a file exporter that writes my traces out to a file, and that is the data volume I showed earlier, where I can diagnose the telemetry coming in. I do the same here for metrics: I use the file exporter and write all the metrics to an output log file, so I can troubleshoot and see whether our metrics are coming through. The first part of the exporter key is the type (this is a file exporter), followed by a forward slash and a name, so we can give our exporter a name. You can put any name after the forward slash; I just give it a name like "metrics" so that I can tell the two apart: this one is for traces, this one is for metrics. We also use an OTLP exporter to send our traces to a Tempo database, so that one is specifically used for tracing.

Now, there are two options we can use for Prometheus. As Prometheus supports both a pull and a push model, it's up to you how you want to architect this. You can use option A, which is for Prometheus to scrape the metrics from OpenTelemetry, so your Prometheus database talks to this collector: you enable the Prometheus exporter and provide an endpoint on a port, which means the collector exposes a metrics endpoint for all the metrics coming in, and your Prometheus database can pull the metrics from your collector directly using the standard Prometheus pull mechanism. In this example, though, I'm going to show you option B, the remote write exporter, where I point the collector at a Prometheus remote write endpoint, so my OpenTelemetry collector will send the metrics on to a Prometheus database. And this is how we define that exporter.
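Sketched out, the exporters section might look something like this. The names after the slash, the file paths, and the endpoints are placeholders, not the exact values from my repo:

```yaml
# exporters section of config.yaml (sketch; names, paths and endpoints are illustrative)
exporters:
  file/traces:
    path: /data/traces.log            # dump traces to file for troubleshooting
  file/metrics:
    path: /data/metrics.log           # dump metrics to file for troubleshooting
  otlp/tempo:
    endpoint: tempo:4317              # send traces to a Tempo instance
    tls:
      insecure: true
  # Option A: expose a metrics endpoint for Prometheus to scrape
  prometheus:
    endpoint: 0.0.0.0:8889
  # Option B: push metrics to Prometheus via remote write
  prometheusremotewrite:
    endpoint: http://prometheus:9090/api/v1/write
```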
If we use option B, we have to ensure that our Prometheus instance is configured to accept remote write. Remote write is not turned on by default, so we have to tell Prometheus to expose that endpoint so OpenTelemetry can send metrics to it. If we go with option A, we have to set up a scrape target in our Prometheus config to tell it to scrape metrics from this collector. Both of them have their pros and cons, so you still have the option of a pull or a push model. OpenTelemetry just facilitates the collection of metrics and forwards them on to storage; it doesn't replace Prometheus.

The one cool thing that OpenTelemetry can do, however, is that when you start enriching logs, traces, and metrics, you can start stitching them together. OpenTelemetry allows you to inject trace IDs, so you can go from a trace to a log, or from a trace to a metric. That's some of the power that OpenTelemetry provides, and it's why it helps to use OpenTelemetry out in the wild to collect logs, metrics, and traces: you can start gluing these three things together.
Now that we have a receiver to collect the metrics, a processor to process them, and an exporter to send our metrics to a Prometheus database, we can finally bring this all together and enable it with a metrics pipeline. This is where the service section of the configuration comes in. So after the receivers, processors, and exporters, we finally take a look at the service section, which basically allows us to enable features of our configuration. In our case, we want to enable a pipeline. We have trace pipelines, and we've taken a look at how to configure a trace pipeline in our tracing guide, and then we have a metrics pipeline. Here I say which receiver I want to use; I'm going to use the same receiver, so all our traces and metrics come in over the same endpoint. I'll enable my batch processor, and I'll export my metrics to my Prometheus remote write exporter, so my OpenTelemetry collector will send the metrics out. I'll also use my file exporter, just for troubleshooting. And that's as easy as that: the service pipeline brings it all together for us and enables our receivers, processors, and exporters.
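Wired up, the service section looks roughly like this, reusing the placeholder exporter names from the sketches above:

```yaml
# service section of config.yaml (sketch; exporter names match the earlier sketches)
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/tempo, file/traces]
    metrics:
      receivers: [otlp]                      # same OTLP endpoint as traces
      processors: [memory_limiter, batch]
      exporters: [prometheusremotewrite, file/metrics]
```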
And just to show you, my Docker Compose file also has a Prometheus instance. I run the latest version of Prometheus, give the container a name, and expose port 9090, which is the default port for Prometheus. I enable the remote write receiver, which is something you have to do because it's not enabled by default. Then I pass in the configuration file I want to use and mount it in using a volume.
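As a sketch, that compose service might look like this; the image tag and paths are placeholders, while the remote write flag is the actual Prometheus flag that enables the feature:

```yaml
# Prometheus service under services: in docker-compose.yaml (sketch)
prometheus:
  container_name: prometheus
  image: prom/prometheus:latest                      # pin a version in practice
  command:
    - --config.file=/etc/prometheus/prometheus.yaml
    - --web.enable-remote-write-receiver             # accept remote write (off by default)
  ports:
    - "9090:9090"
  volumes:
    - ./prometheus/prometheus.yaml:/etc/prometheus/prometheus.yaml
```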
And that Prometheus config is very straightforward. In our case, we're using option B, which is to receive metrics via remote write, so the only thing I'm doing here is setting a scrape interval, although I'm not actually scraping anything. For option B, I technically don't need any scrape config, because my Prometheus instance will just sit idle and receive all the metrics via remote write. If I were to follow option A, I would have to come in here, set up a scrape config, and point it at my otel collector on the port I enabled in the exporter. That option would allow Prometheus to scrape OpenTelemetry, while OpenTelemetry still does all the collection of the metrics.
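If you went with option A, the Prometheus config would look something like this; the job name, target, and port are illustrative and must match whatever endpoint you set on the collector's Prometheus exporter:

```yaml
# prometheus.yaml scrape config for option A (sketch; target and port are illustrative)
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: otel-collector
    static_configs:
      - targets: ["otel-collector:8889"]   # the Prometheus exporter endpoint on the collector
```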
And getting this all up and running is very simple: I just say docker compose up, and that will run my whole videos application, which has tracing, logs, and metrics enabled. This is the solution we've been using throughout this OpenTelemetry series. With Docker Compose up and running, I can open up my browser, go to localhost, and we can see our video catalog application is running. I can hit refresh a couple of times to generate some metrics, and then head over to Prometheus on port 9090. If I type "http", we can see we already have some HTTP statistics coming through: metrics are being received by this Prometheus instance over remote write via OpenTelemetry. I also have a small Grafana instance running as part of this Docker Compose setup, which you can take a look at. You can access Grafana on localhost port 3000; the username and password are just admin/admin. You can click on Dashboards, open the OpenTelemetry HTTP dashboard, and you'll see some HTTP metrics coming through for our route. If you refresh the videos page a couple of times and look at the last 15 minutes, we can see our requests coming in, our success rate, the number of requests on the route, and our latency percentiles. So that's just a simple Grafana dashboard that you can play around with.
Now, if you want to follow along with the series, the source code for this is on GitHub. You can head over to the Docker Development YouTube series GitHub repo and go down to the monitoring folder. In the monitoring folder, we have an OpenTelemetry folder with a readme, and this readme has links to the guides on logs, how OpenTelemetry works, the tracing guide, as well as the metrics guide. The metrics guide will tell you how to build the applications, run them with Docker Compose, generate some traffic, and access Grafana.

Hopefully this video helps you understand the fundamentals of how to deal with metrics using Prometheus and OpenTelemetry. Let me know down in the comments what your experience with OpenTelemetry has been, what your observability platform looks like, and what sort of content you'd like me to cover in the future. If you liked the video, be sure to like and subscribe and hit the bell so you know when I upload next. And if you want to support the channel even further, be sure to hit the join button down below to become a YouTube member. As always, thanks for watching, and until next time, peace.
My DevOps Course 👉🏽 https://marceldempers.dev
Patreon 👉🏽 https://patreon.com/marceldempers
Check out the source code below 👇🏽 and follow along 🤓
Also, if you want to support the channel further, become a member 😎 https://marceldempers.dev/join
Check out "That DevOps Community" too https://marceldempers.dev/community

Source Code 🧐
--------------------------------------------------------------
https://github.com/marcel-dempers/docker-development-youtube-series

Like and Subscribe for more :)

Follow me on socials! https://marceldempers.dev
X | https://x.com/marceldempers
GitHub | https://github.com/marcel-dempers
LinkedIn | https://www.linkedin.com/in/marceldempers
Instagram | https://www.instagram.com/thatdevopsguy

Music:
Track: Reckoner - lofi hip hop chill beats for study~game~sleep | licensed under a Creative Commons Attribution licence (https://creativecommons.org/licenses/by/3.0/)
Listen: https://soundcloud.com/reckonero/reckoner-lofi-hip-hop-chill-beats-for-studygamesleep
Track: souKo - Parallel | licensed under a Creative Commons Attribution licence (https://creativecommons.org/licenses/by/3.0/)
Listen: https://soundcloud.com/soukomusic/parallel