All right, good morning everybody, and
thanks for joining. Before we start, I'd
like to ask a quick question: how many of
you are using Kubernetes in your
organization today? Raise your hands.
Okay, pretty much most of us. Nice. If
you've been using it for some time, the
issues I'm going to talk about will
probably sound familiar. So
the first one: say you have a pipeline
that tags your images, and some pipeline
config change ends up tagging an image as
latest. Once you get to production, the
code that is running is not what you
expected. It's a different version
altogether, because the tag was latest.
Or you have deployed a pod and it keeps
crashing. When you look at the describe
output, it says there are no compute
resources available. One reason could be
that other pods in your namespace are
using up all the memory or CPU, because
no resource limits are set in the pod
config. Or your security team assigns you
a ticket saying you're running as the
root user in your pod and you need to fix
that. Or you have images being pulled
from public URLs, which you should not be
doing, so there's another ticket for
that. Those are the kinds of issues I'm
talking about. So
how can we fix this? We are identifying
all of these issues once we're in
production, which means we're fixing them
after deployment. But these are all
common issues that could be fixed ahead
of time. How we do that is the
shift-left approach we'll look at in this
session, using an open source framework
that has all of these policies. In
today's session we'll cover what the
problem is, why CIS checks are required,
and how we can use them to secure
Kubernetes clusters. We'll also talk
about an open source framework that
automates these CIS checks with Kyverno.
And it's not just theory: we'll go
through a working demo where we spin up a
cluster and automate it all the way from
zero to, I would say, 99% in under 15
minutes. Finally, we'll look at an
end-to-end pipeline and the metrics as
part of this. Okay, let me introduce
myself. I am Yugandhar Suthari. I've spent
almost 18 years in IT and more than six
years in cloud security. I currently
work as a security engineer at Cisco. I'm
also a contributor and reviewer on the
Kyverno project and a Golden
Kubestronaut. I also actively blog at all
thingsincloud.com. Please check it out.
All right, let's dive in. So coming to
why CIS benchmarks: Kubernetes is
powerful but not secure by default. It
provides building blocks that platform
engineers can use to secure a cluster in
different ways. The CIS benchmarks give us
a standard baseline that helps you secure
your cluster. These controls have
different areas of focus: your control
plane, your worker nodes, pod security,
and RBAC. Now imagine going through all 64
of these policies manually on each
cluster. At scale, say when you have
thousands of clusters, it's hard to
verify all of these CIS checks manually or
after deployment. So how can we manage
all of this at scale? That's where the
shift-left solution comes in, which we'll
talk about in detail.
Now coming to the shift-left
architecture. Here we have three stages
of validation. The first is plan-time
validation. When you work with Kubernetes
clusters, most of the time you're using
OpenTofu or Terraform, so
infrastructure-as-code tools. Think about
the issues we've been talking about. Take
your EKS cluster endpoints: often you get
a security incident saying your EKS
cluster endpoints are open to the
internet and you have to fix that, or
that audit logging is not enabled. What
if we could fix all of these ahead of
time? We're going to use OpenTofu. You
have your Terraform plan, so you can
convert that plan to a JSON file and use
Kyverno to scan it and ask: is this
blueprint compliant with my CIS checks?
Is it safe to deploy? Kyverno checks the
plan, and you can put that in as a
validation step. That way you're not
deploying anything insecure. We'll look
at that in detail. And then the next one
is deployment time. Now that your
cluster is created, the question is what
happens when developers deploy their
workloads. How can we block
non-compliant pods at that point? We load
all of these checks as policies, and
Kyverno evaluates them during the
admission process, which we'll also look
at in detail. And then we have runtime
validations. Your cluster is running a
lot of compute nodes, and there are 13
CIS checks that look at things like file
permissions and audit configuration on
the nodes, which Kyverno cannot access.
For those I have a node scanner, deployed
as a DaemonSet in your cluster, that does
these checks. The first two stages can be
done with Kyverno. All of this is
automated in the open source project
we'll be looking at.
All right. I know the previous sessions
covered the basics, but let me quickly
walk through the Kubernetes request flow
and where Kyverno comes in. Whenever you,
as a user, run a kubectl apply command to
create or update resources, the first
thing that happens is a request goes to
your API server. Authentication and
authorization happen first, and then your
workload enters the admission process,
the admission controllers. That's where
Kyverno comes in. There are two major
phases in admission: mutating admission
and validating admission. Mutating
admission is where Kyverno can add
security defaults, like setting a
non-root user, or adding annotations or
labels that are required. Validating
admission is where Kyverno enforces your
policies and blocks pods that do not
meet them. So Kyverno acts as a policy
guard for your cluster during the
admission process. How it does that: it's
a webhook-based admission controller. You
have the Kyverno admission controller,
which takes care of mutations and
validations. Then you have the background
controller: say you have workloads that
are already running, the background
controller is what checks those, and
we'll look at that. There's a cleanup
controller as well, which cleans up
workloads that are no longer used or have
expired. And then you have the reports
controller, which generates the audit
reports you need. Once admission is
complete, the object is persisted to
etcd. So Kyverno ensures that only
compliant, secure, policy-aligned
workloads enter your cluster.
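
To make the mutate-then-validate order concrete, here is a minimal Python sketch of the flow described above. It is not Kyverno's implementation: the function names, the `team` label, and the dictionary standing in for etcd are all assumptions made for illustration.

```python
import copy


def mutate(pod):
    """Mutating step: return a copy of the pod with defaults filled in."""
    pod = copy.deepcopy(pod)
    labels = pod.setdefault("metadata", {}).setdefault("labels", {})
    labels.setdefault("team", "unassigned")  # hypothetical required label
    for container in pod["spec"]["containers"]:
        ctx = container.setdefault("securityContext", {})
        ctx.setdefault("runAsNonRoot", True)  # default only when unset
    return pod


def validate(pod):
    """Validating step: return a list of violation messages (empty = ok)."""
    violations = []
    for container in pod["spec"]["containers"]:
        if container.get("securityContext", {}).get("runAsNonRoot") is False:
            violations.append(
                f"container {container['name']}: must not run as root"
            )
    return violations


def admit(pod, store):
    """Run mutation, then validation; persist only admitted objects."""
    mutated = mutate(pod)
    violations = validate(mutated)
    if violations:
        return False, violations
    store[mutated["metadata"]["name"]] = mutated  # stand-in for etcd
    return True, []
```

Calling `admit(pod, store)` with a pod that explicitly sets `runAsNonRoot: false` returns `False` with a message and leaves `store` untouched, which mirrors the point that only admitted objects reach etcd.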
All right. Now that we've seen where
Kyverno comes into the picture in the
Kubernetes flow, let's quickly take a
look at cluster policies. I took one of
our CIS check policies, the one that
blocks privileged containers. A few
things are important here, starting from
the first line. You have the API version
of Kyverno that you're using, and then
the policy type, which here is a cluster
policy. Another important thing is the
validation failure action. There are two
modes: enforce and audit. Enforce is
where it blocks. Audit still allows the
resource, but when you look at the report
you'll know there's an issue. We'll look
at that in practice. When it comes to CEL
policies, here's how the evaluation
works. On the left side, we're checking
object.spec.containers, and for each
container whether it has a
securityContext, whether that has a
privileged field, and whether it is set
to true or false. The right side is your
pod config. So Kyverno takes the object,
which is the pod, goes to
spec.containers, then securityContext,
then privileged. Based on that, and on
the mode, enforce or audit, it decides
during the admission process whether to
allow or deny the pod. So the
complete demo that we're doing is in the
open source repository; you can scan this
QR code to get to it. It has all of the
CIS checks automated, all 64 checks that
we're talking about: runtime as well as
plan time, and the node scanner. We're
going to walk through that. All right,
let's go on. I tried recording the
session as well, but let me walk you
through it.
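
The privileged-container check and the two failure modes described earlier can be approximated in plain Python. This is a sketch of the logic only, not the policy from the repository: the function names and the shape of the returned dictionary are assumptions, and the mode strings simply reuse the names from the talk.

```python
def privileged_containers(pod):
    """Names of containers whose securityContext.privileged is true."""
    names = []
    for container in pod.get("spec", {}).get("containers", []):
        ctx = container.get("securityContext") or {}  # absent field = not set
        if ctx.get("privileged") is True:
            names.append(container.get("name", "<unnamed>"))
    return names


def evaluate(pod, action="Enforce"):
    """Apply the check; Enforce denies on failure, Audit only records it."""
    offenders = privileged_containers(pod)
    if not offenders:
        return {"allowed": True, "result": "pass", "message": ""}
    message = "privileged containers are not allowed: " + ", ".join(offenders)
    return {
        "allowed": action != "Enforce",  # Audit lets the pod through
        "result": "fail",
        "message": message,
    }
```

In both modes the result is recorded as a failure; the only difference is whether the request is admitted, which is why an audited pod still shows up as failed in the report.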
All right, so this is the open source
project I was talking about. Let me
quickly walk you through the repository
layout. Starting with K8s: I'm deploying
a node scanner as a DaemonSet to scan
node file permissions and so on, and
that lives here. Then we have OpenTofu
here. When you spin up a Kubernetes
cluster you're generally using Terraform
or OpenTofu; I'm using OpenTofu. I'm
purposefully deploying some
non-compliant resources here, for example
no CloudWatch logs, no EKS add-ons, no
network policies, so we can see how
Kyverno identifies these violations. I
put in both compliant and non-compliant
configurations just to try it.
And then we have policies. This is the
important part. All of the policy types
we were talking about are here: control
plane, pod security, RBAC controls. Each
of them is a policy, and they're all
tested and production ready. Like I
said, they all use CEL expressions to
validate the checks, which we'll see
running in a minute. Then you have your
OpenTofu policies. Here you generate a
plan, convert it to JSON, and verify it
against the policy; we'll look at exactly
what the policy checks in a minute. And
then you have your scripts here. I'm
using these scripts to test on the kind
cluster. All of these are integrated into
the pipeline, and all of those checks are
done there. I have also created unit
tests that I run whenever I make changes.
All the tests have compliant and
non-compliant cases, so you can integrate
them into your pipelines for unit testing
or integration testing. Right? So let's
start
with cloning. I cloned this repository
before the session to save time, and now
I'm starting the practical demo. The first
step is creating a kind cluster from
scratch; I already created it about 5
minutes before this session. So now we
have a Kubernetes cluster that is up and
running. Let's check whether there are any
cluster policies. There are zero cluster
policies right now. What we're going to do
is create a namespace and try to deploy a
non-compliant pod, which has root
permissions. Let's take a look at that.
It's still taking some time. So we're
going to deploy this non-compliant pod,
then install Kyverno, and see whether it
blocks or identifies it.
So if you see here, there were no
policies set up, so the pod is deployed
without any issues because there are no
controls in place. Which means you now
have an issue in production. So now
in phase two we're installing Kyverno.
I'm currently using version 1.15.2. If
you look at this, like we discussed, it
installs the Kyverno admission
controller, the background controller,
the cleanup controller, and the reports
controller, as you can see. The
admission controller is responsible for
your mutations and validations, and apart
from that it can also handle image
verification and scanning, including
where images can be deployed from. I'm
just waiting for the Kyverno pods to come
up. Once they are up, since we already
have a non-compliant pod, we're going to
check it. I already have background
processing set to true, so Kyverno also
checks existing workloads for issues. It
doesn't block that pod, because it's
already running; it puts the result in
the reports. Here I'm applying all of the
policies, all 49 CIS checks from the
repository. Let's give it some time to
load them. All of these policies, like we
saw, are part of the admission
controller, and Kyverno uses them to
validate. Okay, it's going to take some
time. So
now you can see all of those policies
are deployed. Once your policies are
deployed, I generate the reports, and
here you can see a lot of failures. Let's
take a look at the namespace we created.
This is the non-compliant pod that we
deployed, and you can see that some
checks passed and a lot of them failed.
We could use a describe command to look
at each of those policies, but what we're
going to do is delete the non-compliant
pod and then try deploying it again.
Previously it was allowed. Now when we
try to run it, it should be blocked.
Yes, you can see there's an error.
During the admission process it tells
you what failed. You have the control
identifier, something like 5.2, which is
the CIS control number, and a message
like "denied: privileged containers are
not allowed". So you can see why it
failed. These are all part of the
validation messages the user gets if
something is not compliant. So it's
blocking. I'm also deploying a compliant
pod, which shows you what happens in
that case.
So in phase one we deployed non-secure
pods, then we put Kyverno in place,
applied the policies, and looked at
whether it blocks or not. That's how you
verify your checks at deployment time.
Now we're going to look at the plan-time
tests. For this I have my OpenTofu
directory. What we generally do is run
tofu init, which downloads your modules,
and then tofu plan, which generates your
plan file. I then convert this to a JSON
file, pass the JSON to Kyverno, and say:
check whether the cluster I'm creating
is compliant or not. You can see that the
plan is ready here. I'm deploying a
non-compliant configuration, and we're
going to check what happens.
Okay. All right.
It looks like I'm deploying both the
compliant and non-compliant
configurations, so it's creating 18
resources. All right.
Let's validate that. The only command
I'm running here is kyverno json scan,
with the policies for OpenTofu, and I'm
passing it the tofu plan. Terraform or
OpenTofu only produces a TF plan, so you
convert that to JSON, and then you can
use the Kyverno CLI to scan the JSON
file. Let's quickly take a look. You can
see that for this plan CIS 5.1 failed
for the IAM authenticator, and a lot of
other checks are failing. Let's take a
look at how we're comparing those. Let me
go back to the repository. If you look at
the OpenTofu plan, the non-compliant one,
this is your plan. To see how we're
validating it, let's take one of the
examples from the Terraform policies. Let
me just pretty-print this; is it Command
Shift P?
So what it does is, when you look at
the plan, you have the values, and it
checks them. Okay, so this is the
non-compliant one. The way we check is
that within the plan it goes to
planned_values, then root_module, and
then the resources within it. For the
EKS add-on, for example, it checks
whether the VPC is set and with what
values. Similarly, for the private
cluster that we were looking at, it
checks whether the endpoint is enabled,
true or false. That's how it validates
your Kubernetes configuration, and it
tells you whether it's compliant or not.
So that's what we're doing with Kyverno.
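
As a rough illustration of the plan-time walk just described (plan, then planned values, then root module, then resources), here is a Python sketch that inspects a plan already converted to JSON. It is not the Kyverno policy from the repository. The attribute names `vpc_config`, `endpoint_public_access`, and `enabled_cluster_log_types` are assumed from the AWS provider's `aws_eks_cluster` resource, the finding messages are made up, and the sketch looks only at root-module resources, not child modules.

```python
import json


def load_plan(path):
    """Read a plan that was already converted to JSON."""
    with open(path) as handle:
        return json.load(handle)


def eks_clusters(plan):
    """EKS cluster resources under planned_values.root_module.resources."""
    root = plan.get("planned_values", {}).get("root_module", {})
    return [r for r in root.get("resources", [])
            if r.get("type") == "aws_eks_cluster"]


def check_plan(plan):
    """Return a list of findings; an empty list means no issue was found."""
    findings = []
    for resource in eks_clusters(plan):
        address = resource.get("address", "<unknown>")
        values = resource.get("values", {})
        vpc = (values.get("vpc_config") or [{}])[0]
        # Assumption: treat a missing setting as public.
        if vpc.get("endpoint_public_access", True):
            findings.append(f"{address}: API endpoint is publicly accessible")
        if not values.get("enabled_cluster_log_types"):
            findings.append(f"{address}: control plane logging is not enabled")
    return findings
```

A pipeline step could call `check_plan(load_plan("plan.json"))` and fail the build when the list is non-empty; the file name here is only an example.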
All right. So that's how you can
validate at plan time using OpenTofu and
Kyverno. Now moving on to the last one,
which is runtime. We have a node scanner
that I'm deploying as a DaemonSet within
your cluster, which checks the node-level
controls. It's going to take some time.
While we're waiting: like I said, we have
the plan-time tests, then the runtime
tests with Kyverno, and the last one is
the node scanner. Kyverno cannot
validate file permissions and similar
settings on the node, so we're using a
node scanner, which is a script that
deploys a DaemonSet. I'm just waiting
for it to come up.
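
For a sense of what a node-level file permission check involves, here is a small Python sketch. It is not the node scanner from the repository: the function names are hypothetical, and the caller supplies both the paths and the permission limits, since the exact limits depend on the benchmark version in use.

```python
import os
import stat


def check_mode(path, max_mode):
    """True if the file has no permission bits beyond max_mode."""
    actual = stat.S_IMODE(os.stat(path).st_mode)
    extra_bits = actual & ~max_mode  # bits set on the file but not allowed
    return extra_bits == 0, actual


def scan(limits):
    """Check each path against its limit; missing files are skipped."""
    results = []
    for path, max_mode in limits.items():
        if not os.path.exists(path):
            results.append((path, "skip", "not found"))
            continue
        ok, actual = check_mode(path, max_mode)
        results.append((path, "pass" if ok else "fail", oct(actual)))
    return results
```

A call such as `scan({"/var/lib/kubelet/config.yaml": 0o600})` uses an illustrative path and limit. Run inside a pod, a check like this would also need the node's filesystem mounted into the container, which is one reason it is deployed per node rather than handled at admission.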
So with that, you are performing
everything from scratch. We created a
kind cluster, deployed a non-compliant
pod, installed Kyverno, verified whether
it blocks or not, and deployed all of our
policies, the complete set of CIS check
policies. With that you have a compliant
cluster end to end, from scratch. Now,
how can we integrate all of this into a
pipeline? I created a GitHub Actions
pipeline, and everything we just did
manually, one by one, I've integrated
into GitHub Actions. You can see I ran it
yesterday, I believe. All of the tests
we've been looking at are integrated
within the pipeline: installing the CLI,
running the unit tests, and so on.
Similarly for the plan tests, I run
OpenTofu, convert to JSON, and use
Kyverno to validate. Then there's the
kind cluster integration, which checks
all of my CIS policies. And for the
reports: in order to validate that my
checks are working correctly, I have also
integrated Trivy, using the Trivy CIS
scanner to check the compliance report.
There are some failures that it shows,
but that's how you can do it end to end,
with all of the reports.
So yeah, I think that's it. Any
questions?
>> Yeah, it looks like you're using
OpenTofu with the policies, but then
you're using Kyverno to scan the policies
and apply them against the TF plan,
>> right?
>> But vero itself can write the policies.
Why not just have policies in term that
>> Right. So when you talk about
policies, one kind is the CIS check
policies, which you can write in Kyverno.
But these are based on the Terraform plan
that you're generating. Even with
OpenTofu, I'm writing a Kyverno
validation policy. If you look at one of
these checks, I'm writing a Kyverno
policy. I'm not writing policies with
OpenTofu, just to make that clear.
OpenTofu is only used to generate the
plan, and then I'm writing those checks.
If you look at this repository, under
OpenTofu, I'm writing a Kyverno
validation policy.
>> Oh, you're not. Sorry.
So if you look at policies, you have
OpenTofu, and let's quickly go to
networking. All of these are Kyverno
policies. You can see that it's using the
Kyverno v1alpha1 API version with kind
ValidatingPolicy. OpenTofu cannot
generate these; I'm writing them myself.
I'm only using OpenTofu to plan.
>> so those are just policies
>> right
>> and then open what
>> OpenTofu is like Terraform, an
infrastructure-as-code tool that is used
to provision your cluster and so on.
>> It just gives you a
>> Yes, it just gives you the TF plan.
>> So if I took the TF plan and ran the
scan against it, I don't really need
OpenTofu.
>> Yes, you don't need OpenTofu. If you
already have the plan output, then you
don't need it.
>> Yeah. Um
>> Please share your feedback. You can
scan the QR code and give me feedback on
the session. Any other questions? All
right. Thank you.
>> [applause]
Shift Left in Action: Automating CIS Compliance With Kyverno and CNCF Power Tools - Yugandhar Suthari, Cisco
True shift-left security means catching compliance issues before they reach production. This session shows how to build a CNCF-native CIS compliance automation framework using Kyverno as the policy orchestrator, with kube-bench and OpenTofu for full-spectrum validation, from infrastructure plans to running clusters. Learn how CEL-based ValidatingPolicies enable unified plan-time and runtime checks, how to integrate OpenTofu for automated IaC scanning, and how kube-bench closes the loop with node-level audits. Live demos will showcase how this framework enforces CIS benchmarks across EKS, AKS, and GKE, proving that early, automated security validation truly scales.