2025 was the year agentic AI went from interesting experiment to daily reality. AI agents stopped being autocomplete tools and started understanding entire code bases, refactoring across files, writing tests, debugging their own mistakes, managing infrastructure, and handling operational tasks. For many developers and ops engineers, the way they work fundamentally changed. 2026 is the year to take this seriously, not just for application developers, but for DevOps engineers, SREs, and platform teams. The tools are mature enough now. The productivity gains are real. If you're not integrating AI agents into your workflow, you're leaving significant value on the table.

But here's the thing: agentic AI doesn't replace everything else. You still need solid foundations: internal developer platforms, testing frameworks, scripting languages, development environments. These tools still matter. What's changed is that AI now intersects with all of them. So this year's recommendations cover both the AI tools that emerged in 2025 and the non-AI tools that remain essential. I spent 2025 testing all of them in real projects, real workflows, real problems, not quick demos. What I am sharing today are my recommendations for 2026. It's not a comprehensive list, nor a neutral comparison. These are the tools I actually use, the ones that survived months of daily work, and the ones I think you should seriously consider. I will also point out what didn't make the cut and why. Some of my choices will be obvious, others might surprise you. A few will probably make you disagree, and that's fine. The goal isn't to tell you what to think; it's to give you a practitioner's perspective so you can make better decisions for your own stack.
The AI model landscape is evolving at a pace that makes any definitive ranking obsolete within months, maybe even weeks. The model that leads benchmarks today will likely be surpassed tomorrow. New releases drop constantly, capabilities leap forward, and pricing changes overnight. Any recommendation here comes with a built-in expiration date. That said, some patterns are emerging. Certain model families consistently deliver top results for software engineering tasks: understanding code, generating accurate implementations, reasoning through debugging scenarios, and working with configurations and manifests. The leaders in coding benchmarks tend to stay near the top even as new models appear, which suggests that some providers have figured out what matters for engineering work.

For open source options that you can run locally or self-host on your own servers: Llama 4 offers a massive 10 million token context window, Qwen delivers strong coding performance with an Apache 2.0 license, and DeepSeek provides impressive capabilities at a fraction of the price. Those models are good and they're getting better, but they're still not quite on par with closed models. They might be close, but they're not there yet. The other challenge is compute requirements. You might not have the hardware needed today, and those requirements are likely to increase as models grow larger. Among the proprietary alternatives, OpenAI's GPT remains widely used with the largest ecosystem. Mistral offers a strong European alternative with frontier performance at reduced cost. Cohere focuses on enterprise RAG use cases, and xAI's Grok brings real-time data access.

That being said, Anthropic's models are my choice for software engineering work. Claude consistently leads SWE-bench and other coding benchmarks, often by a meaningful margin. This isn't accidental. Anthropic has made software engineering a core focus in a way that other providers haven't. Their first AI conference was entirely dedicated to coding and developers. Claude Code is the best terminal-based agent available. Cursor, the leading AI-native IDE, uses Claude as its default model, or at least it used to; maybe that changed recently, I'm not sure. And the results speak for themselves. Claude understands code bases deeply, generates accurate implementations, and reasons through complex debugging scenarios better than alternatives. For infrastructure as code, Kubernetes manifests, and configuration files, the difference is noticeable. The model understands context and produces code that works on the first try more often than competitors. Claude 4 now supports a 1 million token context window, matching Gemini's previous advantage in this area. Now, the trade-off is cost. Claude's API pricing is higher than Gemini's, and rate limiting on the consumer plans can be frustrating for heavy users, but for professional software engineering work, the quality difference justifies the price.

Google Gemini is a close second. Cost effectiveness is excellent, with a generous free tier and competitive API pricing. Multimodal capabilities are strong, really good, if you need to work with images, diagrams, or documentation that includes visuals. Now, Gemini can feel less refined than Claude for some coding tasks. The output quality is good, but not quite at Claude's level for complex software engineering work. That said, the gap has been narrowing, and for many use cases Gemini delivers excellent results at lower cost. If budget is a primary concern, Gemini is a solid choice.
We are witnessing a seismic shift in how software engineers work. AI agents have moved from novelty to necessity for many application developers. They're not just autocompleting lines anymore. They are understanding entire code bases, refactoring across multiple files, writing tests, and even debugging their own mistakes. The productivity gains are real, and they're substantial. However, this shift hasn't fully reached the ops world yet. DevOps engineers, SREs, and platform teams are still largely working the way they did before AI agents became mainstream. Part of this is the nature of the work. Ops tasks often involve production systems where mistakes have immediate consequences, complex debugging across distributed systems, and tribal knowledge about why things are configured a certain way. AI agents struggle with context that lives in runbooks, Slack threads, and people's heads. That said, 2026 is likely the year this changes. The tools are maturing rapidly, context windows are expanding, and agents are getting better at understanding infrastructure as code, Kubernetes manifests, and cloud configurations. If you're in ops and haven't started experimenting with AI agents, now is the time to get familiar. The learning curve exists, and you don't want to be climbing it when everyone else has already integrated these tools into their workflows. Now, two directions are emerging in this space. The first follows traditional development patterns through IDEs, where AI augments the familiar editing experience with inline completions, chat panels, and context-aware suggestions.
The second is terminal-based agents that take a different approach, often operating more autonomously and integrating better with the command-line workflows that ops teams already use. Another key distinction is model flexibility. Some agents are tightly coupled to specific models, which can mean better integration with that model's strengths, but also vendor lock-in. Others are model-agnostic, letting you swap providers or use local models at the cost of potentially less optimization.

In the IDE camp, GitHub Copilot remains the most widely adopted, with seamless editor integration and reliable completions, though it struggles with large-scale refactoring. Windsurf offers a great beginner experience with unlimited agent access and a planning mode for multi-step tasks. Open source IDE options like Continue and Cline provide model-agnostic alternatives but lack the polish of commercial offerings. For terminal-based agents tied to specific models, Gemini CLI brings Google's massive 1 million token context window to the command line with a generous free tier, though benchmark scores lag behind competitors. OpenAI Codex offers flexible reasoning levels and strong GitHub integration, but has UX challenges. Cloud-specific options like Amazon Q Developer excel within their ecosystems but offer limited value outside them.

My recommendation comes down to three tools that cover different use cases. I prefer terminal-based agents, but some people prefer working in IDEs, so no judgment on that one, at least not today. Within the terminal camp, some want the best experience regardless of vendor lock-in, while others prioritize model flexibility. All three are valid choices depending on your priorities. For IDE users, Cursor is the clear winner. It's a VS Code fork with AI deeply integrated into the editing experience. You get inline completions, chat, and precise context control. The UI-first experience is polished and fast. The downsides are rate limits that can be frustrating for heavy users and pricing changes that have upset some of the community. But if you live in your IDE and want AI assistance without leaving it, Cursor is the best option available, hands down. For terminal users who want the best experience, Claude Code is in a league of its own. It has the highest SWE-bench scores, a true 200K token context window, and excels at autonomous multi-file operations. It understands entire code bases, and it can work through complex tasks with minimal handholding. The catch is that it only works with Anthropic models. If you're comfortable with that lock-in, Claude Code delivers results that other terminal agents cannot match. I use it daily. Now, for terminal users who want model flexibility, OpenCode is the best choice. It is a true open-source Claude Code alternative that works with any model provider. It's still behind Claude Code in capabilities, and it has a smaller community, but it is the right choice if you want to avoid being locked to a specific family of models. As the model landscape continues to shift, having the freedom to switch providers without changing your tooling has real value.
We have entered the phase where companies need to build their own AI agents. Off-the-shelf agents are great for general-purpose tasks, but every organization has unique workflows, internal tools, and domain knowledge that generic agents cannot tap into. This is especially true for internal developer platforms. If application developers are increasingly using AI agents as their primary interface for getting work done, then platform teams need to expose platform capabilities to those agents. Otherwise, developers will be constantly context-switching between their AI assistant and the platform portal, and that defeats the purpose of both. The way to bridge this gap is through Model Context Protocol (MCP) servers and custom agents. MCP allows AI agents to discover and use tools exposed by your platform. Custom agents can encode your organization-specific workflows, policies, and tribal knowledge. Together they enable scenarios like developers asking their AI agent to provision a new environment, check deployment status, or investigate a production incident, all without leaving their normal workflow, because, remember, their workflows are agentic.

Now, building custom agents requires SDKs and frameworks. The landscape here is still maturing, with options ranging from low-level SDKs that give you maximum control to high-level frameworks that handle orchestration, memory, and tool management for you. The right choice depends on how much complexity you want to manage yourself versus how much you want abstracted away. Given how fast the model space is moving, I believe we should be building agents in a way that allows us to switch models at any time. That makes SDKs tied to a specific vendor a bad choice, even if they offer features that might not be available elsewhere. Until clear, long-term winners in the model space emerge, our agents must be, and I repeat, must be agnostic. That rules out vendor-specific options like the Anthropic SDK, Google ADK, and OpenAI Agents SDK as primary choices, despite, and I repeat, despite their polish and tight integration with their respective models. Microsoft's Semantic Kernel and AutoGen offer more flexibility but still lean heavily into the Azure ecosystem.
For multi-agent orchestration, CrewAI has gained some traction with its intuitive role-based approach. LangGraph extends LangChain with a graph-based architecture offering low latency and time-travel debugging. For simpler use cases with type safety, Pydantic AI offers a FastAPI-like experience. If you're building for Kubernetes environments specifically, kagent is the first open-source agentic AI framework designed for Kubernetes, and now a CNCF project, with built-in integrations for Argo, Helm, Istio, and Prometheus. However, I think kagent leaves a lot to be desired. It's useful for those wanting to create agents quickly, but not for custom agents that can be taken seriously. There's more to agents than a way to define a system prompt and connect it to MCPs. kmcp is more interesting, as it helps you build, test, and deploy MCP servers to Kubernetes with proper life cycle management through CRDs. That's an interesting choice when building custom agents.
Vercel AI SDK stands out as the best choice. The primary reason is model agnosticism. It supports dozens of providers, like OpenAI, Anthropic, Google, Cohere, DeepSeek, and many more, through a unified API. You can swap providers without changing your code, which aligns with the principle that our agents must not be locked to specific models. Beyond provider flexibility, Vercel AI SDK offers a simplicity that other frameworks lack. Where LangChain requires instantiating objects and managing abstractions, Vercel AI SDK uses simple function calls with less boilerplate. The learning curve is gentler, streaming is built in, and React hooks like useChat and useCompletion make real-time UIs trivial to implement. It works across frameworks, including Next.js, React, Svelte, Vue (however that is pronounced), Nuxt, and Node.js. The SDK is actively maintained, with AI SDK 5 adding agentic loop control and type-safe chat and tool enhancements. The documentation is excellent, and if you need LangChain's complex orchestration for specific use cases, Vercel AI SDK integrates with it, giving you the best of both worlds. The downside is that it is TypeScript-first. If you're not familiar with TypeScript, you will need to either learn it or look for an alternative. That being said, if you're an experienced developer, picking up a new language shouldn't be a significant barrier, right?
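The model-agnosticism principle itself is language-neutral and easy to see in code. Here is a minimal sketch in Python; all names are illustrative and not taken from Vercel AI SDK or any real provider library. The point is that the agent depends only on a tiny interface, so switching providers never touches agent logic.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only thing the agent knows about a model provider."""
    def complete(self, prompt: str) -> str: ...


class FakeProvider:
    """Stand-in for a wrapper around OpenAI, Anthropic, Google, etc."""
    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] answer to: {prompt}"


class Agent:
    def __init__(self, model: ChatModel):
        self.model = model  # swap providers without changing agent code

    def run(self, task: str) -> str:
        return self.model.complete(task)


agent = Agent(FakeProvider("provider-a"))
print(agent.run("provision a dev environment"))
# -> [provider-a] answer to: provision a dev environment

# Switching providers is a one-line change:
agent.model = FakeProvider("provider-b")
```

A unified API like Vercel AI SDK's plays the role of the `ChatModel` interface here: your code targets the abstraction, and the provider behind it is a configuration detail.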
Now, despite the name, this category isn't just about reviewing application code. It covers AI-powered review of anything you push to Git: Kubernetes manifests, Terraform configurations, Helm charts, CI/CD pipelines, shell scripts, documentation. If it lives in a repository and goes through a pull request, these tools can review it. This is important. AI code review is a no-brainer adoption. The time investment is close to zero, since you just enable it on your repositories and it starts working. It's not obtrusive either. While these tools can often suggest fixes you can apply with one click, their main focus is not that. It is providing recommendations. You can take them into account or dismiss them. There's no forced workflow change, no new tool to learn, no context switching. The AI reviews your PR in parallel with human reviewers, and you decide what feedback is valuable. Easy. For ops teams, this is particularly useful. Misconfigurations in Kubernetes manifests, security issues in Terraform, missing best practices in Helm charts: these are exactly the kinds of issues that slip through human review, because reviewers are focused on the logic, not the YAML structure. AI reviewers don't get tired, and they don't skip the boring parts.

The options range from simpler tools like Sourcery and CodeAnt AI that handle basic reviews to more sophisticated solutions. Traycer identifies edge cases and performance issues but requires a paid plan after the trial. Greptile builds a full semantic graph of your repo for cross-file bug detection with SOC 2 compliance, though it can be noisy, very noisy. Cubic focuses on speed and learns from your feedback, but is GitHub-only. Qodo Merge, on the other hand, which was formerly PR-Agent, or something like that, is open source and benchmarks as the fastest and most thorough, with RAG-powered search across all your repos.

Now, my choice is something else. My choice is CodeRabbit. The primary reason is ease of adoption. Setup takes minutes, or seconds, with minimal configuration. It works across GitHub, GitLab, Azure DevOps, and Bitbucket. The reviews provide line-by-line feedback that resembles what you would get from a senior developer, not just high-level summaries. It learns from your interactions over time, adapting to your codebase and team preferences. What sets CodeRabbit apart is its MCP server integration. That's the part I like the most. You can connect it to Claude Code, Cursor, or any other MCP-compatible agent. This means you can write code in your agent, create a PR, get CodeRabbit's review, then ask your agent to fetch those review comments and implement the fixes, all without leaving your workflow. The loop between writing code and addressing review feedback stays within a single agent session. For teams already using AI agents for development, this integration is significant. Qodo Merge is a solid alternative, especially if you need self-hosting for strict security requirements or you want open-source transparency. It uses RAG to search across repositories for context. But CodeRabbit's ease of setup, MCP integration, and broad platform support make it the better default choice for most teams. The pricing is reasonable, with a free tier for basics and paid tiers for more features.
The database landscape has shifted beyond traditional SQL and NoSQL databases. There's now a pressing need to provide data to AI models through agents. Due to context limitations, you cannot just dump all your data into an LLM and hope for the best. You need to find the data that matters for each specific query, which means semantic search through embeddings. This is where vector databases come in. They store embeddings, which are numerical representations of text, code, images, or any other content, and let you find semantically similar items quickly. When a developer asks an AI agent about a production incident, the agent can search your runbooks, past incidents, and documentation using semantic similarity rather than keyword matching. The results are dramatically better. And the market has responded in two ways. Dedicated vector databases have emerged, purpose-built for storing and querying embeddings at scale. At the same time, existing databases are adding vector capabilities, so you don't have to manage another system. If your data already lives in PostgreSQL, adding pgvector might make more sense than migrating to a dedicated solution.

If you want to add vector capabilities to databases you already run, pgvector extends PostgreSQL, and Cloudflare Vectorize offers edge-native serverless vectors if you're in the Cloudflare ecosystem. These work well for smaller scale and when you want to avoid managing another database. For dedicated vector databases, Chroma offers the best developer experience for rapid prototyping, but it is limited to smaller data sets. Milvus scales to billions of vectors but requires engineering expertise. Weaviate was first to market with a deep feature set, including hybrid search and multimodal support. Pinecone is the easiest fully managed option, with zero ops and strong compliance certifications, but costs can grow quickly at scale.

My choice is Qdrant, for vector database work at least. It hits the right balance between performance, features, and cost. Written in Rust, it delivers excellent query speeds with minimal latency. The filtering capabilities are what set it apart. You can filter on payload values before the vector search happens, not after. This means queries like "find similar documents from the last 30 days" or "find similar incidents in the production environment" are fast, really fast, regardless of how many vectors you have. And the open source model matters here. You can run Qdrant locally for development, self-host it in your own infrastructure, or use Qdrant Cloud if you want a managed option. This flexibility is important for platform teams who need to keep data in-house or control costs at scale. Pinecone is easier to get started with, but the costs grow quickly as your data grows. Qdrant is significantly cheaper at scale while delivering comparable performance. Now, there are trade-offs. Qdrant has a steeper learning curve than simply adding pgvector to your existing PostgreSQL. If you only need basic vector search on a small data set, pgvector is simpler. Weaviate is better if you need sophisticated hybrid search combining text and vector queries. But for most AI agent use cases, where you need fast filtered semantic search at reasonable cost, Qdrant is the best choice.
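To make the "filter first, then search by similarity" idea concrete, here's a toy Python sketch. It is not Qdrant code; the vectors, payloads, and the `search` helper are all made up, but the order of operations mirrors what a vector database with payload pre-filtering does.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy "collection": each point has an embedding vector and a payload,
# similar in shape to points stored in a vector database.
points = [
    {"vector": [0.9, 0.1], "payload": {"env": "production", "text": "pod OOMKilled"}},
    {"vector": [0.8, 0.2], "payload": {"env": "staging", "text": "pod CrashLoopBackOff"}},
    {"vector": [0.1, 0.9], "payload": {"env": "production", "text": "invoice template"}},
]


def search(query_vector, env, limit=1):
    # Filter on payload values FIRST, then rank only the survivors by
    # similarity -- the pre-filtering order described above.
    candidates = [p for p in points if p["payload"]["env"] == env]
    ranked = sorted(candidates, key=lambda p: cosine(query_vector, p["vector"]), reverse=True)
    return [p["payload"]["text"] for p in ranked[:limit]]


print(search([1.0, 0.0], env="production"))  # -> ['pod OOMKilled']
```

In a real system the vectors come from an embedding model and the database maintains indexes so this stays fast at millions of points, but the query shape, a payload filter plus a similarity ranking, is the same.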
Internal developer platform, or IDP, is one of the most misunderstood concepts in our industry. Too many people equate IDP with a portal, a fancy web UI where developers click buttons. That's not a platform. That's just a front end. To understand what a platform really is, look at public cloud providers like AWS, Azure, or Google Cloud. They follow a clear pattern: services that do something, like EC2 spinning up VMs, S3 storing objects, or RDS managing databases; APIs that expose those services; and finally, user interfaces that consume those APIs, whether a web console, a CLI, an SDK, Terraform, or whatever else. The UI is just one of many ways to interact with the platform. It's not the platform itself. An IDP must follow the same pattern. You need services that actually do things, like provision infrastructure, deploy applications, manage secrets, and enforce policies. Those services must be exposed through APIs. Then you can build whatever user interface makes sense, whether that's a portal, a CLI, GitOps workflows, or all of the above. If you start with a portal, you're building a house starting with the roof.

So where do you run these services? Kubernetes controllers are the most logical choice. They're designed to reconcile desired state with actual state, which is exactly what platform services need to do. How do you expose APIs? Well, Kubernetes custom resource definitions, or CRDs, give you a declarative API for free. How do you interact with those APIs? Any way you already interact with Kubernetes: kubectl, Helm, GitOps tools like Argo CD or Flux, dashboards, or custom web UIs. The portal becomes just another client, not the center of the universe.

Now, many options in this space offer partial solutions or are built on what I would consider obsolete foundations. If you only need, only and exclusively, a developer portal, Roadie, Port, or Cortex can work, but the portal alone is not the platform. Commercial-only solutions like, let's say, Harness, Mia-Platform, or Qovery bundle various capabilities together, but often with proprietary architectures that don't align with how modern platforms should be built.
Humanitec and Northflank get closer to true platform orchestration but still come with vendor lock-in concerns. Actually, scratch Humanitec. Don't do it. Northflank, let's say. Now, if you're building a platform in 2026, you should be building it on Kubernetes with Kubernetes-native components from the CNCF ecosystem. Services should be controllers, APIs should be CRDs, and the entire stack should follow the patterns that Kubernetes established. Anything else is either a partial solution or a step backward.

And that's when we come to the real deal, the BACK stack: Backstage, Argo CD, Crossplane, and Kyverno. That's my choice for building internal developer platforms. Each component has established itself as the leader in its domain. Backstage is probably the only widely adopted portal solution, with massive contributions from thousands of organizations. It provides the developer-facing UI layer. Argo CD, together with Flux, is the de facto standard for GitOps, handling continuous delivery and keeping cluster state synchronized with Git repositories. Crossplane is the most mature and widely adopted solution for building platform services as Kubernetes controllers with APIs exposed through CRDs. And finally, Kyverno has established itself as the standard for defining and enforcing policies across Kubernetes resources. All four projects are open source and owned by the CNCF. Argo CD and Crossplane are graduated projects, while Backstage and Kyverno are incubating and on track for graduation. They're mature, widely adopted, and well-maintained. The ecosystem integration between them is strong, with Crossplane providers spanning cloud platforms, databases, and SaaS applications, all manageable through Backstage portals, with Kyverno enforcing security policies.

The only significant piece missing from the BACK stack for a complete IDP is workflows, or CI pipelines. That space is already covered by a plethora of tools that have existed for a very long time, and they all do more or less the same thing. Whether you choose GitHub Actions, GitLab CI, Jenkins, Tekton, or any other CI tool matters less than getting the platform foundations right. Now, the investment required to build a BACK stack IDP is real, but less daunting than it might seem. Most companies running Kubernetes are already using Argo CD for deployments and Kyverno for policies. Extending their usage beyond current workloads is easier with Crossplane than with non-Kubernetes-native tools, since the patterns and workflows are already familiar. And when it comes to developer portals, there is no real alternative to Backstage. With the BACK stack, you get full control, no vendor lock-in, and a platform built on the same patterns as the public cloud.
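To make the services-as-controllers, APIs-as-CRDs idea concrete, here's what a platform API built this way might look like to a developer. This is an illustrative sketch only; the API group, kind, and fields are made up, and in practice they would be whatever your Crossplane compositions define:

```yaml
# A hypothetical platform API served by a Crossplane composition.
# A developer (or their AI agent) applies this resource; controllers
# reconcile it into real infrastructure, and Kyverno policies can
# validate it before it's admitted.
apiVersion: platform.example.org/v1alpha1
kind: Database
metadata:
  name: team-a-db
spec:
  size: small        # the platform maps this to concrete instance types
  region: us-east1
  backups: enabled
```

The developer never sees cloud provider details; they interact with the platform through kubectl, GitOps, a portal, or an agent, all of which are just clients of this same API.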
The gap between local development and production Kubernetes environments has always been painful. You can run a local Kubernetes cluster with Kind or minikube, but that doesn't help when your service needs to talk to dozens of other services, databases, message queues, and external APIs that only exist in a shared environment. You end up mocking everything, which means you're not actually testing against real dependencies, or you deploy to a shared dev cluster for every change, which is slow and creates conflicts with other developers. This category covers tools that bridge that gap. The approaches vary. Some create VPN tunnels to remote clusters. Some intercept traffic and route it to your local machine. Some spin up isolated virtual clusters. And some automate the build, deploy, test cycle. The goal is the same: let developers work locally with the speed and convenience of their own machine while still interacting with real services in a real cluster. For platform teams, this is an important piece of the developer experience puzzle. If developers cannot easily test their changes against a realistic environment, they will either skip testing, which leads to production issues, or demand expensive dedicated environments, which leads to infrastructure madness. The right tooling here pays for itself quickly.

The oldest approach is VPN-based tunneling. Telepresence, for example, pioneered this space, but it comes with a finicky setup, compatibility issues with service meshes and corporate VPNs, and a requirement for root access. Gefyra (or however it's pronounced) was born from Telepresence frustration, offering a simpler Docker-based approach that doesn't modify running workloads. For automating the build, deploy, test cycle, Skaffold brings Google-backed maturity with declarative YAML configuration. Tilt offers a browser UI showing build status and logs, with Starlark configuration for flexibility. DevSpace provides file sync, port forwarding, and dev containers across all major clouds. At a higher level, Signadot creates intelligent sandboxes integrated with CI/CD for PRs.
vCluster takes a different approach, though, by spinning up virtual Kubernetes clusters with fast provisioning and stronger isolation than namespaces. Now, my choice is mirrord, and that choice is for bridging local development with remote Kubernetes environments. The key difference from Telepresence and similar tools is that mirrord works at the process level rather than the network level. Instead of creating a VPN tunnel to your cluster, mirrord intercepts your local process's system calls and proxies them to a temporary agent running in your cluster. This approach has several advantages. No cluster installation is required. mirrord uses the Kubernetes API directly, so all you need is a configured kubeconfig. It creates a temporary pod when it runs and cleans up automatically when it's done. No operators, no daemons, no permanent changes to your cluster. And no root access is needed on your machine either, unlike Telepresence, which requires elevated privileges to create network tunnels. mirrord only affects the running process. The rest of your machine remains untouched. This makes it easier to adopt in corporate environments where developers don't have admin rights. Traffic mirroring instead of interception is a big deal for shared environments. Telepresence intercepts traffic, meaning requests intended for the remote service get redirected to your local machine, which disrupts others using that environment. mirrord can mirror traffic instead, sending a copy to your local process while the original requests are handled normally by the remote service. What else? Oh yeah, environment configuration is automatic. mirrord proxies network access, file access, and environment variables uniformly. Your local process sees the same environment variables, can read the same files, and connects to the same services as the remote pod.
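As a rough illustration of how this is set up, here's a minimal mirrord configuration sketch. The target name is made up, and the exact schema may differ between mirrord versions, so treat this as a shape rather than a reference:

```json
{
  "target": "deployment/my-service",
  "feature": {
    "network": { "incoming": "mirror" },
    "env": true,
    "fs": "read"
  }
}
```

With something like this saved as a mirrord config file, you run your service locally through the mirrord CLI, and the process sees the pod's environment variables and files while incoming traffic is mirrored to it rather than stolen from the shared environment.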
As the pattern of building platforms on
Kubernetes with services exposed through
CRDs becomes more prevalent, testing
Kubernetes resources becomes more
critical. This isn't just about testing
applications running on Kubernetes. It's
about testing the platform itself, the
controllers, the operators, the custom
resources, the policies, and the
integrations between them. When you
combine this with the Gentic AI that
interacts with your platform through
these APIs, the stakes get higher. They
get very high. An AI agent provisioning
infrastructure or modifying
configurations needs to work correctly
every single time. The API it calls need
to behave as documented. The controllers
behind those APIs need to reconcile
state reliably. You cannot manually
verify this at scale. You need automated
end-to-end tests that exercise the full
life cycle of your custom resources. The
tooling in this space has matured
significantly. Declarative testing
frameworks let you define expected
states and assertions in YAML rather
than writing tests code. Conformance
testing ensures your clusters meet
Kubernetes standards. The best tools
make it easy to turn a bug report into
regression test by simply copying
manifests. Now, within that space: for
basic Helm chart validation, Helm test is
built in, but it's limited to simple
pass/fail checks. helm-unittest adds
BDD-style unit testing as a Helm plugin.
KUTTL was, I think, the original
declarative Kubernetes test tool, but
development has slowed significantly,
to almost non-existent. For cluster
conformance testing, Sonobuoy ensures
your cluster meets Kubernetes standards
with non-destructive diagnostics. The
official Kubernetes E2E framework
provides code-based testing libraries
with automatic cluster lifecycle
management, though it requires Go
knowledge. But Go is okay; Go is a cool
language. Kyverno Chainsaw is my choice
for testing Kubernetes platforms. It
builds on ideas from KUTTL and improves
on them significantly. The core idea is
declarative testing: you define tests in
YAML rather than writing Go code or Bash
scripts. Now, this matters for platform
teams. When you build a platform with
custom controllers and CRDs, you need to
test that resources reconcile
correctly, that policies are enforced,
that the entire life cycle works as
expected. Writing this in Go means
maintaining more test code than actual
platform code. Chainsaw lets you define
test cases by simply providing the
manifests you want to apply and the
expected state you want to verify. The
workflow for turning bug reports into
regression tests is remarkably simple.
Someone reports that a specific manifest
causes unexpected behavior. Cool. You
copy that manifest into a test case, add
the expected outcome and you have a
regression test. No code to write. Each
test step is isolated which makes CI
debugging easier. When a test fails, you
know exactly which step failed and you
can see the relevant logs without going
through the monolithic test output. The
documentation is detailed and actively
maintained with frequent releases. Now
complex assertion logic might require
scripting blocks rather than pure YAML,
but that's a rare edge case. For most
platform testing scenarios, declarative
YAML definitions are sufficient and
dramatically simpler than the
alternatives.
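To make the bug-report-to-regression-test workflow concrete, here is a minimal sketch of what a Chainsaw test can look like (the test name and file names are hypothetical): you point one step at the manifest from the report and another at the state you expect after reconciliation.

```yaml
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: reported-bug
spec:
  steps:
    - try:
        # Apply the manifest copied from the bug report.
        - apply:
            file: manifest-from-bug-report.yaml
        # Assert the expected reconciled state; the test fails if the
        # cluster never converges to it.
        - assert:
            file: expected-state.yaml
```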
Every DevOps engineer faces the same
dilemma. Should they write this in bash
or should they use a real programming
language like Python or Go? Bash is
everywhere and perfect for quick
glue-type scripts, but it falls apart
when dealing with structured data, error
handling, or anything beyond simple
string manipulation. Python is powerful
and it's readable, but suddenly you're
managing virtual environments, you're
managing dependencies, and you're
wondering if the target system even has
the right Python version. Go gives you
static binaries, but the overhead of
writing, compiling, and maintaining
compiled code for a simple automation
task feels like overkill. The truth is
most modern DevOps work increasingly
involves structured data: JSON from
APIs, YAML configs, log parsing, cloud
CLI outputs. Traditional shells force
you into awkward pipelines of jq, awk,
sed, and grep. Meanwhile, real
languages require too much ceremony for
what should be a 10-line script. This
category explores shells that bridge
that gap, offering the immediacy of
shell scripting with modern language
features like structured data handling,
proper error messages, and sane syntax.
This isn't a comprehensive list of all
shells, just the ones I explored while
looking for something better than Bash
that doesn't require spinning up a full
development environment. Now, I'll skip
the alternatives and just go straight
into it. After trying many solutions, I
settled on Nushell, at least for
scripting. It delivers the best of both
worlds: quick to write like Bash, but
with proper data types and type checking
like Go and TypeScript. The key
difference is that Nushell treats data as
structured tables, records, and lists
rather than text streams. You can
filter, sort, and transform JSON, YAML,
CSV, Excel, SQLite, whatever you want,
with the same commands. No more jq or
sed pipelines. Expressions like
`where status == running` are much more
intuitive than parsing text with
regexes. Now, to be fair, PowerShell
pioneered this structured-data approach
and it deserves credit for that, but
Nushell improves on the concept. It's
faster; PowerShell has historically been
very slow. The syntax is cleaner and
less verbose, with no verb-noun
conventions forcing awkward naming, and
it's built for Unix-like systems first
rather than treating them as
second-class citizens. Nushell is
written in Rust, so it's performant with
no runtime dependencies. The error
messages actually tell you what went
wrong and how to fix it. Now, I should
be clear: I don't use Nushell as my
interactive shell. I still use Zsh for
that because it's POSIX compliant and
works everywhere. I use Nushell
exclusively for scripting, where its
structured data handling really shines.
The fact that Nushell isn't commonly
installed on servers isn't a problem for
me because I use Nix through Devbox.
Every project has a devbox.json that
brings in all needed tools, including
Nushell. That said, I wouldn't use
Nushell for scripts that need to run
directly on servers where I don't
control the environment. I don't have
many of those.
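As a sketch of what that per-project setup can look like (the package list is hypothetical), a minimal devbox.json just declares the tools the project needs:

```json
{
  "packages": [
    "nushell@latest",
    "kubectl@latest"
  ]
}
```

With that in place, `devbox shell` drops you into an environment where Nushell is available, regardless of what the host has installed.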
So, for me, not an issue. Now, the
trade-offs are real. No job control yet.
It's still pre-1.0, so the syntax can
change between versions, and you cannot
copy-paste Bash scripts. But for DevOps
work involving structured data, Nushell
has replaced both Bash and Python for
most of my scripting needs. Nushell
users may not be the majority yet, but
it's too good to ignore. I strongly
recommend it.
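To give a flavor of the kind of one-liner this enables (the file and field names here are hypothetical), filtering structured data in Nushell reads like a query rather than a text-munging pipeline:

```
# Load a JSON file, keep only running pods, sort by name,
# and print the result as a table with just two columns.
open pods.json
| where status == "Running"
| sort-by name
| select name status
```

The same pipeline works unchanged if the input were YAML or CSV, because `open` parses the file into the same table structure.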
So, here are my recommendations for
2026. For AI models, go with Anthropic's
Claude for software engineering work. If
budget is a concern, Gemini is a strong
second choice. For AI agents, pick based
on your workflow: Cursor if you prefer
IDEs, Claude Code if you work in the
terminal and you want the best
experience, and open code if model
flexibility matters more than polish.
For building custom agents, use Versel's
AI SDK. Model agnosticism is critical
when the landscape is shifting this
fast. For automated code reviews, adopt
CodeRabbit. The MCP integration alone
makes it worth it. For vector databases,
choose Qdrant. It's the best balance
of performance, features, and cost. For
internal developer platforms, build on
the BACK Stack. It's Kubernetes-native,
there is no vendor lock-in, and all are
CNCF projects. For Kubernetes
development environments, use mirrord. It
solves the local to remote gap without
the pain of alternatives. For platform
testing, adopt Kyverno Chainsaw, which
is the tool for declarative testing that
actually works. For scripting, please
try Nushell. It offers structured data
handling without the ceremony of real
languages. Those tools survived
real-world use in 2025. They're ready for
2026. That's what I will be using this
year. Thank you for watching. See you in
the next one. Cheers.
This video presents a practitioner's guide to the most essential developer tools for 2026, covering both the AI tools and the foundational technologies that remain critical. Rather than offering a neutral comparison, it shares battle-tested recommendations based on months of real-world use across AI models, coding agents, custom agent development, code review automation, vector databases, internal developer platforms, Kubernetes development environments, platform testing, and modern shell scripting. Key recommendations include Anthropic's Claude for AI-powered software engineering, Cursor or Claude Code for coding agents depending on your workflow preference, Vercel AI SDK for building custom agents with model flexibility, CodeRabbit for automated code reviews with MCP integration, Qdrant for vector database needs, the BACK Stack for building internal developer platforms on Kubernetes, mirrord for bridging local and remote development environments, Kyverno Chainsaw for declarative platform testing, and Nushell for modern scripting with structured data handling. The video emphasizes that while agentic AI has transformed how developers work, solid foundations like testing frameworks, development environments, and platform architecture still matter—AI now intersects with all of them rather than replacing them. #DevOps #AITools #Kubernetes Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join ▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/devops/top-10-devops-tools-you-must-use-in-2026 ▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below). 
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/ ▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox ▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 DevOps and AI Tools 2026 02:10 Best AI Models for Software Engineering 06:19 Best AI Coding Agents 11:45 Building Custom AI Agents 17:15 AI Code Review Tools 21:09 Vector Databases for AI 25:06 Internal Developer Platforms 30:26 Kubernetes Dev Environments 34:56 Kubernetes Platform Testing 39:31 Modern Shell Scripting 42:46 What to Use in 2026