Loading video player...
Why all of a sudden every model creator
is trying to create their own agentic
coding system within a CLI or terminal?
Is this the end of idees? And why it's
going to be very critical for openw
weight models? We're going to explore
all of this in this video. Now, like
most of you, uh I started my programming
career within an IDE. uh but over the
last couple of years we have seen this
trend of AI assisted coding ids which
started with cursor which is a clone of
VS code then we saw wind surf taii and a
whole bunch of more but earlier this
year anthropic did something very
interesting they created clot code which
is a coding agent within your terminal
now u don't get me wrong coding inside
terminal is nothing new uh people have
been doing this for ages especially the
more experienced folks have been using
terminal. So this was kind of a way of
entropic to reach their customers where
they are. We have seen companies like
OpenAI create tools like Codeex have
terminal based component as well as web-
based component. Now even cloud
introduce their own web uh based cloud
instance. Right? So the idea is to just
meet the developers or people who are
using these systems where they are. But
we are also seeing another very
interesting trend and that is for open
weight models. Now uh there are a number
of open-source coding agents. Uh client
comes to mind. Kilo code is another one.
These are really awesome coding tools.
But if you look at model creators like
Kimmy K2 which is from Moonshot or
Quinn, they are now trying to build
their own coding system within terminals
or CLI. So for example you have Quen CLI
or Moonshot released their own Kimmy
CLI. Now the question is why all of a
sudden even the open weight models
creators are trying their own uh
terminals instead of supporting these
IDE based systems or even uh a generic
openweight terminal or CLI based system.
Well, I think it comes down to how these
models are being trained and why the
existing systems don't really show the
full capabilities. So, let me give you a
quick example. If you look at something
like cloud code, you can actually use
some of the latest openweight models
include including Kimik2, Mini Max, M2
or even the Quen models. There are ways
to do it. However,
um clot code which is one of probably
the best implementation of an agentic
coding system out there. This is
specifically fine-tuned or designed for
clot models and the latest version is
sonnet 4.5. Now some of these
capabilities cannot be used by any open
weight model. So for example when claude
uh sonnet 4.5 was initially released
cognition dropped this blog post
rebuilding devon for claude sonnet 4.5
lessons and challenges and the basic
summary of this article is that they had
to go and rebuild Devon completely to
support sonnet 4.5 now the main reason
is that this model is very different
than anything that we have seen before
especially it is context aware of its
own context window. So building a system
like this uh is going to be very
different than um something like an open
weight model. Right now um more and more
of these open weight models are becoming
agentic in nature. A really good example
is the uh M2 model or Kimik K2 or even
the latest Deepseek R1 model. However,
the way these models are trained are
very different from each other. The way
they do uh agentic tool calling is very
different. So a generic system is not
going to be optimal. And that is the
reason that we are seeing this trend uh
that the open model uh open weight model
creators are trying to build their own
CLIs tool to actually show their real
capabilities. So just to show you an
example here is an output from Miniax M2
which is the latest uh arguably the best
coding agent that is open weight. Now it
does something very interesting during
its reasoning or thought process. There
are interled function calls or tool
calls. So the way it works is it's going
to think for a few seconds and it's able
to use um some tools within its thinking
budget or thinking traces and then it
can continue thinking but not every
coding agent out there supports this. So
for example open router is a very widely
used systems for trying openweight
models because you have a number of
different uh third party providers. Now
they implemented
uh this preserving reading blocks within
open router uh by using the cloud API
specifications.
But turns out for M2 specifically this
is broken. So here is Skyler who is part
of the um Minia Max team. He's actually
the head of engineering. He said uh we
strongly recommend you to pass the
thinking back manually by uh this
revisiting details open router uh
provider right so if you see the open
router was not able to support it now if
you are using a subsequent openweight uh
or open source system but you're calling
miniax m2 through open router the system
is going to be broken so apart from um
these open-source uh agent coding system
not being well optimized for every model
that is going to come about uh there is
another issue as well. So if you go to
uh open router for every open weight
model out there you're going to see
there are a number of different
providers who are uh hosting this model.
If you go to M2 max uh you see there are
a number of different providers hosting
it in different configurations. Now all
of them are made equal. The uh moonshot
commun 2 team actually ran anal analysis
where they compared the uh tool call
capabilities of different providers on
open router against their own uh API and
they found that in some cases the
differences are drastic right so you
need to be very careful of which uh
model provider you select not all of
them are made equal okay so coming back
to the discussion of uh CLI based
agentic systems. I personally feel like
CLI based agentic systems are uh really
great for wipe coding for developers.
For non-developers, you have wipe coding
tools like lovable replet. But for
developers, the CLI based coding agents
are extremely useful. Now with wipe
coding comes concerns regarding the
quality of the code as well. And one of
the biggest one is security of the
generated code. So you want to use tools
to evaluate the security of your
codebase that is generated by AI agents
and this is where the sponsor of today's
video comes into play. So this video is
sponsored by sneak who are building a
number of different tools that lets you
evaluate the security of your AI
generated code. Now they're doing a
webinar securing wipe coding addressing
the security challenges of AI generated
code um on November 20th. It's open to
public and I highly recommend to attend
this if you are interested in learning
more about how to secure your AI
generated code. Also, if you are part of
the international information system
security certification consortium and
you sign up and attend this session with
your member ID, you're going to get one
continuing professional education
credit. Details are in the video
description. Now, back to the video. At
the end, I want to revisit IDs versus
CLI. I think you can make case for both
of these but we're going to see more and
more use for specifically CLI based
systems. Perfect example of this is
cloud code which is one of the most
powerful agentic coding system out
there. But what it makes it powerful is
access to some extremely simple yet very
effective tools like bash and
we're going to be seeing more and more
systems which uses these tools without
being interfered by the bloat that is
introduced by IDEAS and now since these
things are also available on the web I
think you can use the same setup to
reuse them now in relation to open
weight models my recommendation is to
always use first party API where
possible this is where you're going to
get the best configuration if that is
not possible possible. Try to host it
yourself if your resources allows you.
If that is not possible, then make sure
you test multiple different API
providers. Don't just go for the
cheapest or the fastest one. Second, if
an open weight model provider has their
own agentic tools such as CLI or
terminal based tool, then go for that.
that will give you the best possible
performance out of these systems rather
than relying on a third party
implementation. Especially for
openweight model models, I have seen
that the first party integrations and
tools are usually a lot more powerful
than if you use it with third parties.
So, for example, watch my video on Mini
Max M2 model if you want to see that in
action. Anyways, I hope you found this
video useful.
Interested in securing the AI generated code? checkout Snyk: https://snyk.plug.dev/a2h64Ve Learn why model makers are shipping terminal-first coding agents—from Claude Code to Qwen/Kimi/Minimax M2—and what that means for IDEs vs CLI. We compare tool-calling, reasoning traces, and provider differences, with practical tips on using first-party APIs for best results. Website: https://engineerprompt.ai/ My voice to text system: whryte.com RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag LINKS in the video: https://cognition.ai/blog/devin-sonnet-4-5-lessons-and-challenges#the-model-is-aware-of-its-context-window https://github.com/MoonshotAI/K2-Vendor-Verifier https://github.com/MoonshotAI/kimi-cli https://x.com/Kimi_Moonshot/status/1976926483319763130/photo/1 https://x.com/SkylerMiao7/status/1984079999981514933 https://x.com/Kimi_Moonshot/status/1984207737673359441 Let's Connect: 🦾 Discord: https://discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: https://ko-fi.com/promptengineering |🔴 Patreon: https://www.patreon.com/PromptEngineering 💼Consulting: https://calendly.com/engineerprompt/consulting-call 📧 Business Contact: engineerprompt@gmail.com Become Member: http://tinyurl.com/y5h28s6h 💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off). Signup for Newsletter, localgpt: https://tally.so/r/3y9bb0 TIMESTAMP 00:00 The Rise of CLI Agents 01:27 Open models are tricky 06:15 Provider matters 07:03 The Future of CLI Agent Systems 07:43 Sponsorship: Securing AI-Generated Code 08:20 IDEs vs. CLI Systems