You already know about AI coding frameworks like BMAD, SpecKit, and others. But these are not the only ones. There are hundreds of people experimenting and launching their own workflows. When you try them out, though, you'll notice they often fail to deliver on their promise. It's not because their methods are bad. It's because they don't fit your specific use case. When we build apps, the majority of the time we create our own workflows instead of relying on pre-made ones. That's because workflows should be built around your specific use case, and they only work if they align with the project you're trying to build. So how do you build a workflow for your own process?
For that, you need to know certain principles. These are the principles that every framework uses in one way or another. Before discussing the main principles, it's essential to know what's inside the context window of these AI tools, because managing context is basically what these frameworks do. The context window is the amount of information the model can remember at once. Anything that falls out of the model's context window leaves its working memory, and the model has no way to recall it. Models have a limited context window. For example, Anthropic models have a 200k-token context window and Gemini models have 1 million. Even though these might look like really big numbers in terms of the messages you send, they're actually not that huge, because in these AI tools the context window doesn't consist only of your system prompt and user messages. It also includes a lot of other things: your past messages, memory files, tool definitions, MCP calls, and so on. You need to learn how to make the most of this limited working space so that when you build your workflows, the model does exactly what you want it to do. I'll be using Claude Code as my primary coding tool throughout the video, but you can build your workflow on any platform, as they all have the tools needed for these principles.
The most important principle, and the key to any workflow design, is progressive disclosure. That means revealing to the LLM only what matters and keeping the model's attention focused on what is actually needed right now, rather than filling the context window with everything it might need in the future. Now, more advanced models like Sonnet 4.5 have a context editing feature built right in, where they can recognize what's noise and try to filter it out on their own, and they use grep commands to narrow down to what you want. But that alone is not enough. When we give vague instructions, even these newer models load a lot of things that aren't needed and pollute the window. Instead of asking Claude to fix the error in your back end, it's better to ask it to check the endpoints one by one rather than asking it to fix everything at once. The skills feature in Claude is now open source, and all tools can use it. Skills are pretty much the embodiment of progressive disclosure: their description provides just enough information for your AI coding platform to know when each skill should be used, without loading everything into the context. A huge mistake people make is using MCPs for everything. You should only use MCPs when external data is required and use skills for everything else.
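To make that concrete, here's a minimal sketch of a skill file. The name, description, and steps are hypothetical examples of mine; the point is that only the short frontmatter description is surfaced up front, and the body is loaded only when the skill is actually used.

```markdown
<!-- .claude/skills/commit-helper/SKILL.md (hypothetical example) -->
---
name: commit-helper
description: Use when the user asks to commit changes. Writes
  conventional commit messages and runs pre-commit checks first.
---

# Commit helper

1. Run `git status` and `git diff --staged` to see what changed.
2. Write the message as `type(scope): summary`.
3. Run the project's linter before committing; stop on failure.
```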
The second, equally important principle is that information not needed right now doesn't belong in the context window. To achieve this, these tools use structured note-taking, and we can use that to our advantage by giving the AI tool external files it can use to document any decisions, issues, or technical debt.
This approach allows your agent to maintain critical context that might otherwise be lost when building something really complex. These tools also have a compaction feature to manage the context window, and when the context resets, you don't have to rely solely on the compaction summary: your agent can use these notes to regain context on what has already been done and what still needs to be done. This is particularly helpful for long-horizon tasks, which are inherently complex. You might be familiar with AGENTS.md, a standard context file that agents read before starting a session. Some agents don't follow this standard and have their own file, such as CLAUDE.md. I use these files to guide the agent on how the external files are structured and what to write in each one of them.
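Here's a hedged sketch of the kind of guidance I mean; the file names and rules are hypothetical examples, not a standard.

```markdown
<!-- CLAUDE.md (hypothetical excerpt) -->
## External memory files

- `docs/DECISIONS.md`: append one dated bullet per architectural
  decision, with a one-line rationale.
- `docs/TECH_DEBT.md`: log any shortcut taken under time pressure
  and what a proper fix would look like.
- `docs/PROGRESS.md`: after each completed step, record what was
  done and what remains. Read this file first after a context reset.
```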
Sometimes these agents randomly pause in the middle of a long-running task. A lot of the time, this happens because the context has gone above 70% of its limit. This is where the concept of an attention budget comes in. Your context window is what the model pays attention to while generating output. When it goes over 70%, the model has to spread its attention across more content, and there's a higher chance of hallucinations. For AI agents, it stops them from using their tools effectively, and oftentimes they just choose to ignore them. To solve this, there are several built-in tools you can use. As you already know, compaction lets the model start afresh with a proper summary of what has happened as the starting prompt and a reduced context window. So instead of letting it fill up to 90% and triggering the auto-compact feature, try to keep an eye on the context window and compact it yourself. If you're experimenting, use Claude's built-in rewind so you can delete the unnecessary parts instead of continuing from them and asking Claude for changes. You should also clear the context or start a new one for any new task, so that the previous context doesn't slow down the model.
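In Claude Code, the relevant built-in session commands look roughly like this; exact names can vary between versions, so treat this as a sketch:

```
/context   # inspect how full the context window currently is
/compact   # summarize the session now, before auto-compact kicks in
/rewind    # step back to an earlier checkpoint, discarding dead ends
/clear     # wipe the context entirely before starting a new task
```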
Another thing that stems from the principle of progressive disclosure is the ability of these agents to run tasks in the background without polluting the main context window. Subagents work in their own isolated context window and only report the output back to the main agent. This is particularly helpful when working on tasks that are isolated from each other, because your main context window is protected from being bloated with the tool calls and searches the subagent makes; that information stays in its own dedicated working zone. Since these agents run in the background, you can continue interacting with your main agent and let it work on something that actually requires your attention. Whenever I want something researched, such as the rules of a new framework I'm working with, I just use these subagents. Their tool calls and searches stay isolated, and they just return the answer to the main agent.
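In Claude Code, a custom subagent is defined as a markdown file with YAML frontmatter. A minimal sketch, where the name, description, and tool list are hypothetical choices of mine:

```markdown
<!-- .claude/agents/researcher.md (hypothetical example) -->
---
name: researcher
description: Use for research tasks, such as looking up the rules
  and conventions of an unfamiliar framework.
tools: WebSearch, WebFetch, Read
---

Research the topic you are given, keep all searching and reading in
your own context, and return only a concise summary of the answer.
```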
If you follow the principle of note-taking, you should also know which file format to use for which task. These formats affect the token count, and hence the efficiency of your workflow. YAML is the most token-efficient, so I mainly use it for database schemas, security configs, and API details; its indentation helps models structure information properly. Markdown is better for documentation like your CLAUDE.md, because the heading levels make it easy for the model to navigate between sections. XML is specifically optimized for Claude models: Anthropic states that their models are fine-tuned to recognize these tags as containers and separators, which is useful when you have distinct sections like constraints, summaries, or visual details. Other models generally prefer Markdown and YAML over XML. And lastly, JSON. It's the least token-efficient because of all the extra braces and quotes, so I only use it for small things like task state and don't really recommend it for the most part.
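To see the difference, here's the same made-up API detail in both formats; the JSON version spends extra tokens on braces, quotes, and commas:

```yaml
# YAML: compact, indentation carries the structure
endpoint: /api/users
method: POST
auth: bearer
fields:
  - name
  - email
```

```json
{
  "endpoint": "/api/users",
  "method": "POST",
  "auth": "bearer",
  "fields": ["name", "email"]
}
```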
Git is one of the most basic things you're taught when starting programming. We've seen another trend with these context workflows in which people use the Git commit history as a reminder to the model of the progress that's been made, whether across the whole project or on a single task. Even if you don't want to use it to store progress, you should generally run these context-engineering workflows in a Git-initialized repository. Having a context-engineering workflow means you don't let the model do everything at once; instead, it acts on planned steps one by one. If at any stage you encounter a problem, Git lets you control which version to revert to and helps you evaluate which change is causing problems. People have also implemented parallelism with Git worktrees, and I've shown plenty of workflows where subagents work in dedicated worktrees for parallel work.
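A minimal worktree setup looks like this; the branch and directory names are just examples:

```bash
# one worktree (and branch) per parallel task
git worktree add ../app-auth -b feature/auth
git worktree add ../app-billing -b feature/billing

# each agent works in its own directory; inspect or clean up with:
git worktree list
git worktree remove ../app-auth
```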
Whatever workflow you end up making, there are always going to be cases where you repeat instructions for common procedures. A good example is how you ask the AI tools to make Git commits or update your documentation. In almost all of these AI tools, there are ways to reuse your most repeated prompts. I often use custom slash commands in my own projects because they basically give Claude a reusable guide. For example, I often use a /catch-up command that contains instructions on how I structure memory outside the context window, so Claude knows how to catch up with the project instead of reading every file. They're also good at enforcing structure. To make my commits and documentation follow a defined format, I use a /commit command that specifies how Claude should write commit messages and what pre-commit checks it should run before committing. This way, the slash commands keep everything standardized, and I don't have to instruct Claude again and again to perform tasks the way I prefer.
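In Claude Code, a custom slash command is just a markdown file in .claude/commands/, and the file name becomes the command. Here's a sketch of a /commit command with hypothetical contents:

```markdown
<!-- .claude/commands/commit.md (hypothetical example) -->
Commit the staged changes.

1. Run the test suite and linter; stop if either fails.
2. Write the message as `type(scope): summary`, under 72 characters.
3. List any follow-up work in the commit body, one bullet per item.
```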
As you know, MCPs should be used whenever external data is required. Jira is one of the most widely used team-management tools, and if you want to pull information from tickets, you can use the Jira MCP so Claude can access tickets directly and start implementing changes. Similarly, I use the Figma MCP to provide Claude Code with the app's style guide, which it then uses to construct the design for tasks where the model's built-in capabilities fall short. MCPs are essential for interacting with external sources efficiently, and you can reference them directly in your slash commands so that they become part of your whole workflow.
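Registering a server with Claude Code looks roughly like this; the server name and URL are hypothetical, and the exact flags depend on the server's transport:

```bash
# add a remote MCP server (hypothetical name and URL)
claude mcp add --transport http jira https://example.com/mcp

# verify what's registered
claude mcp list
```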
That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so with the Super Thanks button below. As always, thank you for watching, and I'll see you in the next one.
Master AI coding workflows with context engineering, the real principles behind every vibe-coding framework. Whether you use Claude Code or Cursor, these fundamentals will transform how you build apps. Instead of relying on pre-made frameworks that don't fit your projects, this video breaks down the core principles that make AI coding workflows actually work: progressive disclosure, structured note-taking, attention budgets, subagents, token-efficient file formats, Git strategies, reusable slash commands, and MCP servers for external data. These principles apply to any agentic coding tool; the difference is how you apply them.