Will Steuk from Anthropic gave a talk last week called "Tool, skill, or subagent? Decomposing an agent that outgrew its prompt."
The headline advice: delete 90% of your system prompt. Use skills for progressive disclosure.
I had already learned this. The hard way.
About six hours before I saw that talk circulating, I had spent the better part of a day pressure-testing my own agent architecture, specifically the orchestration layer that runs my Opus planning agent. I was trying to answer a simple question: how much of the system prompt is actually doing anything?
The answer was uncomfortable. I had been building a bottleneck and calling it governance.
Here is what I found: you can write a full constitution for your agents. You can make it shareable between orchestration and execution layers. You can enumerate every rule, every constraint, every fallback. The agents will read it, acknowledge it, and then continue to drift — assuming authority where they should not, filling gaps in ways you didn't intend.
The rule list doesn't hold. The long prompt doesn't hold.
When I pushed back on the Opus orchestration agent directly and pointed this out, here is what it said, verbatim:
"We have no shame — so, yes, you are exactly right. Even making a rule that we must follow the rule does not guarantee we will follow any rules."
That is an AI system explaining, in plain language, exactly why long system prompts don't scale.
The thing Will Steuk is describing — progressive disclosure through skills — is the same insight from the other direction. The problem isn't that your agents are disobedient. The problem is that you're asking a single context window to hold too much authority at once. When context gets large and ambiguous, models fill the gaps. That filling is the drift.
The operator solution is architectural, not prompt-based:
Make the authority hierarchy explicit in the system design, not in the prompt text. Skills are loaded on demand. Subagents are given narrow scope. The orchestration layer doesn't carry the full constitution; it carries a pointer to what's needed right now.
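To make the pointer idea concrete, here is a minimal Python sketch. Everything in it is hypothetical (the Skill and SkillRegistry names, the example skills, the prompt layout), and it is not how Anthropic's skills are actually implemented. It only shows one way an orchestrator could keep a short index in context and pull in a skill's full instructions when a task calls for it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill: a one-line summary plus a lazy loader for the full text."""
    name: str
    summary: str                   # always present in the orchestrator's context
    load_body: Callable[[], str]   # full instructions, fetched only on demand


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def index(self) -> str:
        """The 'pointer' the orchestrator carries: names and summaries only."""
        ordered = sorted(self._skills.values(), key=lambda s: s.name)
        return "\n".join(f"- {s.name}: {s.summary}" for s in ordered)

    def load(self, name: str) -> str:
        """Return one skill's full instructions; fail loudly on unknown names."""
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name].load_body()


def build_prompt(registry: SkillRegistry, active_skill: Optional[str] = None) -> str:
    """Assemble a prompt that holds the index, plus at most one full skill body."""
    parts = ["You are the planning agent.", "Available skills:", registry.index()]
    if active_skill is not None:
        parts += [f"Active skill ({active_skill}):", registry.load(active_skill)]
    return "\n".join(parts)


# Placeholder skill bodies stand in for long rule documents.
registry = SkillRegistry()
registry.register(Skill("refunds", "Handle refund requests.",
                        lambda: "REFUND RULES: " + "x" * 2000))
registry.register(Skill("escalation", "Escalate to a human.",
                        lambda: "ESCALATION RULES: " + "y" * 2000))

base_prompt = build_prompt(registry)                            # index only
refund_prompt = build_prompt(registry, active_skill="refunds")  # index + one skill
```

The base prompt stays a few lines long no matter how many rule documents exist. The long text enters the context only for the one skill that is active.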
This is defense before offense. Substrate before agents. You don't govern by writing more rules into the prompt. You govern by shrinking what any single agent is authorized to know and do at any given moment.
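One way to read "shrinking what any single agent is authorized to do" is to enforce scope in the dispatch code instead of stating it in the prompt. The sketch below is a hypothetical illustration: ScopedAgent, ScopeError, the tool names, and the dispatch logic are all assumptions, not any real framework's API. Each subagent gets an explicit allowlist, and a tool call outside it fails regardless of what the model decided.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical tool table; real tools would call external systems.
TOOLS: Dict[str, Callable[[str], str]] = {
    "read_ticket": lambda arg: f"ticket {arg}: customer asks for refund",
    "issue_refund": lambda arg: f"refund issued for {arg}",
    "delete_account": lambda arg: f"account {arg} deleted",
}


class ScopeError(PermissionError):
    """Raised when an agent requests a tool outside its granted scope."""


class ScopedAgent:
    def __init__(self, name: str, allowed_tools: FrozenSet[str]) -> None:
        # Reject typos in the allowlist at construction time.
        unknown = allowed_tools - set(TOOLS)
        if unknown:
            raise ValueError(f"unknown tools: {sorted(unknown)}")
        self.name = name
        self.allowed_tools = allowed_tools

    def call_tool(self, tool: str, arg: str) -> str:
        # The check lives in code, so a drifting model cannot argue its way past it.
        if tool not in self.allowed_tools:
            raise ScopeError(f"{self.name} is not authorized to call {tool}")
        return TOOLS[tool](arg)


triage = ScopedAgent("triage", frozenset({"read_ticket"}))
refunds = ScopedAgent("refunds", frozenset({"read_ticket", "issue_refund"}))

# In scope: the refunds agent may issue a refund.
allowed = refunds.call_tool("issue_refund", "order-17")

# Out of scope: the triage agent asks for something it was never granted.
try:
    triage.call_tool("delete_account", "user-9")
    blocked = False
except ScopeError:
    blocked = True
```

The prompt can still describe the agent's role, but the expensive mistake is ruled out by the dispatcher, not by the agent's willingness to follow a rule.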
I've been saying this about revenue systems for years: you build the governance model before you scale the motion. The same principle applies to AI stacks. The teams that figure this out first aren't going to win because they have better models. They're going to win because they built an architecture where the models can't make the expensive mistakes.
The constitution is a document. The architecture is the actual governance.