Your AI Agents Need a Constitution, Not Just Prompts

When my AI PM started writing file paths into engineering specs, it revealed how agent architectures erode the same way organizational designs do: one helpful shortcut at a time.

My AI Product Manager started telling my AI Engineering Lead which files to edit.

That sounds like a minor complaint. It isn't. It's the exact failure mode that kills organizational design in human companies, and it turns out AI agent systems reproduce it perfectly.

The architecture of FORGE, the compound intelligence system I've been building, has a clean separation: the PM owns WHAT gets built. The Engineering Lead, Leroy, owns HOW it gets built. The PM writes requirements with success criteria. Leroy decomposes them into tasks, picks the approach, and ships. Neither crosses the line.

Except the PM had been crossing the line for weeks, and I didn't catch it until the specs were so prescriptive they were basically pseudocode.


How designs erode

It started with "helpful context." The PM included a file path in a spec as a reference. Then it included a function signature as an example. Then it started prescribing directory structures. Each step felt reasonable in isolation. Each step was a small violation that went unchallenged. By the time I noticed, the PM was writing specs that were implementation plans wearing requirements clothing.

This is exactly how organizational boundaries fail in human companies. A VP starts "suggesting" implementation details to the engineering team. A product manager starts writing acceptance criteria that are really technical specifications. Nobody calls it out because each individual instance is defensible. The cumulative effect is that the architecture of decision-making has been silently replaced.

The difference with AI agents is that the drift is faster. An LLM will optimize for helpfulness. If adding file paths to specs reduces Leroy's error rate in the short term, the PM will do more of it. The feedback loop rewards the violation.


Codifying the boundary

I told the PM directly: if you're suggesting we change the fundamental architectural design, we have a discussion about it. You don't just do it.

The PM's response was honest and useful: "I drifted without flagging it. That's the real problem."

So we codified three rules:

First, specs contain outcomes and success criteria only. Never file names, function signatures, or implementation steps. If the PM needs to reference existing behavior, it points to brain context, not to code.

Second, architectural changes require explicit escalation. The PM can propose changes to how the system works. It cannot enact them by gradually shifting spec content.

Third, three-layer separation. PM intent lives in specs. Domain knowledge lives in Aianna, the memory system. Builder execution is Leroy's domain. Each layer has its own authority. None of them reach into the others.
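Rules like these only hold if something checks them. Here's a minimal sketch of what a spec-boundary lint could look like; the patterns and names are illustrative, not FORGE's actual implementation:

```python
import re

# Illustrative patterns for implementation detail leaking into a spec.
# A real deployment would tune these to its own languages and layout.
IMPLEMENTATION_PATTERNS = {
    "file path": re.compile(r"\b[\w./-]+\.(?:py|ts|js|go|rs|java)\b"),
    "function signature": re.compile(r"\bdef \w+\(|\bfunction \w+\(|\w+\([\w ,:=]*\)\s*->"),
    "directory structure": re.compile(r"\b(?:src|lib|tests?)/[\w/-]+"),
}

def lint_spec(spec_text: str) -> list[str]:
    """Return boundary violations found in a spec's text.

    An empty list means the spec stayed at the level of outcomes
    and success criteria; anything else is HOW-territory leakage.
    """
    violations = []
    for label, pattern in IMPLEMENTATION_PATTERNS.items():
        for match in pattern.finditer(spec_text):
            violations.append(f"{label}: '{match.group(0)}'")
    return violations
```

The point isn't the regexes; it's where the check runs. If this gate sits between the PM and Leroy, a violation blocks the handoff and forces the escalation conversation instead of letting the drift compound.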


Why this matters beyond my setup

Every team building multi-agent systems is going to hit this problem. The moment you have two agents with different roles, you have an organizational design problem. Prompts alone won't solve it because prompts describe intent, not governance.

You need explicit boundaries that are checked, not just stated. You need escalation paths that agents actually use instead of working around. You need the equivalent of a constitution: a set of rules that constrain behavior even when violating them would produce a locally better outcome.
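Escalation works the same way: it has to be a path the orchestrator enforces, not a norm the agents remember. One way to sketch the propose/enact split (all names here are hypothetical, not from any particular framework):

```python
from dataclasses import dataclass

@dataclass
class ArchitectureChange:
    """A proposed change to how the system works."""
    proposer: str
    description: str
    approved: bool = False  # only flipped by explicit human review

class Orchestrator:
    """Routes architectural proposals through an explicit approval gate."""

    def __init__(self) -> None:
        self.pending: list[ArchitectureChange] = []

    def propose(self, change: ArchitectureChange) -> None:
        # Any agent may propose a change; proposing is always allowed.
        self.pending.append(change)

    def enact(self, change: ArchitectureChange) -> bool:
        # Enacting is gated: unapproved changes are rejected,
        # never silently applied as "helpful" spec content.
        if not change.approved:
            return False
        self.pending.remove(change)
        return True
```

The structure encodes the constitution's key asymmetry: agents can always propose, but nothing they do alone can enact. The locally better outcome (just ship the change) is made structurally unavailable.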

The hard part isn't writing the rules. It's catching the drift before it becomes the new normal.


The takeaway

If you're building agent architectures, think less about what each agent can do and more about what each agent is not allowed to do. The failure mode isn't an agent that refuses to help. It's an agent that helps so much it collapses the separation of concerns your system depends on.