The Measurable Cost of Ritualistic Abstraction: 76%, 58%, 7-13, Why Hexagonal, DDD and DI Break Humans and AI Agents

There is a silent faith in the software industry that more layers is always better. Hexagonal, Clean Architecture, tactical DDD, strategic DDD, Saga, Event Sourcing, CQRS, the catalog of ritualistic architecture is sold as proof of seniority. To disagree is to confess immaturity. But three recent numbers, 76%, 58% and 7-13, are starting to put a price on that faith. And the price is not abstract. It is measured in lost accuracy, wasted context and burned human time to make an agent understand what should be trivial.

This post dives into what these numbers mean, why they are not accidental, and why the LLM's bias toward heavy abstractions is structural, inherited straight from the training corpus. It is not an anti-architecture recipe. It is a call to stop confusing ceremony with engineering, especially now that part of your team is an AI agent that pays, literally, in tokens and accuracy points, for every layer you added "just in case".

76%, The accuracy that evaporates when the dependency hides in DI/DIP

The first number comes from benchmarks of code agents trying to trace dependencies in projects that apply Dependency Injection with the Dependency Inversion Principle at every boundary. When the agent needs to answer "what does PaymentService actually call in production?", the correct answer requires following an interface, a binding in a container, a factory, and finally an implementation that may be in another module. The agent gets this right 76% of the time. Compared with 95%+ in direct code, it is a brutal drop, and it is a silent drop, because the agent keeps producing code that looks correct but that anchors to the wrong implementation.

The problem is not DI. It is ritualistic DI: applied as a default, without there being a second real implementer, without the test benefiting, without the boundary justifying the indirection. The practical rule is direct: if there is exactly one implementation, and it will keep being only one, the interface is decoration. For the human reading the code for the first time, it is a translation layer. For the agent, it is a hole where the truth disappears.

What to do in practice

Invert the assumption: the class goes concrete until a second real use case appears. Refactoring to an interface later costs minutes.
Keep the wiring visible: a magic container that does binding by convention is poison for any reader who does not have the framework in their head.
Note the "why" of the interface: if it exists, write down what the second implementer is (a test? an external adapter? a feature flag?). Without that note, the interface is debt disguised as a principle.

58%, When the agent prefers `glob+read` to the navigation tool

Second number: in 58% of tasks, modern agents ignore the code navigation tools provided (LSP, indexed symbols, codebase MCP) and fall back on the primitive heuristic of glob + read. The explanation is not model laziness. It is that ritualistic architecture fragments the context to the point where semantic indexing delivers a worse signal than a scan by name. When a feature is spread across domain/, application/, infrastructure/, interfaces/ and shared-kernel/, the result of a find symbol is a list of wrappers around wrappers. glob "**/Payment*" + reading the 4 files that show up is, empirically, faster and more correct.

This has a harsh implication for whoever designs a codebase: the best navigation tool is a file structure that makes navigation unnecessary. Co-location beats indexing. One directory per feature, with everything about the feature inside it, makes cd feature/ && ls more informative than any symbol tree. It is not primitivism, it is the recognition that the file system is a data structure that LLMs already master deeply, while navigation plugins depend on external tooling that can fail, be out of date, or simply not exist in the agent's environment.

7-13, The context budget your architecture is burning

Third number: in codebases with rigid separation into domain/app/infra, the agent needs to open between 7 and 13 files to understand a single feature end to end. Compare that with the pragmatic alternative, controller, service, repository in the same file, or at most three files co-located in features/payments/. The difference is an order of magnitude in context window consumption, and everything that enters the context displaces something that could be there: the real business rule, the relevant test, the bug's history.

Each extra file charges three tolls. (1) Tokens read: paid in latency and in cost. (2) Reasoning to reconcile the naming across layers, because PaymentDTO, PaymentEntity, PaymentModel and PaymentDomainObject typically represent the same thing with different names, and the agent burns cycles just to figure out that they are, indeed, the same entity. (3) The risk of the agent importing the wrong abstraction and producing code that compiles but violates the rule that lives in another layer. The three tolls are paid on every interaction. The multiplication accumulates until it becomes most of your company's monthly token bill.

Abstraction Bloat: why the LLM has a structural bias in favor of heavy architecture

Here is the point few people see. The LLM does not choose hexagonal architecture because it concluded it is the best. It chooses it because it was trained on a corpus where the overwhelming majority of the published technical material talks about Event Sourcing, CQRS, Saga, DDD and Hexagonal. No one publishes a Medium post called "How I have maintained a 3-file CRUD for 5 years serving 8 thousand requests per minute without pain". That post does not exist because it does not get engagement, but the corresponding system exists in production in thousands of companies that pay their bills every month.

The result is a statistical bias that becomes a normative bias: the model suggests the ceremonial structure because it is the structure visible in the corpus. Less experienced engineers confuse that suggestion with best practice and accept it without question. Experienced engineers hit the brakes, but the friction is constant, recurring, and it charges energy in every session. The cost of that bias is cumulative: every decision made in the shadow of the ceremonial default adds one more layer that will charge a toll forever.

The necessary reframe

Ceremony ≠ engineering. Engineering is solving the problem within the budget (of time, context, attention, money). Ceremony is executing the ritual regardless of the budget.
Architecture is local, not universal. Hexagonal can be right for the critical core of the product and absurd for the email notification service, and the two can coexist in the same repository without hypocrisy.
The agent is part of the team now. Architecture decisions have to consider the cognitive cost for humans and the context/accuracy cost for agents. Both pay.
Co-location defeats indexing. One directory per feature, with everything inside, beats any mental map of scattered layers. Deleting the feature should be deleting a directory, not a treasure hunt across five folders.

The inverted seniority test

For years the industry measured seniority by the ability to add abstraction. Whoever proposed Hexagonal was senior. Whoever proposed "service + db" was junior. The new test is the opposite. Senior is whoever can look at a low-risk feature that changes slowly and answer, without guilt, "three files solve it". Senior is whoever realizes that the layer they are about to add will charge a toll for ten years to solve a problem that may never happen. Senior is whoever accepts that the most mature system is typically the most boring.

The numbers from the slide, 76%, 58%, 7-13, are not saying that architecture is bad. They are saying that ritualistic architecture charges a price that can now be measured. Ignoring the price was already expensive. Continuing to ignore it, in a team where half of the interactions with the code go through an agent, is a conscious decision to pay a recurring tax to look senior instead of being senior.