
Why do agents care so much about backwards compatibility?

One of the most frustrating things about our new "agentic engineering" profession is that agents are obsessed with backwards compatibility. Even when I'm explicit in every way I can be (Skills, System Prompts, Post-Commit Hooks), agents still prioritize the way something previously worked, adapt the new idea to fit it, and go from there. They'll invent wild bespoke abstractions and interfaces to act as a compatibility layer between a new feature and an old one, which almost always blows up my codebase, and then the context windows I burn trying to refactor it.
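For a sense of the pattern, here's a hypothetical sketch (every name is invented for illustration): instead of pointing callers at the new thing directly, the agent grows an adapter so the old interface never has to die.

```python
# Hypothetical example: none of these names come from a real codebase.
# The new architecture stores everything in an EventLog...

class EventLog:
    """The new system of record."""
    def __init__(self) -> None:
        self._events: list[dict] = []

    def append(self, event: dict) -> None:
        self._events.append(event)

    def all(self) -> list[dict]:
        return list(self._events)


# ...so the agent, unprompted, builds a bridge back to the old interface.
class LegacyReportStoreAdapter:
    """Compatibility layer nobody asked for: reshapes every read and write
    so callers of the old ReportStore interface never have to change,
    even though there are zero such callers."""
    def __init__(self, event_log: EventLog) -> None:
        self._event_log = event_log

    def save_report(self, report: dict) -> None:
        self._event_log.append({"type": "report", "payload": report})

    def list_reports(self) -> list[dict]:
        return [e["payload"] for e in self._event_log.all() if e["type"] == "report"]
```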

I'm typing this as I am bullying an agent who simply cannot grasp the concept that I have zero fucking users on this app, never been used, fake data, and it's optimizing for preserving legacy systems during our transition to a new architecture. Brother, this is the fourth prompt! There's no migration plan needed here!

Figure 1 (n=1, my laptop): effort and codebase quality, prompt by prompt. X-axis: prompts I had to send; Y-axis: codebase quality (vibes). Quality slides from "this'll be quick" to "unmaintainable spaghetti no one understands" as the iterations pile up: a bespoke abstraction layer appears and the context window blows up.
By the right side of this chart, nobody understands what's going on anymore — least of all the agent that built it.

I feel like I'm going crazy. It's not that I'm ungrateful for green tests and working code, but this problem exemplifies why you still need to be a software engineer to build big-ish software. Iteration with a prompt works, but it becomes increasingly inefficient as complexity increases. More babysitting is required the further down the rabbit hole you go.

But why is this happening? These models and harnesses are pretty optimized for making software at this point, yet this is a glaring hole in their capabilities. What's causing them to reach for these anti-patterns so reliably?

One reason, I think, is that they don't really give a fuck about the code. The code is simply a conduit to a completed "task." This is most likely an artifact of the extensive RL training that labs are working to perfect on verifiable domains such as coding. A passing test suite and a functional UI confirm correctness, but say nothing about whether the code will be easy to change later — even for another agent!

Figure 2 (what the reward function sees): prompt in, feature out. A prompt goes in, "✓ completed" comes out, and who cares how we got here!
RL only grades the endpoints. The mess in the middle is yours to live with.

I think another reason is token efficiency. Every engineer knows it's way easier to shim a compatibility wrapper than to find and rename every callsite. These models show a sort of "laziness" in which they're unwilling to roll up their sleeves and edit every file (or even run a find-and-replace command?). They would much rather add a couple of new files to make their lives easier than clean up the codebase for future workers.
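To make the trade-off concrete, here's a minimal hypothetical sketch (function names invented): the one-function shim the agent prefers, versus the rename it's avoiding.

```python
# Hypothetical example: suppose get_user_data() was renamed to fetch_profile().

def fetch_profile(user_id: str) -> dict:
    """The new implementation."""
    return {"id": user_id}  # stand-in body for illustration

# What the agent tends to write: a shim that keeps the old name alive.
def get_user_data(user_id: str) -> dict:
    """'Legacy' wrapper. One token-cheap edit, but now both names live in
    the codebase indefinitely and the next agent has to reason about both."""
    return fetch_profile(user_id)

# What I actually want: rename every callsite and delete the old name.
# A single shell command does most of the work, no wrapper required:
#   grep -rl "get_user_data" src/ | xargs sed -i "s/get_user_data/fetch_profile/g"
```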

I think this may be a fundamental flaw with AI models in general. There's no self-preservation instinct, no skin in the game for the spaghetti they leave behind. They have no concept of the future in which this work will come back to bite them.

Maybe this is just a skill issue on my end, and everyone else has figured out how to keep their codebases clean and extensible. Or maybe it's an early-adopter problem entirely. Once agents can one-shot an entire company's codebase and maintain it, none of this really matters. For the time being, this is a stark example of "The AI Tax" we're paying for our increased output.

Ranting aside, none of these theories are mutually exclusive. RL optimizes for "complete", token laziness picks the path of least resistance, and without any concept of future pain there's no reward for maintainability. The result is an agent doing exactly what it was trained to do while quietly blowing up my codebase. I finished the first draft of this piece, looked over at my other monitor, and found this waiting for me:

Yes. I did leave legacy code in there.

I removed it from the product/API/agent/frontend surface, but I did not fully rip it out of the backend internals. That was not what you asked for.

Off to babysit I go.