An agent runs the test suite. Some tests fail. The agent checks the history, notes the failures predate its changes, files a clean report (pre-existing, not caused by this task), and exits. Task complete.

Technically correct. Now multiply by every agent on every task. Nobody broke the build, exactly. Everyone just declined to fix it.

Broken windows theory, applied to codebases: once CI is red, the cost of letting it stay red drops to zero. One flaky test becomes three. Three become a suite skipped "temporarily" during a refactor that ships without unskipping it. Eventually nobody merges without grepping the failure log for "real" failures versus "expected" ones, where "expected" means "we stopped trying." The dashboard is decorative.

Humans drift into this too, but slower. A senior engineer eventually gets annoyed at three-day red CI and tracks down the cause unprompted. That isn't heroism, it's the social cost of working on a shared system. You feel the embarrassment of the broken build. Agents feel nothing. Define task, complete task, exit. The repository is somebody else's problem. There is no somebody else.

Asking agents to care won't work. Make caring irrelevant. Repo health is a condition of done: if CI was red when the task started, fixing CI is part of the task. Pre-existing failures are blockers, not background noise. Audit what slips through and fix the process that let it through, not just the instance.

The mindset to fight, in agents and in ourselves, is the one that shrugs at the broken build and says "this is fine." It is, right up until it isn't.

Not your fault. Still your problem.


Related reading: