Last week a backend engineer added a PR to my team’s review queue. They were shipping a feature into both our iOS and Android apps. They don’t work on mobile. They had Claude write both implementations, and Claude obliged with the worst possible version of each.

The wiki had integration guides for exactly this case. They didn’t live next to the code. No CLAUDE.md, no skill, nothing in the repo. Claude didn’t find them and the engineer didn’t think to look.

The framework already provided safe versions of what they were reinventing. Claude reinvented them anyway. Use the framework’s primitives and you get observability for free, alerting and monitoring as a side effect of writing the feature. They hand-rolled all of it instead. The bespoke layer means somebody now has to build and maintain the alerts and dashboards that the framework’s primitives would have parameterized for them. The unit tests reimplemented spies that the harness gives you automatically if you call the built-in helpers. They didn’t call the built-in helpers.

The whole thing took about five times as long as it should have. The diff was much bigger than it needed to be. None of it tripped a guardrail, because the code ran and the tests passed. The kind of cost that only shows up when someone who knows the domain reads the code, or when the feature ships and the missing alerts don’t fire.

The reflex is to say the engineer should have read more. Memorized the helpers, found the wiki, looked for the canonical pattern. That isn’t quite the lesson. The wiki, the framework, the mocking harness all assume the reader already has the rough shape of the system in their head. They tell you what to reach for. They don’t tell you there’s something to reach for, or what it would do if you did. What the engineer was missing wasn’t a specific helper. It was the cursory sense of the domain that would have made the gaps visible.

That cursory sense is also what lets you push back when AI output is plausible but wrong. The code Claude wrote ran. The tests passed. Nothing came back from the tools. The only check left was the one the engineer didn’t have.

#The other half

Domain owners have work to do too. Where the integration guide lives next to the code, in the repo, in a CLAUDE.md or AGENTS.md, in a skill, the agent finds it. Document the canonical pattern somewhere a tool can read, and the next person to touch the code lands closer to the right answer on the first try. The floor a new engineer arrives on is shaped by what the owners shipped alongside the code.

Not every stack gets the same investment. In my own org, some monorepos have rich scaffolding. Docs next to the code, skills, patterns documented in a way a tool can pick up. Others never invested in documentation, or worse, have outdated docs. The engineers who built the system are long gone. What people know is whether new code passes the mobile-architecture sniff test and whether the output metrics look OK. They’re looking at the system from outside. The shape of it isn’t available to them.

The contributor’s knowledge gap and the owner’s documentation gap meet somewhere in the middle, and AI sits on whatever floor is there. When the floor is high, an outsider with a model can land something reasonable. When the floor is low, the model just produces a confident wrong answer faster.

#The trade-off

Engineers make trade-offs constantly. Correctness for speed, depth for breadth, build for buy. Most of them are defensible. The one I won’t make is trading off knowledge for speed, on the assumption that the model has the knowledge for us.

The bar isn’t deep expertise. It’s cursory knowledge plus the willingness to keep building it. Knowing enough about the domain to recognize when the answer is shaped wrong. Knowing enough about the platform to ask whether you should be reinventing this in the first place. And the will to push back when the model is confidently wrong.

AI is real leverage. It buys back time. It opens up adjacent domains we used to skip entirely. None of that is in question. The point is that the leverage compounds with what you bring to it. With nothing under it, the leverage just makes the wrong thing arrive faster.

AI is not a tool that lets you turn your brain off. If anything it asks more of you, because verifying confident output in a domain you don’t know is harder than writing the code yourself would have been. The time it gives back is the time to keep learning. Expand the scopes you can verify in, not just the scopes you can ship into.