Rethinking Developer Productivity in the Age of AI Assistants
Lines of code per day was always a terrible productivity metric. With AI coding assistants, it’s become an absurd one. An engineer with Copilot or a similar tool can generate code at a pace that would have been unimaginable five years ago. If you’re measuring productivity by volume, every engineer just tripled their output overnight. If you’re measuring by value delivered, the picture is more complicated.
The more I’ve explored how AI assistants change engineering workflows, the more I’ve realised that the productivity conversation needs a fundamental reset. We’re not just doing the same work faster; we’re changing what the work is.
The Shift in Where Time Goes
Before AI assistants, a significant portion of an engineer’s day was spent on implementation, writing the code that turns a design into a working system. With AI assistants handling much of the boilerplate, scaffolding, and routine implementation, the time distribution shifts. Less time writing code. More time reviewing code. More time designing systems. More time testing. More time thinking about whether the thing being built is the right thing to build.
This is, in theory, a good shift. The highest-value engineering work has always been in the thinking, not the typing. But it creates a measurement problem: if your productivity metrics are based on implementation activity, they’ll show a team as “less productive” even as it delivers more value.
The leaders who focus on output volume (PRs merged, features shipped, velocity points) will see impressive numbers and miss the fact that the code being shipped is less well understood, less well tested, and carrying more hidden debt than before. The leaders who focus on outcome quality (user impact, system reliability, maintainability) will see a more honest picture.
The Review Burden
Here’s something that caught me off guard when I started paying attention: AI-generated code shifts the burden from creation to review. Code that’s generated in seconds still needs to be reviewed by a human who understands the context, the constraints, and the potential failure modes. And reviewing AI-generated code is harder than reviewing human-written code, because the patterns are often plausible but subtly wrong in ways that require careful attention to catch.
This means that the bottleneck in an AI-assisted workflow isn’t code generation; it’s code comprehension. The team’s productivity is limited not by how fast they can write code but by how fast they can understand, verify, and integrate code. That’s a fundamentally different constraint, and it requires different skills and different processes.
If your team is generating code faster than they can review it, you don’t have a productivity gain; you have a quality risk.
What to Measure Instead
The metrics that matter in an AI-augmented world are the ones that were always the right metrics; we just have even less excuse for not using them:
Cycle time from idea to validated outcome. Not just to deployment, but to confirmation that the deployed change achieved its intended effect.
Defect escape rate. How many issues make it to production? If AI is increasing code volume without a corresponding increase in review quality, this number will go up.
Time spent in review versus generation. If the ratio is shifting heavily toward generation, that’s a signal that review isn’t keeping pace.
Developer satisfaction and cognitive load. AI assistants can reduce tedium, but they can also increase cognitive load if engineers are spending more time verifying AI output than they would have spent writing the code themselves.
Rework rate. How often does AI-generated code need to be significantly revised after initial review? A high rework rate suggests the AI is creating work rather than saving it.
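The metrics above can be pulled together into a simple team-level summary. Here is a minimal sketch in Python, assuming per-change records with hypothetical fields (`cycle_days`, `escaped_defects`, `review_hours`, `generation_hours`, `reworked`) that your own tooling would have to supply; none of these names come from a real tracker’s schema.

```python
from dataclasses import dataclass

@dataclass
class ChangeRecord:
    cycle_days: float       # idea -> validated outcome, not just deployment
    escaped_defects: int    # issues that made it to production
    review_hours: float     # human time spent understanding and verifying
    generation_hours: float # time spent producing the code (human or AI)
    reworked: bool          # needed significant revision after initial review

def summarize(records: list[ChangeRecord]) -> dict[str, float]:
    """Compute the four measurable signals discussed above."""
    n = len(records)
    return {
        "avg_cycle_days": sum(r.cycle_days for r in records) / n,
        "defect_escape_rate": sum(r.escaped_defects for r in records) / n,
        # A ratio well below 1 suggests generation is outpacing review.
        "review_to_generation_ratio": (
            sum(r.review_hours for r in records)
            / sum(r.generation_hours for r in records)
        ),
        "rework_rate": sum(r.reworked for r in records) / n,
    }

# Illustrative data, not real measurements.
changes = [
    ChangeRecord(4.0, 0, 3.0, 1.0, False),
    ChangeRecord(9.0, 2, 1.0, 2.0, True),
]
print(summarize(changes))
```

The point isn’t the script; it’s that each of these signals is cheap to compute once you record the inputs, which is exactly the kind of instrumentation most teams skip.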
The Honest Conversation
The productivity gains from AI assistants are real, but they’re not as straightforward as the marketing suggests. The gains are largest for routine, well-understood tasks: boilerplate, standard patterns, documentation, test scaffolding. They’re smallest, and sometimes negative, for novel, complex, or context-dependent work where the AI doesn’t have enough information to generate correct code.
The honest conversation to have with your team and your stakeholders is: AI assistants make us faster at some things and create new risks in others. The net effect depends on how we adapt our processes, our review practices, and our measurement. Simply adding AI tools without changing anything else is not a productivity strategy; it’s a hope.
The teams that will genuinely benefit are the ones that rethink their workflows around the new capabilities, invest in review skills, and measure what matters rather than what’s easy to count.