Software Engineering
The Truth About Developer Productivity in the AI Age
- Activity is not connected to system-level outcomes.
- Goodhart’s law — when a measure becomes a target, it ceases to be a good measure
| Metric type | Measures | Ease of measurement | Value |
| --- | --- | --- | --- |
| Activity | Behavior | Easy | Low |
| Output | Deliverables | Somewhat easy | Some |
| Outcome | System changes | Hard | High |
- Activity and output metrics aren’t tied to value stream outcomes, offer limited data, and are easily gamed.
- In the 2010s, Nicole Forsgren's State of DevOps research marked a shift toward outcome metrics: deployment frequency, deployment lead time, rework rate, change failure rate, and time to restore service. This research grew into DORA (DevOps Research and Assessment), and its findings were published in Accelerate.
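The DORA-style metrics above can be computed straight from deployment records. A minimal sketch with invented data (the record shape and every number are assumptions for illustration, not from any real system; in practice this comes from your CI/CD tooling):

```python
from datetime import datetime

# Hypothetical deployment records: (commit_time, deploy_time, failed, restore_minutes).
# All values are invented for illustration.
deploys = [
    (datetime(2024, 1, 1, 9),  datetime(2024, 1, 1, 11), False, 0),
    (datetime(2024, 1, 2, 10), datetime(2024, 1, 2, 15), True, 45),
    (datetime(2024, 1, 3, 8),  datetime(2024, 1, 3, 9),  False, 0),
    (datetime(2024, 1, 5, 14), datetime(2024, 1, 5, 16), False, 0),
]
days_observed = 5

# Deployment frequency: deploys per day over the observation window.
deployment_frequency = len(deploys) / days_observed

# Lead time: hours from commit to deploy, averaged.
lead_times_h = [(d - c).total_seconds() / 3600 for c, d, _, _ in deploys]
mean_lead_time_h = sum(lead_times_h) / len(lead_times_h)

# Change failure rate: fraction of deploys that caused a failure.
change_failure_rate = sum(f for _, _, f, _ in deploys) / len(deploys)

# Time to restore: mean minutes to recover from failed deploys.
restores = [m for _, _, f, m in deploys if f]
mean_time_to_restore = sum(restores) / len(restores) if restores else 0.0

print(deployment_frequency, mean_lead_time_h, change_failure_rate, mean_time_to_restore)
```

Note these are outcome-oriented system measures: none of them count individual keystrokes, commits, or tool sessions.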
- GenAI comes with privacy risks, security vulnerabilities, environmental costs, and market concentration among a handful of tech companies. It is transforming the industry and has accelerated development, and in response companies are reverting to activity and output metrics.
- DX’s AI-assisted engineering: Q4 impact report…
- Good: Has a large sample, includes different companies, explains methodologies
- Issues: Unclear whether the data is randomly sampled, self-selected, stratified, or weighted, so we don’t know whether it is representative of the industry.
- Tool interaction… Does not equate to strategic integration of AI.
- Time saved per week with AI tools… Activity metric, and self-reported (relies on memory, uses counterfactual reasoning, and has subjective attribution). Easily gamed.
- % code AI-authored… Activity metric; hard to measure accurately, subjectively defined, and self-reported.
- PR throughput… Output metric. The report claims a correlation between frequency of AI usage and PR throughput, but never says whether the correlation is statistically significant. PR throughput is not a measure of delivery or productivity: a team can merge 100 PRs a month yet deploy once because of manual testing, while a trunk-based team can merge 0 PRs and deploy many times a day. This one metric proves nothing about deployments, service reliability, or business results.
- Now what?
- Recognize that AI-driven hyper-measurement of productivity is Taylorism all over again.
- The team is the unit of delivery, not an individual.
- Do your own critical thinking.
- Tool-accelerated activity != real business value
- Read Accelerate and Modern Software Engineering and implement them in your org.
- Deployment / service reliability / technical quality numbers improving? Good technology outcomes!
- Time to value / total cost of ownership across all teams improving? Good!
- Business outcomes that matter most to your org improving? Good!
- Tech capabilities are leading indicators of success. Look for more TDD, more loosely coupled architectures, and more stable CI. Add AI activity metrics and then see if there’s a correlation between AI usage and capability improvement.