
Professional Development – 2026 – Week 11

Software Engineering

The Truth About Developer Productivity in the AI Age

  • Activity is not connected to system-level outcomes.
  • Goodhart’s law — when a measure becomes a target, it ceases to be a good measure
  • Activity metric — measures behavior, easy to measure, low value
  • Output metric — measures deliverables, somewhat easy to measure, some value
  • Outcome metric — measures system changes, hard to measure, high value
  • Activity and output metrics aren’t tied to value stream outcomes, offer limited data, and are easily gamed.
  • In the 2010s, Nicole Forsgren’s State of DevOps Report marked a shift: the focus moved to deployment frequency, deployment lead time, rework rate, change failure rate, and time to restore service. That research became DORA, and its findings were published in Accelerate.
  • GenAI comes with privacy risks, security vulnerabilities, environmental costs, and market concentration in a handful of tech companies. It is transforming the globe and has accelerated development, and that acceleration is pushing companies back toward activity and output metrics.
  • DX’s AI-assisted engineering: Q4 impact report…
    • Good: Has a large sample, includes different companies, explains methodologies
    • Issues: Unclear whether the data is randomly sampled, self-selected, stratified, or weight-adjusted, so we can’t tell whether it’s representative of the industry.
    • Tool interaction… does not equate to strategic integration of AI.
    • Time saved per week with AI tools… Activity metric, and self-reported (relies on memory, counterfactual reasoning, and subjective attribution). Easily gamed.
    • % code AI-authored… Activity metric, hard to measure accurately; the definition is subjective, and the figure is self-reported.
    • PR throughput… Output metric. The report claims a correlation between frequency of AI usage and PR throughput, but never says whether the correlation is statistically significant. PR throughput is also not a measure of delivery or productivity: a team can merge 100 PRs a month yet deploy once because of manual testing, while a trunk-based team can merge 0 PRs yet deploy many times a day. On its own, this metric tells you nothing about deployments, service reliability, or business results.
  • Now what?
    • Recognize that AI-driven hyper-measurement of productivity is Taylorism all over again.
    • The team is the unit of delivery, not an individual.
    • Do your own critical thinking.
    • Tool-accelerated activity != real business value
    • Read Accelerate and Modern Software Engineering and implement them in your org.
  • Deployment / service reliability / technical quality numbers improving? Good technology outcomes!
  • Time to value / total cost of ownership across all teams improving? Good!
  • Business outcomes that matter most to your org improving? Good!
  • Tech capabilities are leading indicators of success. Look for more TDD, more loosely coupled architectures, and more stable CI. Track AI activity metrics alongside these, then see whether AI usage correlates with capability improvement.
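The deployment-centric measures above (deployment frequency, lead time, change failure rate) are straightforward to compute from a deploy log. A minimal sketch in Python — the log format and all numbers here are invented for illustration, not taken from any report:

```python
from datetime import datetime

# Hypothetical deploy log over one week: (commit_time, deploy_time, deploy_failed)
deploys = [
    (datetime(2026, 3, 2, 9),  datetime(2026, 3, 2, 14), False),
    (datetime(2026, 3, 3, 10), datetime(2026, 3, 4, 11), True),
    (datetime(2026, 3, 5, 8),  datetime(2026, 3, 5, 9),  False),
    (datetime(2026, 3, 6, 13), datetime(2026, 3, 6, 16), False),
]

window_days = 7

# Deployment frequency: deploys per day over the window
deploy_frequency = len(deploys) / window_days

# Deployment lead time: hours from commit to production, averaged
lead_times_h = [(d - c).total_seconds() / 3600 for c, d, _ in deploys]
avg_lead_time_h = sum(lead_times_h) / len(lead_times_h)

# Change failure rate: share of deploys that failed in production
change_failure_rate = sum(failed for _, _, failed in deploys) / len(deploys)

print(deploy_frequency, avg_lead_time_h, change_failure_rate)
```

Note these are system-level outcome measures: none of them reference an individual’s activity, which is exactly why they resist gaming better than PR counts.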
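The last bullet suggests checking whether AI usage actually correlates with capability improvement, and the PR-throughput critique notes the report never establishes statistical significance. A minimal pure-Python sketch of that check on invented per-engineer weekly numbers: Pearson’s r plus its t-statistic, where |t| above roughly 2 suggests significance near the 0.05 level for samples of this size.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_stat(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), referred to Student's t with n-2 dof."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Invented data: weekly AI-tool usage vs. PRs merged for eight engineers
ai_usage = [3, 5, 8, 12, 15, 20, 22, 30]
prs_merged = [4, 4, 6, 5, 7, 6, 8, 7]

r = pearson_r(ai_usage, prs_merged)
t = t_stat(r, len(ai_usage))
```

Even a significant correlation here only relates an activity metric to an output metric; per the argument above, you would still have to check it against outcome measures like deployment frequency and service reliability before claiming business value.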
