Engineering Productivity Measurement
The Measurement Paradox
Engineering productivity is something every VP of Engineering wants to measure, and every attempt to boil it down to a single number falls flat. The reason is pretty fundamental: software engineering is a creative, collaborative activity. The most valuable work (thinking through edge cases, designing clean abstractions, mentoring teammates) is essentially invisible to any automated measurement system.
That doesn't mean measurement is hopeless. It just means you need to be deliberate about what you measure, why you're measuring it, and what you do with the data.
What McKinsey Got Wrong
In 2023, McKinsey published a framework for measuring developer productivity that recommended tracking "contribution analysis" and inner/outer loop metrics. The engineering community's reaction was fast and overwhelmingly negative. The framework leaned too heavily on activity metrics (lines of code equivalent, number of contributions) that are easy to game and don't connect to business outcomes. Kent Beck, Gergely Orosz, and other well-known voices pointed out that optimizing for measurable activity tends to come at the expense of the deep work that actually moves the needle.
The takeaway: any productivity metric that engineers can easily game will get gamed, especially when it's tied to compensation or performance reviews. Goodhart's Law is the iron rule of engineering metrics: "When a measure becomes a target, it ceases to be a good measure."
A Practical Measurement Approach
Rather than chasing a single productivity score, build a measurement practice around three categories:
System Metrics come straight from your toolchain and don't require any human effort to collect. PR cycle time (opened to merged), build duration and reliability, deployment frequency, code review turnaround, and time-to-onboard for new contributors. These metrics tell you about your engineering system's efficiency, not about individual programmer ability.
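A metric like PR cycle time can be computed directly from your version-control host's data. A minimal sketch, assuming PR records have already been fetched into dicts with `opened_at`/`merged_at` timestamps (the field names here are illustrative, not any particular API's):

```python
from datetime import datetime
from statistics import median

def pr_cycle_time_hours(prs):
    """Median hours from PR opened to merged; unmerged PRs are skipped."""
    hours = [
        (pr["merged_at"] - pr["opened_at"]).total_seconds() / 3600
        for pr in prs
        if pr.get("merged_at") is not None
    ]
    return median(hours) if hours else None

prs = [
    {"opened_at": datetime(2024, 5, 1, 9),  "merged_at": datetime(2024, 5, 1, 15)},
    {"opened_at": datetime(2024, 5, 2, 10), "merged_at": datetime(2024, 5, 3, 10)},
    {"opened_at": datetime(2024, 5, 3, 8),  "merged_at": None},  # still open
]
print(pr_cycle_time_hours(prs))  # median of 6h and 24h -> 15.0
```

Using the median rather than the mean keeps one long-lived PR from swamping the number, which matters when you are watching the trend quarter over quarter.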
Developer Surveys capture what automated metrics can't. Run quarterly surveys asking things like: "How easy is it to get your changes into production?", "What's your biggest daily friction point?", "Do you have the tools you need?", "How often do you get uninterrupted focus time?" Look at results by team, not by individual. Watch the trends across quarters. A 10% satisfaction drop is an early warning that something needs attention.
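The "10% drop" early-warning check above is easy to automate once survey averages are tabulated per team. A sketch, assuming scores on a 1–5 scale and one averaged entry per quarter (the data shape and threshold are assumptions, not a prescribed format):

```python
def flag_satisfaction_drops(history, threshold=0.10):
    """history: {team: [q1_avg, q2_avg, ...]} on a 1-5 scale.
    Returns teams whose latest quarter fell more than `threshold`
    (as a fraction) relative to the previous quarter."""
    flagged = {}
    for team, scores in history.items():
        if len(scores) < 2 or scores[-2] == 0:
            continue  # need two quarters to compute a trend
        drop = (scores[-2] - scores[-1]) / scores[-2]
        if drop > threshold:
            flagged[team] = round(drop, 2)
    return flagged

history = {
    "platform": [4.2, 4.1, 3.6],  # ~12% drop: worth a conversation
    "payments": [3.9, 4.0, 4.1],  # trending up
}
print(flag_satisfaction_drops(history))  # {'platform': 0.12}
```

The output is a conversation starter, not a scorecard: a flagged team gets a follow-up discussion about friction, not a mark against it.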
Outcome Metrics tie engineering work back to business impact. Features shipped that moved product metrics. Incidents prevented by proactive reliability work. Time saved by internal tooling investments. Platform adoption by downstream teams. These are harder to attribute cleanly, but they're far more meaningful than counting activities.
Removing Friction Is the Best Investment
Research keeps showing the same thing: the highest-leverage productivity investment isn't better measurement. It's removing friction. Google's internal research found that slow builds were developers' number one complaint. Shopify's developer experience team found that improving CI reliability from 85% to 98% was equivalent to adding engineers in terms of effective output.
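Why an 85%-to-98% reliability jump is so valuable becomes clearer with a little arithmetic. Under the simplifying assumption that red builds on a correct change are independent flakes, the expected number of CI runs to get a green build follows a geometric distribution:

```python
def expected_runs(pass_rate):
    """Expected CI attempts until a correct change goes green,
    assuming failures are independent flakes (geometric distribution)."""
    return 1 / pass_rate

def wasted_fraction(pass_rate):
    """Fraction of CI runs that are wasted retries."""
    return 1 - pass_rate / 1  # equivalently: 1 - 1/expected_runs

for p in (0.85, 0.98):
    print(f"{p:.0%} reliability: {expected_runs(p):.2f} runs per merge, "
          f"{wasted_fraction(p):.0%} of runs wasted")
```

At 85% reliability roughly one run in six is a wasted retry, plus the context-switching cost every rerun imposes on the author; the real-world assumption of independent flakes is rough, but the direction of the effect holds.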
Start with a developer experience audit. Time how long it takes to clone the repo and run tests locally, open a PR and get it reviewed, deploy to staging, and deploy to production. These end-to-end cycle times reveal bottlenecks that no amount of individual productivity measurement would ever surface. Fix the bottlenecks and productivity follows, without the toxicity of individual scorecards.
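The audit itself can be a trivial script: run each end-to-end step, record wall-clock time, and look at where the minutes go. A minimal sketch; the step commands below are placeholders to be replaced with your repo's real clone/test/deploy commands:

```python
import subprocess
import time

def time_step(name, command):
    """Run one audit step via the shell and report wall-clock duration."""
    start = time.monotonic()
    result = subprocess.run(command, shell=True, capture_output=True)
    elapsed = time.monotonic() - start
    status = "ok" if result.returncode == 0 else "FAILED"
    print(f"{name:<30} {elapsed:8.1f}s  {status}")
    return elapsed

# Placeholder commands -- substitute your actual workflow steps.
steps = [
    ("clone repo + run tests",  "echo simulate: git clone ... && make test"),
    ("open PR -> first review", "echo simulate: query review latency"),
    ("deploy to staging",       "echo simulate: ./deploy staging"),
]
total = sum(time_step(name, cmd) for name, cmd in steps)
print(f"{'end-to-end':<30} {total:8.1f}s")
```

Rerun the same script after each friction fix; the before/after deltas on these system-level timings are the honest version of a productivity report.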
Key Points
- Developer productivity is multidimensional. No single metric captures it, and trying to creates perverse incentives
- Combine system metrics (CI/CD data, code review stats) with developer surveys (satisfaction, friction points) for a complete picture
- Proxy measures like PR cycle time and build reliability correlate with productivity but do not define it
- McKinsey's 2023 developer productivity framework was widely criticized for over-indexing on activity metrics
- The best productivity investment is usually removing friction (faster builds, fewer meetings, better tooling) rather than measuring output
Common Mistakes
- Measuring lines of code, commit counts, or story points as productivity indicators. All are trivially gameable
- Building elaborate dashboards before understanding what questions you are trying to answer
- Comparing productivity across teams without accounting for codebase age, technical debt, and domain complexity
- Treating developer experience improvements as overhead rather than productivity multipliers