Engineering Org Health Check
The Quarterly Framework
Set a fixed cadence. Quarterly works for most orgs because it's frequent enough to catch trends but not so frequent that data collection becomes a burden. Pick the same week each quarter. Assign someone to own the data gathering. Treat it like a recurring deliverable, not an afterthought.
Your health check should cover five dimensions: delivery performance, team stability, engineering quality, developer experience, and strategic alignment. Each dimension gets two to three metrics. That's 10-15 data points total. More than that and you'll spend all your time measuring instead of improving.
Signals That Matter
Delivery performance. Track cycle time (commit to production), deployment frequency, and planned-vs-delivered ratio per sprint. DORA metrics are the industry standard here. Elite teams deploy multiple times per day with less than one hour lead time. If your cycle time is measured in weeks, that's a structural problem.
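As a sketch of how these two numbers can be computed, assuming you can export (first-commit, deployed-to-production) timestamp pairs per release — the records below are hypothetical:

```python
from datetime import datetime
from statistics import median

# Hypothetical deploy records: (first_commit_time, deployed_time) pairs.
deploys = [
    (datetime(2024, 4, 1, 9, 0), datetime(2024, 4, 1, 15, 30)),
    (datetime(2024, 4, 2, 10, 0), datetime(2024, 4, 4, 11, 0)),
    (datetime(2024, 4, 3, 8, 0), datetime(2024, 4, 3, 9, 45)),
]

# Cycle time: commit to production, reported as a median in hours.
# Medians resist distortion by one slow release better than means.
cycle_hours = [(done - start).total_seconds() / 3600 for start, done in deploys]
median_cycle = median(cycle_hours)

# Deployment frequency: deploys per day over the observed window.
window_days = (max(d for _, d in deploys) - min(s for s, _ in deploys)).days or 1
deploys_per_day = len(deploys) / window_days
```

The same pattern extends to the planned-vs-delivered ratio if your tracker can export per-sprint counts.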
Team stability. Voluntary attrition rate by quarter, broken down by level and tenure. Healthy engineering orgs see 8-12% annual voluntary attrition. Above 15% signals a systemic problem. Below 5% might mean people are comfortable but not growing. Track time-to-hire for open roles. If your average is above 60 days, your pipeline or interview process needs work.
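A minimal sketch of the annualization and cohort breakdown, assuming departure records with level, tenure, and a voluntary flag (the field names and numbers are illustrative):

```python
# Hypothetical quarterly departure records.
departures = [
    {"level": "senior", "tenure_years": 1.5, "voluntary": True},
    {"level": "mid", "tenure_years": 0.7, "voluntary": True},
    {"level": "senior", "tenure_years": 4.0, "voluntary": False},
]
headcount = 80  # average headcount over the quarter

voluntary = [d for d in departures if d["voluntary"]]

# Annualize the quarterly rate (x4) and express as a percentage,
# so it compares directly against the 8-12% healthy band.
annualized_rate = len(voluntary) * 4 * 100 / headcount

# Break voluntary departures down by tenure band; the bands are an assumption.
def tenure_band(years):
    return "<1yr" if years < 1 else "1-3yr" if years < 3 else "3yr+"

by_band = {}
for d in voluntary:
    band = tenure_band(d["tenure_years"])
    by_band[band] = by_band.get(band, 0) + 1
```

A spike concentrated in one band (say, everyone leaving under a year in) points at a different problem than attrition spread evenly across tenures.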
Engineering quality. Incident count and severity trends. Change failure rate (percentage of deployments causing incidents). Mean time to recovery. These tell you whether your systems are getting more reliable or less.
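Change failure rate and MTTR fall out of two inputs: the quarter's deploy count and the incidents attributed to deploys. A sketch with hypothetical data:

```python
from datetime import datetime

deploy_count = 120  # total deploys this quarter

# Hypothetical deploy-caused incidents: (start, resolved) timestamps.
incidents = [
    (datetime(2024, 4, 5, 14, 0), datetime(2024, 4, 5, 14, 40)),
    (datetime(2024, 5, 12, 9, 0), datetime(2024, 5, 12, 11, 0)),
    (datetime(2024, 6, 1, 22, 0), datetime(2024, 6, 1, 22, 20)),
]

# Change failure rate: deploy-caused incidents / total deploys, as a percent.
change_failure_rate = len(incidents) * 100 / deploy_count

# Mean time to recovery, in minutes.
recovery_minutes = [(end - start).total_seconds() / 60 for start, end in incidents]
mttr = sum(recovery_minutes) / len(recovery_minutes)
```

Trend both quarter over quarter: the direction of travel matters more than any single value.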
Developer experience. Quarterly survey with NPS-style scoring. Key questions: "How productive do you feel day-to-day?" and "Would you recommend this team to a friend?" Track CI build times and environment setup time as objective measures.
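NPS-style scoring for the "recommend to a friend" question reduces to bucketing 0-10 responses; the responses below are made up:

```python
# Hypothetical 0-10 responses to "Would you recommend this team to a friend?"
scores = [9, 10, 8, 7, 6, 9, 10, 3, 8, 9]

promoters = sum(1 for s in scores if s >= 9)   # scores 9-10
detractors = sum(1 for s in scores if s <= 6)  # scores 0-6; 7-8 are passives
nps = (promoters - detractors) * 100 / len(scores)
```

The resulting score ranges from -100 to +100; like the other signals, its quarter-over-quarter movement matters more than the absolute number.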
Strategic alignment. What percentage of engineering effort went to strategic initiatives vs maintenance vs unplanned work? If more than 30% of capacity goes to unplanned work consistently, you have a quality or planning problem.
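The effort split can be sketched as a percentage breakdown with the 30% unplanned-work check from above; the categories and engineer-day totals here are assumptions:

```python
# Hypothetical effort totals (engineer-days) logged per work category.
effort = {"strategic": 220, "maintenance": 90, "unplanned": 140}

total = sum(effort.values())
shares = {k: round(v * 100 / total, 1) for k, v in effort.items()}

# Flag the quality-or-planning problem the threshold describes:
# more than 30% of capacity going to unplanned work.
unplanned_flag = shares["unplanned"] > 30
```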
Red Flags vs Normal Variance
A single quarter with elevated attrition after a reorg is normal. Two consecutive quarters of elevated attrition is a problem. Sprint velocity dropping during a major migration is expected. Velocity staying depressed six months after the migration completed is a red flag.
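The one-quarter-vs-two rule is easy to encode: count how many consecutive quarters, ending with the most recent, have breached a threshold. A sketch using the 15% attrition threshold from above (the series is hypothetical):

```python
# Hypothetical quarterly series of annualized voluntary attrition (%).
attrition_by_quarter = [10.0, 11.5, 17.0, 16.5]
THRESHOLD = 15.0  # "above 15% signals a systemic problem"

def consecutive_breaches(series, threshold):
    """Length of the run of quarters above the threshold, ending at the latest."""
    run = 0
    for value in reversed(series):
        if value > threshold:
            run += 1
        else:
            break
    return run

# One elevated quarter is normal variance; two or more is a red flag.
red_flag = consecutive_breaches(attrition_by_quarter, THRESHOLD) >= 2
```

The same check applies to any metric with a known healthy range, such as velocity staying depressed after a migration completes.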
Context matters enormously. Always pair quantitative data with qualitative signal from skip-level 1:1s, team retros, and manager observations.
Presenting to Executives
Executives don't want a dashboard walkthrough. They want answers to three questions: are we healthy, what's at risk, and what are you doing about it. Structure your readout as a narrative. Lead with the headline ("Engineering health is stable with one area of concern"). Show the trend data that supports it. Close with your action plan.
Use a simple red/yellow/green system for each dimension. Green means healthy. Yellow means monitoring. Red means active intervention needed. Executives can absorb a five-dimension status grid in ten seconds. They cannot absorb fifteen charts.
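The status grid can be generated mechanically once you fix thresholds per dimension. A sketch, assuming lower-is-better metrics and made-up threshold values:

```python
# Hypothetical thresholds per dimension: (green_max, yellow_max).
# Values above yellow_max are red. Lower is better for all three metrics.
thresholds = {
    "delivery":  (7.0, 14.0),   # median cycle time, days
    "stability": (12.0, 15.0),  # annualized voluntary attrition, %
    "quality":   (5.0, 15.0),   # change failure rate, %
}
current = {"delivery": 5.5, "stability": 13.2, "quality": 16.0}

def rag(value, green_max, yellow_max):
    """Map a metric value to red/yellow/green status."""
    if value <= green_max:
        return "green"
    return "yellow" if value <= yellow_max else "red"

grid = {dim: rag(current[dim], *thresholds[dim]) for dim in thresholds}
```

Pinning the thresholds down in advance also keeps the readout honest: a dimension can't quietly drift from yellow to green because the narrative needs it to.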
Action Planning
Every health check should produce zero to three action items. Not more. If everything is green, celebrate and move on. If something is yellow, define what would turn it green and assign an owner. If something is red, it goes to the top of your priority list for the next quarter.
Close the loop. Share the health check results with the full engineering org. Show what you found and what you're doing about it. Transparency builds trust and gives people confidence that leadership sees the same problems they do.
Key Points
- Run health checks quarterly with consistent metrics so you can spot trends, not just snapshots
- Attrition rate alone is misleading. Break it down by tenure band, level, and voluntary vs involuntary
- Developer satisfaction surveys only work if you act on the results and close the loop publicly
- Present health data to executives as narratives with context, not raw dashboards
- Normal variance exists. Don't overreact to a single bad quarter, but do investigate two in a row
Common Mistakes
- Tracking too many metrics and drowning in data instead of focusing on 5-7 signals that matter
- Using velocity as a performance metric rather than a planning tool
- Running surveys but never sharing results or taking visible action, which destroys future participation
- Comparing your metrics to industry benchmarks without accounting for company stage and context