Trade-off Analysis Framework
Why Structured Trade-off Analysis Matters
Senior engineers are not valued for knowing the "best" technology. They are valued for understanding which trade-offs are acceptable given the current constraints. A structured framework helps remove cognitive bias, creates alignment across teams, and produces documentation that stays useful for years.
The Weighted Scoring Matrix
For significant decisions (database selection, service decomposition, build vs. buy), a weighted scoring matrix keeps the comparison grounded in explicit criteria and evidence:
- List criteria. Performance, operational complexity, team familiarity, cost, vendor lock-in, security posture, scalability ceiling, time to implement.
- Assign weights. Distribute 100 points across criteria based on business priorities. If reliability is critical, weight operational complexity and scalability higher. If time-to-market matters most, weight team familiarity and time to implement higher.
- Score each option. Rate each option 1-5 on each criterion, where 5 is always the most favorable result (for example, the lowest operational complexity). Use evidence, not opinions: run benchmarks for performance, check job market data for hiring, calculate total cost of ownership (TCO) for cost.
- Calculate weighted scores. Multiply score by weight for each criterion, then sum for each option. With 100 points of weight and a 1-5 scale, totals range from 100 to 500. The option with the highest total is the recommended one.
- Sanity check. If the result surprises you, look at whether your weights actually reflect business priorities.
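The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function names (`weighted_totals`, `recommend`) and the example criteria, weights, and scores are all hypothetical.

```python
def weighted_totals(weights, scores):
    """Return {option: total}, where total = sum(weight * score) over all criteria."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 points")
    totals = {}
    for option, by_criterion in scores.items():
        if set(by_criterion) != set(weights):
            raise ValueError(f"{option}: scores must cover exactly the weighted criteria")
        if any(not 1 <= s <= 5 for s in by_criterion.values()):
            raise ValueError(f"{option}: scores must be between 1 and 5")
        totals[option] = sum(weights[c] * by_criterion[c] for c in weights)
    return totals


def recommend(totals):
    """Return the option with the highest weighted total (ties go to the first listed)."""
    return max(totals, key=totals.get)


# Hypothetical example data, for illustration only.
example_weights = {"performance": 50, "team_familiarity": 30, "cost": 20}
example_scores = {
    "option_a": {"performance": 4, "team_familiarity": 2, "cost": 3},
    "option_b": {"performance": 3, "team_familiarity": 5, "cost": 4},
}
example_totals = weighted_totals(example_weights, example_scores)
print(example_totals, recommend(example_totals))
```

Validating that weights sum to 100 and that every option is scored on every criterion catches the most common spreadsheet errors before they distort the ranking.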
The Reversibility Matrix
Jeff Bezos popularized the Type 1 / Type 2 decision framework. Here is how it applies to architecture:
| Decision Type | Examples | Analysis Depth | Timeline |
|---|---|---|---|
| Type 1 (one-way door) | Primary database, API contract, data model, programming language | Full weighted matrix, RFC, ADR | 1-3 weeks |
| Type 2 (two-way door) | Library choice, internal tool selection, feature flag framework | Lightweight comparison, ADR only | 1-3 days |
The key insight: most decisions are Type 2, but teams often treat them as Type 1, which creates unnecessary bottlenecks.
Blast Radius Assessment
Before committing to a decision, map the blast radius:
- Scope. How many services, teams, or customers are affected?
- Severity. If this goes wrong, is it a minor inconvenience or a P0 incident?
- Recovery time. How long would it take to roll back or migrate away?
- Data impact. Does this decision affect data at rest? Data migrations are the hardest to reverse.
High blast radius decisions demand more rigorous analysis, broader stakeholder input, and explicit rollback plans.
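One way to make this checklist repeatable is to record the four answers in a small structure and derive the required rigor from them. The sketch below is illustrative only: the `BlastRadius` fields, the `is_high` function, and especially the thresholds (more than one team, more than one day to recover) are assumptions to be tuned per organization, not values from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastRadius:
    services_affected: int      # Scope
    teams_affected: int         # Scope
    worst_case_is_p0: bool      # Severity
    recovery_days: float        # Recovery time (rollback or migration)
    touches_data_at_rest: bool  # Data impact


def is_high(br, max_teams=1, max_recovery_days=1.0):
    """Assumed rule of thumb: any single red flag makes the blast radius high."""
    return (
        br.worst_case_is_p0
        or br.touches_data_at_rest
        or br.teams_affected > max_teams
        or br.recovery_days > max_recovery_days
    )


def required_steps(br):
    """Map the assessment to the level of rigor described in the text."""
    if is_high(br):
        return ["rigorous analysis", "broader stakeholder input", "explicit rollback plan"]
    return ["lightweight comparison"]


# Hypothetical assessments, for illustration only.
library_swap = BlastRadius(1, 1, False, 0.5, False)
database_migration = BlastRadius(12, 4, True, 30, True)
print(required_steps(library_swap))
print(required_steps(database_migration))
```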
Practical Example: Choosing a Message Broker
| Criterion (weight) | Kafka: score (weighted) | RabbitMQ: score (weighted) | SQS: score (weighted) |
|---|---|---|---|
| Throughput (25) | 5 (125) | 3 (75) | 4 (100) |
| Ops complexity (20) | 2 (40) | 3 (60) | 5 (100) |
| Team familiarity (20) | 4 (80) | 4 (80) | 3 (60) |
| Ordering guarantees (15) | 5 (75) | 3 (45) | 3 (45) |
| Cost at scale (10) | 4 (40) | 3 (30) | 2 (20) |
| Vendor lock-in (10) | 5 (50) | 5 (50) | 1 (10) |
| Total | 410 | 340 | 335 |
This does not mean Kafka is always the right pick. It means that given these specific weights and scores (reflecting a throughput-sensitive, ops-capable team that already knows Kafka), Kafka scores highest. A startup with no Kafka expertise would score Kafka lower on "Team familiarity" and weight "Ops complexity" higher, and that would likely flip the result toward SQS.
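The table can be reproduced, and its sensitivity explored, with a short script. The weights and scores in `WEIGHTS` and `SCORES` are taken from the table above. The startup profile is a hypothetical assumption made up for illustration: it shifts weight toward ops complexity and team familiarity and lowers Kafka's familiarity score to 1.

```python
def totals(weights, scores):
    """Sum weight * score per option."""
    return {opt: sum(weights[c] * s[c] for c in weights) for opt, s in scores.items()}


# Weights and scores from the table.
WEIGHTS = {
    "Throughput": 25, "Ops complexity": 20, "Team familiarity": 20,
    "Ordering guarantees": 15, "Cost at scale": 10, "Vendor lock-in": 10,
}
SCORES = {
    "Kafka": {"Throughput": 5, "Ops complexity": 2, "Team familiarity": 4,
              "Ordering guarantees": 5, "Cost at scale": 4, "Vendor lock-in": 5},
    "RabbitMQ": {"Throughput": 3, "Ops complexity": 3, "Team familiarity": 4,
                 "Ordering guarantees": 3, "Cost at scale": 3, "Vendor lock-in": 5},
    "SQS": {"Throughput": 4, "Ops complexity": 5, "Team familiarity": 3,
            "Ordering guarantees": 3, "Cost at scale": 2, "Vendor lock-in": 1},
}
table_totals = totals(WEIGHTS, SCORES)

# Hypothetical startup profile (assumed values, not from the table).
STARTUP_WEIGHTS = {
    "Throughput": 10, "Ops complexity": 35, "Team familiarity": 25,
    "Ordering guarantees": 10, "Cost at scale": 10, "Vendor lock-in": 10,
}
startup_scores = {opt: dict(s) for opt, s in SCORES.items()}
startup_scores["Kafka"]["Team familiarity"] = 1  # assumed: no Kafka expertise
startup_totals = totals(STARTUP_WEIGHTS, startup_scores)

print(table_totals)    # {'Kafka': 410, 'RabbitMQ': 340, 'SQS': 335}
print(startup_totals)  # {'Kafka': 285, 'RabbitMQ': 345, 'SQS': 350}
```

Under these assumed startup inputs SQS edges out RabbitMQ by only 5 points. A margin that narrow suggests the ranking is sensitive to small changes in weights, which is a cue to gather more evidence before treating the top score as decisive.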
When to Revisit a Decision
Schedule a decision review when the team doubles in size, traffic grows 10x, the original author leaves, a major incident traces back to the decision, or 18 months have passed. Write a new ADR documenting whether the original trade-offs still hold.
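These triggers are simple enough to check mechanically, for example in a periodic script over stored ADR metadata. The sketch below assumes a hypothetical `DecisionContext` record; the field names are illustrative, and the thresholds mirror the triggers listed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionContext:
    decided_on: date
    team_size_then: int
    team_size_now: int
    traffic_then: float
    traffic_now: float
    author_still_on_team: bool
    incident_traced_to_decision: bool


def review_triggers(ctx, today):
    """Return the list of review triggers that currently apply."""
    triggers = []
    if ctx.team_size_now >= 2 * ctx.team_size_then:
        triggers.append("team doubled in size")
    if ctx.traffic_now >= 10 * ctx.traffic_then:
        triggers.append("traffic grew 10x")
    if not ctx.author_still_on_team:
        triggers.append("original author left")
    if ctx.incident_traced_to_decision:
        triggers.append("major incident traced to the decision")
    # Whole calendar months elapsed since the decision.
    months = (today.year - ctx.decided_on.year) * 12 + (today.month - ctx.decided_on.month)
    if today.day < ctx.decided_on.day:
        months -= 1
    if months >= 18:
        triggers.append("18 months have passed")
    return triggers


# Hypothetical decision record, for illustration only.
ctx = DecisionContext(
    decided_on=date(2023, 1, 15),
    team_size_then=6, team_size_now=8,
    traffic_then=100.0, traffic_now=1500.0,
    author_still_on_team=True,
    incident_traced_to_decision=False,
)
print(review_triggers(ctx, today=date(2024, 7, 15)))
```

A non-empty result is a prompt to schedule the review and write the follow-up ADR, not a verdict that the original decision was wrong.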
Key Points
- Every architectural decision involves trade-offs. The goal is not to eliminate them but to make them explicit and intentional.
- Use weighted scoring matrices to compare options objectively. Assign weights based on business priorities, not personal preferences.
- Classify decisions by reversibility: Type 1 (irreversible, high cost) decisions deserve deep analysis, while Type 2 (reversible) decisions should be made quickly.
- Blast radius assessment determines how many teams, services, or users are affected if the decision turns out to be wrong.
- Document the trade-offs you accepted, not just the option you chose. That context is invaluable when you revisit decisions later.
Common Mistakes
- Analysis paralysis, where you spend 3 weeks analyzing a Type 2 decision that could be reversed in a day.
- Optimizing for a single dimension (usually performance) while ignoring operational complexity, team expertise, and hiring implications.
- Using gut feeling for Type 1 decisions. Irreversible choices deserve structured analysis even when you have strong intuition.
- Not revisiting decisions when the context changes. A trade-off that made sense 18 months ago may no longer hold.