Contraction Hierarchies
Architecture
Why Shortest Path Is Hard at Scale
Routing engines all hit the same wall. A road network is a graph: intersections are nodes, road segments are edges, and edge weights are travel times. The global road network has roughly 500 million nodes and 1.2 billion edges. Finding the shortest path between two arbitrary points requires an algorithm that does not explore every node.
Dijkstra's algorithm explores nodes in expanding concentric circles from the source. For a 50 km city route, it visits 50,000-100,000 nodes. For a cross-country route, it visits millions. At 58,000 route requests per second (5 billion per day), running Dijkstra on the full graph would require billions of node visits per second. Not feasible.
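The expanding-circle behavior is visible in a few lines of code. This is a minimal textbook Dijkstra (the adjacency-list encoding is this sketch's assumption, not anything from a real routing engine); it returns the number of settled nodes alongside the distance so the exploration cost is explicit:

```python
import heapq

def dijkstra(graph, source, target):
    """Textbook Dijkstra: settles nodes in order of distance from source.

    graph: {node: [(neighbor, weight), ...]} adjacency list.
    Returns (shortest distance, number of nodes settled).
    """
    dist = {source: 0}
    settled = set()
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)
        if u == target:
            return d, len(settled)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf"), len(settled)
```

On a continental graph, the settled-node count for a long route runs into the millions, which is exactly the wall described above.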
A* improves on Dijkstra by steering the search toward the destination using a heuristic (straight-line distance), reducing the search space by 3-5x. But the improvement is a constant factor. New York to Los Angeles still explores millions of nodes.
Contraction Hierarchies solved this in 2008. The insight: preprocess the graph offline to create a hierarchy where queries only need to explore ~500 nodes, regardless of route length. Trade hours of preprocessing for a 2,000x query speedup.
How Contraction Works
Basic Terms
| Term | Meaning | Example |
|---|---|---|
| Edge | A connection between nodes with a cost (distance or time) | A ---4--- B means cost(A,B) = 4 |
| Cost | Total weight of a path | A -> B -> C = 4 + 2 = 6 |
| Shortcut | A new edge added to replace a removed node's path | A -> B -> C (6) becomes A -> C (6) |
| Contraction | Removing a node from the graph (after adding required shortcuts) | Remove E, add D-F shortcut |
| Level | Order in which nodes are removed. Removed early = low level. Removed late = high level | E = level 0, G = level 5 |
Base Graph
A ---4--- B ---2--- C
|                   |
3                   5
|                   |
D ---1--- E ---1--- F ---3--- G
Preprocessing: Complete Step-by-Step
Step 1: Remove E (least important, only 1 shortcut needed)
Neighbors: D, F. Path through E: D -> E -> F = 1 + 1 = 2. No alternative exists. Add shortcut: D-F (weight 2).
A ---4--- B ---2--- C
|                   |
3                   5
|                   |
D --------2-------- F ---3--- G
Assign: E = level 0
Step 2: Remove D
Neighbors: A, F. Path through D: A -> D -> F = 3 + 2 = 5. Alternative: A -> B -> C -> F = 4 + 2 + 5 = 11 (worse). Add shortcut: A-F (weight 5).
A ---4--- B ---2--- C
 \                  |
  \                 5
   \                |
    \-----5-------- F ---3--- G
Assign: D = level 1
Step 3: Remove B
Neighbors: A, C. Path through B: A -> B -> C = 4 + 2 = 6. Alternative: A -> F -> C = 5 + 5 = 10 (worse). Add shortcut: A-C (weight 6).
A --------6-------- C
 \                  |
  \                 5
   \                |
    \-----5-------- F ---3--- G
Assign: B = level 2
Step 4: Remove C
Neighbors: A, F. Path through C: A -> C -> F = 6 + 5 = 11. Alternative: A -> F = 5 (better). No shortcut needed.
A --------5-------- F ---3--- G
Assign: C = level 3
Step 5: Remove F
Neighbors: A, G. Path through F: A -> F -> G = 5 + 3 = 8. No alternative exists. Add shortcut: A-G (weight 8).
A --------8-------- G
Assign: F = level 4
Step 6: Remove G
No neighbors left. Assign: G = level 5.
Final levels:
| Node | Level | Importance |
|---|---|---|
| E | 0 | Lowest (removed first) |
| D | 1 | |
| B | 2 | |
| C | 3 | |
| F | 4 | |
| G | 5 | Highest (removed last) |
Key insight: Shortcuts are reused in later steps. The D-F shortcut (from step 1) was used to create the A-F shortcut (in step 2). Shortcuts behave exactly like original edges during subsequent contractions.
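The contraction loop above can be sketched directly. This is a simplified version under two assumptions: the witness search is a plain Dijkstra on the remaining graph, and the node order is taken as given rather than computed from an importance heuristic:

```python
import heapq

def shortest_dist(adj, source, target, excluded):
    """Witness search: Dijkstra on the current graph, skipping `excluded`."""
    dist = {source: 0}
    pq = [(0, source)]
    seen = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == target:
            return d
        for v, w in adj.get(u, {}).items():
            if v == excluded:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

def contract(adj, order):
    """Contract nodes in `order`, adding shortcuts where no witness exists.

    adj: {node: {neighbor: weight}} -- mutated in place.
    Returns ({node: level}, {(a, b): middle_node}) for path unpacking.
    """
    level, middle = {}, {}
    for lvl, node in enumerate(order):
        nbrs = list(adj.get(node, {}).items())
        for i, (a, wa) in enumerate(nbrs):
            for b, wb in nbrs[i + 1:]:
                via = wa + wb
                # Only add a shortcut if no equally short path avoids `node`.
                if shortest_dist(adj, a, b, excluded=node) > via:
                    adj[a][b] = adj[b][a] = via
                    middle[(a, b)] = middle[(b, a)] = node
        for a, _ in nbrs:          # remove the node and its edges
            del adj[a][node]
        adj.pop(node, None)
        level[node] = lvl
    return level, middle
```

Running this on the base graph with the order E, D, B, C, F, G reproduces the walkthrough: shortcuts D-F, A-F, A-C, A-G appear, no shortcut is added when C is contracted, and shortcuts participate in later witness searches exactly like original edges.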
Query Phase
Goal: shortest path A -> G. Use original edges + all shortcuts + the "only move to higher-level nodes" rule. (A was never contracted, so it implicitly sits at the top of the hierarchy.)
Forward search (from A):
| Expand | Reach | Cost | Via |
|---|---|---|---|
| A | B | 4 | original edge |
| A | F | 5 | shortcut A-F |
| B | C | 6 | original edge |
| F | G | 8 | original edge |
Forward distances: A=0, B=4, F=5, C=6, G=8
Backward search (from G):
| Expand | Reach | Cost | Via |
|---|---|---|---|
| G | F | 3 | original edge |
| F | A | 8 | shortcut F-A |
Backward distances: G=0, F=3, A=8
Meeting point: Both searches reached F and A.
| Common node | Forward + Backward | Total |
|---|---|---|
| F | 5 + 3 | 8 |
| A | 0 + 8 | 8 |
Answer: cost = 8, path = A -> F -> G
Path unpacking: A -> F is a shortcut (from step 2). Unpack: A -> D -> F. D -> F is a shortcut (from step 1). Unpack: D -> E -> F. Full path: A -> D -> E -> F -> G (cost 3 + 1 + 1 + 3 = 8).
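The query and unpacking can be sketched with the hierarchy hard-coded from the walkthrough. One assumption: since A was never contracted, it is treated as the highest level, so a strictly upward forward search stops at A immediately while the backward search climbs G -> A via the shortcut; the searches meet at A with the same cost 8 shown in the meeting-point table:

```python
import heapq

# Original edges plus the shortcuts from steps 1-5.
EDGES = [
    ("A", "B", 4), ("B", "C", 2), ("A", "D", 3), ("C", "F", 5),
    ("D", "E", 1), ("E", "F", 1), ("F", "G", 3),                 # original
    ("D", "F", 2), ("A", "F", 5), ("A", "C", 6), ("A", "G", 8),  # shortcuts
]
# Contraction levels from the walkthrough; A (never contracted) is on top.
LEVEL = {"E": 0, "D": 1, "B": 2, "C": 3, "F": 4, "G": 5, "A": 6}
# Shortcut -> the contracted node it bypasses, for path unpacking.
MIDDLE = {("D", "F"): "E", ("A", "F"): "D", ("A", "C"): "B", ("A", "G"): "F"}

ADJ = {}
for a, b, w in EDGES:
    ADJ.setdefault(a, {})[b] = w
    ADJ.setdefault(b, {})[a] = w

def upward_search(source):
    """Dijkstra restricted to edges leading to higher-level nodes."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in ADJ[u].items():
            if LEVEL[v] > LEVEL[u] and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def query(s, t):
    """Bidirectional upward search; answer is the best common node."""
    fwd, bwd = upward_search(s), upward_search(t)
    meet = min(fwd.keys() & bwd.keys(), key=lambda n: fwd[n] + bwd[n])
    return fwd[meet] + bwd[meet], meet

def unpack(a, b):
    """Recursively expand shortcuts into the edges they bypass."""
    m = MIDDLE.get((a, b)) or MIDDLE.get((b, a))
    if m is None:
        return [a, b]
    return unpack(a, m) + unpack(m, b)[1:]
```

`query("A", "G")` returns cost 8 meeting at A, and `unpack("A", "G")` expands the A-G shortcut back to A -> D -> E -> F -> G.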
Why This Works
- Shortcuts preserve shortest distances exactly. No information is lost.
- The graph becomes "compressed." 7 nodes and 7 edges become 2 nodes and 1 shortcut for the query.
- Queries skip low-level nodes entirely. Forward and backward searches climb the hierarchy and meet quickly.
- On the global road network (500M nodes), this same process produces a hierarchy where queries explore ~500 nodes total, regardless of route length.
Performance Numbers
| Metric | Dijkstra | A* | Contraction Hierarchies |
|---|---|---|---|
| Preprocessing | None | None | ~10 min (Germany), 4-6 hours (global) |
| Nodes explored per query | ~2,000,000 | ~500,000 | ~500 |
| Query time | ~2 seconds | ~400ms | < 1ms |
| Speedup over Dijkstra | 1x | 3-5x | ~2,000x |
Real-Time Traffic with CH
The Problem: Static Weights Break
CH preprocessing bakes edge weights into shortcuts. When traffic changes, those shortcuts become wrong.
Concrete example:
During preprocessing, the edge A->B has weight 5 (minutes). The shortcut A->C is created with weight 7 (A->B + B->C = 5 + 2).
Normal: A ---5--- B ---2--- C → Shortcut A-C = 7
Traffic: A --10--- B ---2--- C → Real cost = 12, but shortcut still says 7
The shortcut says A->C costs 7 minutes, but the actual cost is now 12. The routing engine returns a wrong answer. At 4-6 hours per full CH reprocessing, rebuilding every time traffic changes is not viable.
Solution 1: CCH (Customizable Contraction Hierarchies)
The key idea: separate structure (graph shape, node ordering, which shortcuts exist) from weights (edge costs).
Step 1: Preprocess once (weekly). Build the hierarchy using only graph topology. Determine node ordering, create shortcut structure. Ignore edge weights entirely. This takes 4-6 hours for the global graph.
Step 2: Store shortcut structure. After preprocessing, we know that shortcut A->C exists and passes through B. But we do not fix its weight permanently.
Step 3: Real-time weight update (every 30 seconds). When traffic changes edge A->B from 5 to 10:
- Identify which shortcuts contain edge A->B
- Recompute only those shortcut weights: A->C = 10 + 2 = 12
- Propagate upward through the hierarchy to update any higher-level shortcuts that depend on A->C
Only the affected portion of the graph is updated. Most shortcuts are untouched.
Performance:
- Local update (a few thousand segments): ~100ms
- Full weight refresh (all shortcuts): ~2 seconds
- No full reprocessing needed
The core insight: CH structure (levels + shortcut edges) stays the same. Only weights change. Think of it as a skeleton (CH) with live blood flow (traffic updates).
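The structure/weight split can be sketched on the walkthrough graph. This is a toy full customization under one simplifying assumption: each shortcut's only "lower triangle" is its recorded middle node (real CCH enumerates all lower triangles), processed bottom-up so lower shortcuts feed higher ones:

```python
# Fixed structure from preprocessing: shortcut -> contracted middle node.
MIDDLE = {("D", "F"): "E", ("A", "F"): "D", ("A", "C"): "B", ("A", "G"): "F"}
LEVEL = {"E": 0, "D": 1, "B": 2, "C": 3, "F": 4, "G": 5, "A": 6}

def customize(base_weights):
    """Recompute every shortcut weight from current base edge weights,
    in contraction-level order so lower shortcuts feed higher ones."""
    w = dict(base_weights)  # (a, b) -> weight

    def get(a, b):
        return w.get((a, b), w.get((b, a), float("inf")))

    for (a, b), m in sorted(MIDDLE.items(), key=lambda kv: LEVEL[kv[1]]):
        w[(a, b)] = min(get(a, b), get(a, m) + get(m, b))
    return w
```

With the walkthrough's base weights this reproduces A-C = 6 and A-G = 8. Bumping A-B from 4 to 10 changes A-C to 12 while A-G stays 8: the structure never changes, only the new weights flow through it.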
Solution 2: Multiple CH Graphs
A simpler approach: precompute separate CH graphs for different traffic scenarios.
Graph 1: free-flow (3 AM weights)
Graph 2: rush hour (8 AM weights)
Graph 3: weekend (Saturday weights)
At runtime, select the graph matching current conditions. Simple to implement, but cannot handle real-time changes (only switches between pre-built scenarios). Limited to a small number of profiles.
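Runtime selection reduces to a lookup table. A minimal sketch (the profile names, time windows, and file paths are all hypothetical illustrations):

```python
from datetime import datetime

# Hypothetical profile table: each entry names a separately pre-built
# CH graph binary; first matching rule wins.
PROFILES = [
    ("weekend",  lambda t: t.weekday() >= 5,                      "ch_weekend.bin"),
    ("rush",     lambda t: 7 <= t.hour < 10 or 16 <= t.hour < 19, "ch_rush.bin"),
    ("freeflow", lambda t: True,                                  "ch_freeflow.bin"),
]

def select_graph(now):
    """Pick the pre-built CH graph matching the current time."""
    for name, matches, path in PROFILES:
        if matches(now):
            return name, path
```

The limitation from the text is visible in the code: an accident at 2 PM on a Tuesday still routes on `ch_freeflow.bin`, because nothing between the pre-built scenarios can change.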
Solution 3: CRP (Customizable Route Planning)
Developed by Microsoft Research, used in Bing Maps. The idea: split the graph into geographic cells and precompute shortest paths between cell boundaries.
A query becomes: route within source cell, cross cell boundaries using precomputed distances, route within destination cell.
Advantage: Supports multiple routing metrics (fastest, shortest, eco-friendly, truck-safe) from a single preprocessing pass. CCH needs separate weight sets per metric.
Tradeoff: Queries are slightly slower (2-5ms vs < 1ms for CH), but metric flexibility is much higher. For a platform supporting only driving/walking/cycling, CCH is the better fit. For logistics platforms with dozens of vehicle profiles, CRP is worth evaluating.
Comparison
| Approach | Weight updates | Metric flexibility | Query speed | Used by |
|---|---|---|---|---|
| Static CH | Requires full reprocessing | One metric per build | < 1ms | OSRM (default mode) |
| CCH | ~2s full, ~100ms local | One metric per weight set | < 1ms | Google Maps |
| Multi-CH | Switch between pre-built graphs | Limited scenarios | < 1ms | Simpler deployments |
| CRP | ~2s full | Multiple metrics from one build | 2-5ms | Bing Maps |
When to Rebuild vs When to Update
Full CH reprocessing required (topology changed):
- New roads added (flyover, highway extension)
- Intersections changed or deleted
- Road classification changed (residential upgraded to primary)
- Trigger: weekly batch pipeline
CCH weight update sufficient (only costs changed):
- Traffic congestion
- Speed limit correction from probe data
- Temporary road blockage
- Construction zone
- Trigger: every 30 seconds from Redis traffic data
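The rebuild-vs-update decision above is mechanical enough to encode. A sketch of a hypothetical change-event router (event type names are illustrative):

```python
# Topology changes wait for the weekly rebuild; cost-only changes take
# the fast CCH weight path. Event names here are illustrative.
TOPOLOGY = {"road_added", "intersection_changed", "road_class_changed"}
COST_ONLY = {"traffic", "speed_limit_correction",
             "temporary_blockage", "construction"}

def dispatch(event_type):
    if event_type in TOPOLOGY:
        return "queue_for_weekly_rebuild"
    if event_type in COST_ONLY:
        return "cch_weight_update"
    raise ValueError(f"unknown change event: {event_type}")
```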
Real-world effect:
- New road detected from satellite imagery: visible in routing in ~1 week (next rebuild)
- Traffic speed change: visible in routing in ~30 seconds (CCH update)
Production Deployment Pattern
The global graph (~500M nodes) is split into ~50 regional shards, each 5-12 GB. Western Europe, Eastern US, Southeast Asia, and so on. Each routing server memory-maps one or more shards.
Routes that cross region boundaries use a transit node overlay: precomputed shortest-path distances between border nodes, stitched with 3 CH queries. Total cross-region query time: ~3-5ms.
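The stitch itself is a minimum over border-node pairs. A sketch assuming the three inputs are already computed (a forward CH query in the origin shard, the precomputed overlay table, and a backward CH query in the destination shard):

```python
def cross_region_route(dist_to_border, overlay, dist_from_border):
    """Stitch a cross-region cost from three pieces:
    dist_to_border:   {border_node: cost} from the origin-shard CH query
    overlay:          {(b1, b2): cost} precomputed border-to-border table
    dist_from_border: {border_node: cost} from the destination-shard CH query
    """
    best = float("inf")
    for b1, d1 in dist_to_border.items():
        for b2, d2 in dist_from_border.items():
            via = overlay.get((b1, b2), float("inf"))
            best = min(best, d1 + via + d2)
    return best
```

Because border-node sets are small, this double loop over precomputed distances is cheap next to the two shard-local CH queries.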
Weekly rebuild pipeline. Every Sunday at 02:00 UTC: export updated road data from PostgreSQL (new roads from satellite detection, probe-confirmed candidates, OSM edits), recompute node importance ordering, run full CH preprocessing, validate against 10,000 known routes, canary deploy to 5% of servers, then full rollout. Old graph binary retained 24 hours for rollback.
Zero-downtime deployment. The new graph binary is uploaded to S3. Each routing server downloads it and swaps the mmap file pointer. In-flight queries complete against the old graph. New queries use the new graph. No server restart needed.
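The swap pattern can be sketched in a few lines (a simplified model, not production code: the "graph" is just the raw mapped bytes, and the download step is omitted):

```python
import mmap
import threading

class GraphStore:
    """Sketch of the zero-downtime swap: each query holds a reference to
    whichever mapping it grabbed; a deploy just rebinds the reference.
    Old mappings stay valid until their in-flight readers finish."""

    def __init__(self, path):
        self._lock = threading.Lock()
        self._graph = self._load(path)

    def _load(self, path):
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def swap(self, new_path):
        with self._lock:
            self._graph = self._load(new_path)  # new queries see this

    def current(self):
        return self._graph  # a query keeps this for its whole lifetime
```

In-flight queries keep reading the old mapping; new queries pick up the new one. No restart, no deserialization.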
CCH weight updates. Every 30 seconds, the routing server reads updated segment speeds from Redis (populated by the Flink traffic pipeline). Changed edge weights trigger bottom-up shortcut recalculation. This typically takes < 500ms for a batch of 10,000 changed segments.
When Things Go Wrong
Stale topology. A new road detected from satellite imagery does not appear in routing until the next weekly graph rebuild. Workaround: for critical changes (highway openings, road closures), apply CCH weight overrides immediately. Set a closed road's weight to infinity. Set a new road's weight to its estimated travel time. These overrides persist until the next full rebuild incorporates the topology change.
Weight update lag. Between CCH weight update cycles (30-second intervals), routing uses the previous cycle's weights. During sudden traffic changes (accident, concert ending), routes may be suboptimal for up to 30 seconds. For most users, this lag is imperceptible.
Memory pressure. Shortcuts can 2-3x the edge count. On a server loading multiple regional shards, total memory for graph data can reach 20-30 GB. Memory-mapped files mitigate this: the OS only pages in the actively queried portions. Cold regions (rural areas rarely queried) stay on disk. Monitor page fault rates to ensure hot data fits in physical RAM.
Key Points
- CH works by contracting nodes in order of increasing importance. Contracting a node means removing it from the graph and adding shortcut edges between its neighbors if the shortest path between those neighbors went through the removed node. The 'witness path' check determines whether a shortcut is needed: if an alternative path exists that is equally short or shorter, no shortcut is added.
- After preprocessing, every shortest path in the graph can be decomposed into an upward segment (source to a high-importance node) followed by a downward segment (high-importance node to destination). This is the hierarchical property. A query runs bidirectional search: forward from source going only upward, backward from destination going only upward. They meet at the highest-importance node on the path.
- The query explores approximately 500 nodes total, regardless of whether the route is 5 km or 5,000 km. This is because both searches climb the hierarchy quickly: residential streets to arterials to highways to motorways. The hierarchy converges at a small number of high-importance nodes (major highway interchanges) that form the backbone of long-distance routing.
- Customizable Contraction Hierarchies (CCH) solve the dynamic weight problem. CCH separates the hierarchy structure (node ordering, which shortcuts exist) from the edge weights. The structure is preprocessed once based on graph topology. When traffic changes edge weights, only the shortcut weights need recalculation. A bottom-up propagation updates all affected shortcuts in ~2 seconds for the full graph, or ~100ms for localized changes. This is how Google Maps reflects live traffic without re-running 4-6 hour preprocessing.
- Node importance ordering is the most critical preprocessing decision. The standard heuristic considers: edge difference (how many shortcuts would be added vs edges removed), contracted neighbors (avoid creating shortcut chains), and original edge count. Google's variant also incorporates traffic volume: high-traffic corridors get higher importance, which improves query performance for popular routes.
- The preprocessed graph is deployed as a memory-mapped binary file. Each routing server loads the file via mmap. The OS handles paging: frequently queried regions (urban areas) stay in RAM, rarely queried regions (rural) stay on disk until accessed. No deserialization step on server startup. A new graph version is deployed by writing a new file to S3 and swapping the mmap pointer, achieving zero-downtime deployment.
Common Mistakes
- ✗ Running Dijkstra or A* on the full graph at scale. On a 500M-node road network, Dijkstra explores ~2 million nodes per query (~2 seconds). At 58,000 queries/sec, that is 116 billion node visits per second. CH reduces this to ~500 nodes per query, making the same workload feasible on 6-18 servers.
- ✗ Treating the CH graph as immutable between weekly rebuilds. Traffic changes edge weights constantly. Without CCH weight updates, the routing engine uses stale weights and produces suboptimal routes. The fix: run CCH weight propagation every 30 seconds from live traffic data in Redis.
- ✗ Underestimating shortcut memory. Contracting a 500M-node graph can produce 1.5-2B shortcut edges on top of 1.2B original edges. Each shortcut stores a target node (8 bytes), weight (4 bytes), and middle node for unpacking (8 bytes). That is 30-40 GB of shortcuts alone. Regional sharding (5-12 GB per shard) keeps per-server memory manageable.
- ✗ Reprocessing the full graph for every change. A single new road does not require 4-6 hours of CH preprocessing. Minor topology changes can be handled by temporarily marking edges as blocked or open via CCH weight overrides (set weight to infinity for closed roads, normal weight for opened roads). Reserve full reprocessing for when cumulative topology changes exceed a threshold.
- ✗ Ignoring cross-region routing. When the graph is sharded by geographic region, routes that cross shard boundaries need a stitching mechanism. The transit node overlay pattern precomputes shortest-path distances between border nodes across regions. A cross-region route becomes 3 CH queries (origin shard + overlay + destination shard), adding ~3-5ms total.
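The shortcut-memory figure in the list above is easy to sanity-check with back-of-envelope arithmetic:

```python
# 20 bytes per shortcut entry, per the layout described in the text:
# 8-byte target node id + 4-byte weight + 8-byte middle node id.
BYTES_PER_SHORTCUT = 8 + 4 + 8

def shortcut_storage_gb(num_shortcuts):
    return num_shortcuts * BYTES_PER_SHORTCUT / 1e9

# 1.5-2B shortcuts -> 30-40 GB, matching the figure in the text.
low, high = shortcut_storage_gb(1.5e9), shortcut_storage_gb(2.0e9)
```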