Contraction Hierarchies
Architecture
Why Shortest Path Is Hard at Scale
Routing engines all hit the same wall. A road network is a graph: intersections are nodes, road segments are edges, and edge weights are travel times. The global road network has roughly 500 million nodes and 1.2 billion edges. Finding the shortest path between two arbitrary points requires an algorithm that does not explore every node.
Dijkstra's algorithm explores nodes in expanding concentric circles from the source. For a 50 km city route, it visits 50,000-100,000 nodes. For a cross-country route, it visits millions. At 58,000 route requests per second (5 billion per day), running Dijkstra on the full graph would require billions of node visits per second. Not feasible.
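The expanding-circle behavior is visible in a few lines of code. This is a minimal textbook Dijkstra (the adjacency-list encoding is this sketch's assumption, not anything from a real routing engine); it returns the number of settled nodes alongside the distance so the exploration cost is explicit:

```python
import heapq

def dijkstra(graph, source, target):
    """Textbook Dijkstra: settles nodes in order of distance from source.

    graph: {node: [(neighbor, weight), ...]} adjacency list.
    Returns (shortest distance, number of nodes settled).
    """
    dist = {source: 0}
    settled = set()
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)
        if u == target:
            return d, len(settled)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf"), len(settled)
```

On a continental graph, the settled-node count for a long route runs into the millions, which is exactly the wall described above.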
A* improves on Dijkstra by steering the search toward the destination using a heuristic (straight-line distance), reducing the search space by 3-5x. But the improvement is a constant factor. New York to Los Angeles still explores millions of nodes.
Contraction Hierarchies solved this in 2008. The insight: preprocess the graph offline to create a hierarchy where queries only need to explore ~500 nodes, regardless of route length. Trade hours of preprocessing for a 2,000x query speedup.
How Contraction Works
Basic Terms
| Term | Meaning | Example |
|---|---|---|
| Edge | A connection between nodes with a cost (distance or time) | A ---4--- B means cost(A,B) = 4 |
| Cost | Total weight of a path | A -> B -> C = 4 + 2 = 6 |
| Shortcut | A new edge added to replace a removed node's path | A -> B -> C (6) becomes A -> C (6) |
| Contraction | Removing a node from the graph (after adding required shortcuts) | Remove E, add D-F shortcut |
| Level | Order in which nodes are removed. Removed early = low level. Removed late = high level | E = level 0, G = level 5 |
Base Graph
A ---4--- B ---2--- C
|                   |
3                   5
|                   |
D ---1--- E ---1--- F ---3--- G
Preprocessing: Complete Step-by-Step
Step 1: Remove E (least important, only 1 shortcut needed)
Neighbors: D, F. Path through E: D -> E -> F = 1 + 1 = 2. No alternative exists. Add shortcut: D-F (weight 2).
A ---4--- B ---2--- C
|                   |
3                   5
|                   |
D --------2-------- F ---3--- G
Assign: E = level 0
Step 2: Remove D
Neighbors: A, F. Path through D: A -> D -> F = 3 + 2 = 5. Alternative: A -> B -> C -> F = 4 + 2 + 5 = 11 (worse). Add shortcut: A-F (weight 5).
A ---4--- B ---2--- C
 \                  |
  \                 5
   \                |
    \-----5-------- F ---3--- G
Assign: D = level 1
Step 3: Remove B
Neighbors: A, C. Path through B: A -> B -> C = 4 + 2 = 6. Alternative: A -> F -> C = 5 + 5 = 10 (worse). Add shortcut: A-C (weight 6).
A --------6-------- C
 \                  |
  \                 5
   \                |
    \-----5-------- F ---3--- G
Assign: B = level 2
Step 4: Remove C
Neighbors: A, F. Path through C: A -> C -> F = 6 + 5 = 11. Alternative: A -> F = 5 (better). No shortcut needed.
A --------5-------- F ---3--- G
Assign: C = level 3
Step 5: Remove F
Neighbors: A, G. Path through F: A -> F -> G = 5 + 3 = 8. No alternative exists. Add shortcut: A-G (weight 8).
A --------8-------- G
Assign: F = level 4
Step 6: Remove G
No neighbors left. Assign: G = level 5.
Final levels:
| Node | Level | Importance |
|---|---|---|
| E | 0 | Lowest (removed first) |
| D | 1 | |
| B | 2 | |
| C | 3 | |
| F | 4 | |
| G | 5 | Highest (removed last) |
Key insight: Shortcuts are reused in later steps. The D-F shortcut (from step 1) was used to create the A-F shortcut (in step 2). Shortcuts behave exactly like original edges during subsequent contractions.
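The contraction loop above can be sketched directly. This is a simplified version under two assumptions: the witness search is a plain Dijkstra on the remaining graph, and the node order is taken as given rather than computed from an importance heuristic:

```python
import heapq

def shortest_dist(adj, source, target, excluded):
    """Witness search: Dijkstra on the current graph, skipping `excluded`."""
    dist = {source: 0}
    pq = [(0, source)]
    seen = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == target:
            return d
        for v, w in adj.get(u, {}).items():
            if v == excluded:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

def contract(adj, order):
    """Contract nodes in `order`, adding shortcuts where no witness exists.

    adj: {node: {neighbor: weight}} -- mutated in place.
    Returns ({node: level}, {(a, b): middle_node}) for path unpacking.
    """
    level, middle = {}, {}
    for lvl, node in enumerate(order):
        nbrs = list(adj.get(node, {}).items())
        for i, (a, wa) in enumerate(nbrs):
            for b, wb in nbrs[i + 1:]:
                via = wa + wb
                # Only add a shortcut if no equally short path avoids `node`.
                if shortest_dist(adj, a, b, excluded=node) > via:
                    adj[a][b] = adj[b][a] = via
                    middle[(a, b)] = middle[(b, a)] = node
        for a, _ in nbrs:          # remove the node and its edges
            del adj[a][node]
        adj.pop(node, None)
        level[node] = lvl
    return level, middle
```

Running this on the base graph with the order E, D, B, C, F, G reproduces the walkthrough: shortcuts D-F, A-F, A-C, A-G appear, no shortcut is added when C is contracted, and shortcuts participate in later witness searches exactly like original edges.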
Query Phase
Goal: shortest path A -> G. Use original edges + all shortcuts + the "only move to higher-level nodes" rule. (A was never contracted, so it implicitly sits at the top of the hierarchy.)
Forward search (from A):
| Expand | Reach | Cost | Via |
|---|---|---|---|
| A | B | 4 | original edge |
| A | F | 5 | shortcut A-F |
| B | C | 6 | original edge |
| F | G | 8 | original edge |
Forward distances: A=0, B=4, F=5, C=6, G=8
Backward search (from G):
| Expand | Reach | Cost | Via |
|---|---|---|---|
| G | F | 3 | original edge |
| F | A | 8 | shortcut F-A |
Backward distances: G=0, F=3, A=8
Meeting point: Both searches reached F and A.
| Common node | Forward + Backward | Total |
|---|---|---|
| F | 5 + 3 | 8 |
| A | 0 + 8 | 8 |
Answer: cost = 8, path = A -> F -> G
Path unpacking: A -> F is a shortcut (from step 2). Unpack: A -> D -> F. D -> F is a shortcut (from step 1). Unpack: D -> E -> F. Full path: A -> D -> E -> F -> G (cost 3 + 1 + 1 + 3 = 8).
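The query and unpacking can be sketched with the hierarchy hard-coded from the walkthrough. One assumption: since A was never contracted, it is treated as the highest level, so a strictly upward forward search stops at A immediately while the backward search climbs G -> A via the shortcut; the searches meet at A with the same cost 8 shown in the meeting-point table:

```python
import heapq

# Original edges plus the shortcuts from steps 1-5.
EDGES = [
    ("A", "B", 4), ("B", "C", 2), ("A", "D", 3), ("C", "F", 5),
    ("D", "E", 1), ("E", "F", 1), ("F", "G", 3),                 # original
    ("D", "F", 2), ("A", "F", 5), ("A", "C", 6), ("A", "G", 8),  # shortcuts
]
# Contraction levels from the walkthrough; A (never contracted) is on top.
LEVEL = {"E": 0, "D": 1, "B": 2, "C": 3, "F": 4, "G": 5, "A": 6}
# Shortcut -> the contracted node it bypasses, for path unpacking.
MIDDLE = {("D", "F"): "E", ("A", "F"): "D", ("A", "C"): "B", ("A", "G"): "F"}

ADJ = {}
for a, b, w in EDGES:
    ADJ.setdefault(a, {})[b] = w
    ADJ.setdefault(b, {})[a] = w

def upward_search(source):
    """Dijkstra restricted to edges leading to higher-level nodes."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in ADJ[u].items():
            if LEVEL[v] > LEVEL[u] and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def query(s, t):
    """Bidirectional upward search; answer is the best common node."""
    fwd, bwd = upward_search(s), upward_search(t)
    meet = min(fwd.keys() & bwd.keys(), key=lambda n: fwd[n] + bwd[n])
    return fwd[meet] + bwd[meet], meet

def unpack(a, b):
    """Recursively expand shortcuts into the edges they bypass."""
    m = MIDDLE.get((a, b)) or MIDDLE.get((b, a))
    if m is None:
        return [a, b]
    return unpack(a, m) + unpack(m, b)[1:]
```

`query("A", "G")` returns cost 8 meeting at A, and `unpack("A", "G")` expands the A-G shortcut back to A -> D -> E -> F -> G.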
Why This Works
- Shortcuts preserve shortest distances exactly. No information is lost.
- The graph becomes "compressed." 7 nodes and 7 edges become 2 nodes and 1 shortcut for the query.
- Queries skip low-level nodes entirely. Forward and backward searches climb the hierarchy and meet quickly.
- On the global road network (500M nodes), this same process produces a hierarchy where queries explore ~500 nodes total, regardless of route length.
Performance Numbers
| Metric | Dijkstra | A* | Contraction Hierarchies |
|---|---|---|---|
| Preprocessing | None | None | ~10 min (Germany), 4-6 hours (global) |
| Nodes explored per query | ~2,000,000 | ~500,000 | ~500 |
| Query time | ~2 seconds | ~400ms | < 1ms |
| Speedup over Dijkstra | 1x | 3-5x | ~2,000x |
Real-Time Traffic with CH
The Problem: Static Weights Break
CH preprocessing bakes edge weights into shortcuts. When traffic changes, those shortcuts become wrong.
Concrete example:
During preprocessing, the edge A->B has weight 5 (minutes). The shortcut A->C is created with weight 7 (A->B + B->C = 5 + 2).
Normal: A ---5--- B ---2--- C → Shortcut A-C = 7
Traffic: A --10--- B ---2--- C → Real cost = 12, but shortcut still says 7
The shortcut says A->C costs 7 minutes, but the actual cost is now 12. The routing engine returns a wrong answer. At 4-6 hours per full CH reprocessing, rebuilding every time traffic changes is not viable.
Solution 1: CCH (Customizable Contraction Hierarchies)
The key idea: separate structure (graph shape, node ordering, which shortcuts exist) from weights (edge costs).
Step 1: Preprocess once (weekly). Build the hierarchy using only graph topology. Determine node ordering, create shortcut structure. Ignore edge weights entirely. This takes 4-6 hours for the global graph.
Step 2: Store shortcut structure. After preprocessing, we know that shortcut A->C exists and passes through B. But we do not fix its weight permanently.
Step 3: Real-time weight update (every 30 seconds). When traffic changes edge A->B from 5 to 10:
- Identify which shortcuts contain edge A->B
- Recompute only those shortcut weights: A->C = 10 + 2 = 12
- Propagate upward through the hierarchy to update any higher-level shortcuts that depend on A->C
Only the affected portion of the graph is updated. Most shortcuts are untouched.
Performance:
- Local update (a few thousand segments): ~100ms
- Full weight refresh (all shortcuts): ~2 seconds
- No full reprocessing needed
The core insight: CH structure (levels + shortcut edges) stays the same. Only weights change. Think of it as a skeleton (CH) with live blood flow (traffic updates).
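The structure/weight split can be sketched on the walkthrough graph. This is a toy full customization under one simplifying assumption: each shortcut's only "lower triangle" is its recorded middle node (real CCH enumerates all lower triangles), processed bottom-up so lower shortcuts feed higher ones:

```python
# Fixed structure from preprocessing: shortcut -> contracted middle node.
MIDDLE = {("D", "F"): "E", ("A", "F"): "D", ("A", "C"): "B", ("A", "G"): "F"}
LEVEL = {"E": 0, "D": 1, "B": 2, "C": 3, "F": 4, "G": 5, "A": 6}

def customize(base_weights):
    """Recompute every shortcut weight from current base edge weights,
    in contraction-level order so lower shortcuts feed higher ones."""
    w = dict(base_weights)  # (a, b) -> weight

    def get(a, b):
        return w.get((a, b), w.get((b, a), float("inf")))

    for (a, b), m in sorted(MIDDLE.items(), key=lambda kv: LEVEL[kv[1]]):
        w[(a, b)] = min(get(a, b), get(a, m) + get(m, b))
    return w
```

With the walkthrough's base weights this reproduces A-C = 6 and A-G = 8. Bumping A-B from 4 to 10 changes A-C to 12 while A-G stays 8: the structure never changes, only the new weights flow through it.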
Solution 2: Multiple CH Graphs
A simpler approach: precompute separate CH graphs for different traffic scenarios.
Graph 1: free-flow (3 AM weights)
Graph 2: rush hour (8 AM weights)
Graph 3: weekend (Saturday weights)
At runtime, select the graph matching current conditions. Simple to implement, but cannot handle real-time changes (only switches between pre-built scenarios). Limited to a small number of profiles.
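Runtime selection reduces to a lookup table. A minimal sketch (the profile names, time windows, and file paths are all hypothetical illustrations):

```python
from datetime import datetime

# Hypothetical profile table: each entry names a separately pre-built
# CH graph binary; first matching rule wins.
PROFILES = [
    ("weekend",  lambda t: t.weekday() >= 5,                      "ch_weekend.bin"),
    ("rush",     lambda t: 7 <= t.hour < 10 or 16 <= t.hour < 19, "ch_rush.bin"),
    ("freeflow", lambda t: True,                                  "ch_freeflow.bin"),
]

def select_graph(now):
    """Pick the pre-built CH graph matching the current time."""
    for name, matches, path in PROFILES:
        if matches(now):
            return name, path
```

The limitation from the text is visible in the code: an accident at 2 PM on a Tuesday still routes on `ch_freeflow.bin`, because nothing between the pre-built scenarios can change.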
Solution 3: CRP (Customizable Route Planning)
Developed by Microsoft Research, used in Bing Maps. The idea: split the graph into geographic cells and precompute shortest paths between cell boundaries.
A query becomes: route within source cell, cross cell boundaries using precomputed distances, route within destination cell.
Advantage: Supports multiple routing metrics (fastest, shortest, eco-friendly, truck-safe) from a single preprocessing pass. CCH needs separate weight sets per metric.
Tradeoff: Queries are slightly slower (2-5ms vs < 1ms for CH), but metric flexibility is much higher. For a platform supporting only driving/walking/cycling, CCH is the better fit. For logistics platforms with dozens of vehicle profiles, CRP is worth evaluating.
Comparison
| Approach | Weight updates | Metric flexibility | Query speed | Used by |
|---|---|---|---|---|
| Static CH | Requires full reprocessing | One metric per build | < 1ms | OSRM (default mode) |
| CCH | ~2s full, ~100ms local | One metric per weight set | < 1ms | Google Maps |
| Multi-CH | Switch between pre-built graphs | Limited scenarios | < 1ms | Simpler deployments |
| CRP | ~2s full | Multiple metrics from one build | 2-5ms | Bing Maps |
When to Rebuild vs When to Update
Full CH reprocessing required (topology changed):
- New roads added (flyover, highway extension)
- Intersections changed or deleted
- Road classification changed (residential upgraded to primary)
- Trigger: weekly batch pipeline
CCH weight update sufficient (only costs changed):
- Traffic congestion
- Speed limit correction from probe data
- Temporary road blockage
- Construction zone
- Trigger: every 30 seconds from Redis traffic data
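The rebuild-vs-update decision above is mechanical enough to encode. A sketch of a hypothetical change-event router (event type names are illustrative):

```python
# Topology changes wait for the weekly rebuild; cost-only changes take
# the fast CCH weight path. Event names here are illustrative.
TOPOLOGY = {"road_added", "intersection_changed", "road_class_changed"}
COST_ONLY = {"traffic", "speed_limit_correction",
             "temporary_blockage", "construction"}

def dispatch(event_type):
    if event_type in TOPOLOGY:
        return "queue_for_weekly_rebuild"
    if event_type in COST_ONLY:
        return "cch_weight_update"
    raise ValueError(f"unknown change event: {event_type}")
```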
Real-world effect:
- New road detected from satellite imagery: visible in routing in ~1 week (next rebuild)
- Traffic speed change: visible in routing in ~30 seconds (CCH update)
Production Deployment Pattern
The global graph (~500M nodes) is split into ~50 regional shards, each 5-12 GB. Western Europe, Eastern US, Southeast Asia, and so on. Each routing server memory-maps one or more shards.
Routes that cross region boundaries use a transit node overlay: precomputed shortest-path distances between border nodes, stitched with 3 CH queries. Total cross-region query time: ~3-5ms.
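The stitch itself is a minimum over border-node pairs. A sketch assuming the three inputs are already computed (a forward CH query in the origin shard, the precomputed overlay table, and a backward CH query in the destination shard):

```python
def cross_region_route(dist_to_border, overlay, dist_from_border):
    """Stitch a cross-region cost from three pieces:
    dist_to_border:   {border_node: cost} from the origin-shard CH query
    overlay:          {(b1, b2): cost} precomputed border-to-border table
    dist_from_border: {border_node: cost} from the destination-shard CH query
    """
    best = float("inf")
    for b1, d1 in dist_to_border.items():
        for b2, d2 in dist_from_border.items():
            via = overlay.get((b1, b2), float("inf"))
            best = min(best, d1 + via + d2)
    return best
```

Because border-node sets are small, this double loop over precomputed distances is cheap next to the two shard-local CH queries.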
Weekly rebuild pipeline. Every Sunday at 02:00 UTC: export updated road data from PostgreSQL (new roads from satellite detection, probe-confirmed candidates, OSM edits), recompute node importance ordering, run full CH preprocessing, validate against 10,000 known routes, canary deploy to 5% of servers, then full rollout. Old graph binary retained 24 hours for rollback.
Zero-downtime deployment. The new graph binary is uploaded to S3. Each routing server downloads it and swaps the mmap file pointer. In-flight queries complete against the old graph. New queries use the new graph. No server restart needed.
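The swap pattern can be sketched in a few lines (a simplified model, not production code: the "graph" is just the raw mapped bytes, and the download step is omitted):

```python
import mmap
import threading

class GraphStore:
    """Sketch of the zero-downtime swap: each query holds a reference to
    whichever mapping it grabbed; a deploy just rebinds the reference.
    Old mappings stay valid until their in-flight readers finish."""

    def __init__(self, path):
        self._lock = threading.Lock()
        self._graph = self._load(path)

    def _load(self, path):
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def swap(self, new_path):
        with self._lock:
            self._graph = self._load(new_path)  # new queries see this

    def current(self):
        return self._graph  # a query keeps this for its whole lifetime
```

In-flight queries keep reading the old mapping; new queries pick up the new one. No restart, no deserialization.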
CCH weight updates. Every 30 seconds, the routing server reads updated segment speeds from Redis (populated by the Flink traffic pipeline). Changed edge weights trigger bottom-up shortcut recalculation. This typically takes < 500ms for a batch of 10,000 changed segments.
When Things Go Wrong
Stale topology. A new road detected from satellite imagery does not appear in routing until the next weekly graph rebuild. Workaround: for critical changes (highway openings, road closures), apply CCH weight overrides immediately. Set a closed road's weight to infinity. Set a new road's weight to its estimated travel time. These overrides persist until the next full rebuild incorporates the topology change.
Weight update lag. Between CCH weight update cycles (30-second intervals), routing uses the previous cycle's weights. During sudden traffic changes (accident, concert ending), routes may be suboptimal for up to 30 seconds. For most users, this lag is imperceptible.
Memory pressure. Shortcuts can 2-3x the edge count. On a server loading multiple regional shards, total memory for graph data can reach 20-30 GB. Memory-mapped files mitigate this: the OS only pages in the actively queried portions. Cold regions (rural areas rarely queried) stay on disk. Monitor page fault rates to ensure hot data fits in physical RAM.
Key Points
- CH works by contracting nodes in order of increasing importance. Contracting a node means removing it from the graph and adding shortcut edges between its neighbors if the shortest path between those neighbors went through the removed node. The 'witness path' check determines whether a shortcut is needed: if an alternative path exists that is equally short or shorter, no shortcut is added.
- After preprocessing, every shortest path in the graph can be decomposed into an upward segment (source to a high-importance node) followed by a downward segment (high-importance node to destination). This is the hierarchical property. A query runs bidirectional search: forward from source going only upward, backward from destination going only upward. They meet at the highest-importance node on the path.
- The query explores approximately 500 nodes total, regardless of whether the route is 5 km or 5,000 km. This is because both searches climb the hierarchy quickly: residential streets to arterials to highways to motorways. The hierarchy converges at a small number of high-importance nodes (major highway interchanges) that form the backbone of long-distance routing.
- Customizable Contraction Hierarchies (CCH) solve the dynamic weight problem. CCH separates the hierarchy structure (node ordering, which shortcuts exist) from the edge weights. The structure is preprocessed once based on graph topology. When traffic changes edge weights, only the shortcut weights need recalculation. A bottom-up propagation updates all affected shortcuts in ~2 seconds for the full graph, or ~100ms for localized changes. This is how Google Maps reflects live traffic without re-running 4-6 hour preprocessing.
- Node importance ordering is the most critical preprocessing decision. The standard heuristic considers: edge difference (how many shortcuts would be added vs edges removed), contracted neighbors (avoid creating shortcut chains), and original edge count. Google's variant also incorporates traffic volume: high-traffic corridors get higher importance, which improves query performance for popular routes.
- The preprocessed graph is deployed as a memory-mapped binary file. Each routing server loads the file via mmap. The OS handles paging: frequently queried regions (urban areas) stay in RAM, rarely queried regions (rural) stay on disk until accessed. No deserialization step on server startup. A new graph version is deployed by writing a new file to S3 and swapping the mmap pointer, achieving zero-downtime deployment.
Common Mistakes
- ✗ Running Dijkstra or A* on the full graph at scale. On a 500M-node road network, Dijkstra explores ~2 million nodes per query (~2 seconds). At 58,000 queries/sec, that is 116 billion node visits per second. CH reduces this to ~500 nodes per query, making the same workload feasible on 6-18 servers.
- ✗ Treating the CH graph as immutable between weekly rebuilds. Traffic changes edge weights constantly. Without CCH weight updates, the routing engine uses stale weights and produces suboptimal routes. The fix: run CCH weight propagation every 30 seconds from live traffic data in Redis.
- ✗ Underestimating shortcut memory. Contracting a 500M-node graph can produce 1.5-2B shortcut edges on top of 1.2B original edges. Each shortcut stores a target node (8 bytes), weight (4 bytes), and middle node for unpacking (8 bytes). That is 30-40 GB of shortcuts alone. Regional sharding (5-12 GB per shard) keeps per-server memory manageable.
- ✗ Reprocessing the full graph for every change. A single new road does not require 4-6 hours of CH preprocessing. Minor topology changes can be handled by temporarily marking edges as blocked or open via CCH weight overrides (set weight to infinity for closed roads, normal weight for opened roads). Reserve full reprocessing for when cumulative topology changes exceed a threshold.
- ✗ Ignoring cross-region routing. When the graph is sharded by geographic region, routes that cross shard boundaries need a stitching mechanism. The transit node overlay pattern precomputes shortest-path distances between border nodes across regions. A cross-region route becomes 3 CH queries (origin shard + overlay + destination shard), adding ~3-5ms total.
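The shortcut-memory figure in the list above is easy to sanity-check with back-of-envelope arithmetic:

```python
# 20 bytes per shortcut entry, per the layout described in the text:
# 8-byte target node id + 4-byte weight + 8-byte middle node id.
BYTES_PER_SHORTCUT = 8 + 4 + 8

def shortcut_storage_gb(num_shortcuts):
    return num_shortcuts * BYTES_PER_SHORTCUT / 1e9

# 1.5-2B shortcuts -> 30-40 GB, matching the figure in the text.
low, high = shortcut_storage_gb(1.5e9), shortcut_storage_gb(2.0e9)
```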