Sweep Line (Event Sort)
The Intuition
A lot of interval problems look complicated until each interval is broken into two events: one for the start, one for the end. Once that translation happens, every interval-overlap question collapses into a single sweep over a sorted list of events with a running counter.
Take the canonical example: minimum number of meeting rooms. Each meeting becomes two events. The start emits +1 because a new meeting needs a room. The end emits -1 because a room frees up. Sort all events by time. Walk them in order. The running sum is the number of concurrent meetings at that point in time. The maximum value of that running sum is the answer.
The tie-break at equal timestamps is the only subtle part. If a meeting ends at 5 and another meeting starts at 5, do they share a room? No: the first room is freed before the second meeting begins. To enforce that, the -1 event must sort before the +1 event at the same timestamp. Tuple sorting (time first, then delta) does this automatically because -1 < +1. If the problem's semantics are different (intervals with shared endpoints DO overlap), reverse the tie-break.
The code below works through three LeetCode problems in this family.
Meeting Rooms II (LC 253). Given a list of meetings as [start, end] intervals, return the minimum number of rooms needed to schedule all of them without conflict. Build start/end events, sort, walk, track the maximum running count.
Car Pooling (LC 1094). Given a list of trips where each trip is [numPassengers, from, to] and a vehicle capacity, return whether the driver can complete every trip without exceeding capacity at any moment. Build pick-up/drop-off events with weights equal to the passenger count, sort, walk, return false the first time the running total exceeds capacity.
Number of Flowers in Full Bloom (LC 2251). Given a set of flower bloom intervals (each is [start, end] inclusive on both ends) and a list of query days, return how many flowers are blooming on each query day. The first two problems are pure single-pass sweeps. This one is a per-query variant: precompute snapshots of the running count at every event time, then binary-search each query day to find the latest snapshot at or before that day.
The same machinery handles a lot of variants:
- Car pooling: events are passenger pick-ups (+num) and drop-offs (-num). Capacity check at each event.
- Maximum population year: each person contributes (birth, +1) and (death, -1) events. Track running population.
- Number of flowers in full bloom for each query day: precompute prefix counts at every event time, then binary-search each query.
- Skyline problem: each building emits a "tallest height changed here" event. The state is a multiset of active heights, peeked with a heap.
The technique is O(n log n) because of the sort. The sweep itself is linear. For very large inputs with bounded coordinate ranges, a difference array can replace the sweep for true O(n + range) time, but the sweep is the universal solution.
When to use
- Intervals with overlap, capacity, or concurrency questions
- Each interval has a clean start and end and the question concerns "what is true at every point in time"
- Need to answer per-query questions about a fixed set of intervals (combine with binary search)
- Multiple criteria at the same timestamp (tie-break order encodes semantics)
When NOT to use
- Coordinates are very small and dense; a difference array is faster (no sort needed)
- Need to actually merge intervals into a smaller list (use the merge intervals template instead)
- Question asks about a single interval at a time (sweep is overkill)
- Need to track per-interval identity (sweep loses that; carry an id field in the event struct if needed)
Pattern Recognition
- "minimum number of rooms / cars / arrows / runways"
- "at any moment, what is the maximum / number of active X"
- "given queries about specific times, count active intervals"
- "skyline" or "horizon" of overlapping rectangles
Variations
- Difference array (no sort needed): When timestamps are integers in a small range, allocate
delta[0..max_time], increment at starts and decrement at ends, then prefix-sum. O(n + range) time, O(range) space. - With heap for state: When the state is more than a counter (e.g. set of heights for skyline), use a max-heap to track active items. Pop expired entries lazily as the sweep moves forward.
- Two-dimensional sweep: Sort by x-coordinate, sweep along y. Used in computational geometry.
- Offline answering of queries: Sort queries together with events by time. Each query is processed when its time is reached during the sweep.
Edge Cases
- Empty input (return 0 or true depending on the problem)
- Single interval (max = 1, answer is trivially derivable)
- All intervals share the same start or end point (tie-break correctness is tested)
- Intervals where start equals end (zero-length; usually contributes neither, but check the spec)
- Negative coordinates (the sort still works; difference array does not without offset)
Practice Problems
- [Medium] Meeting Rooms II (premium)
- [Medium] Car Pooling
- [Medium] Maximum Population Year
- [Hard] The Skyline Problem
- [Hard] Number of Flowers in Full Bloom
Key Points
- •Convert each interval into two events (start with delta +1, end with delta -1)
- •Sort all events by time, with smaller delta breaking ties to handle endpoint semantics
- •Walk events in order, maintain a running counter, and track the maximum
- •Choice of tie-breaker decides whether intervals with shared endpoints overlap or not
- •The same shape generalizes from intervals to skylines, calendars, and traffic
Code Template
1 # Meeting Rooms II (LC 253). Minimum number of rooms required to host all meetings.
2 def min_meeting_rooms(intervals):
3 events = []
4 for start, end in intervals:
5 events.append((start, 1)) # +1 room when a meeting starts
6 events.append((end, -1)) # -1 room when a meeting ends
7 # Tuple sort: at the same timestamp, -1 (end) sorts before +1 (start),
8 # so a meeting ending at 5 frees the room before a meeting starting at 5.
9 events.sort()
10 rooms = 0
11 max_rooms = 0
12 for _, delta in events:
13 rooms += delta
14 if rooms > max_rooms:
15 max_rooms = rooms
16 return max_rooms
17
18 # Car Pooling (LC 1094). Can the driver complete every trip without exceeding capacity?
19 def car_pooling(trips, capacity):
20 events = []
21 for num, start, end in trips:
22 events.append((start, num)) # passengers get on
23 events.append((end, -num)) # passengers get off (before any new pick-up at the same point)
24 events.sort()
25 passengers = 0
26 for _, delta in events:
27 passengers += delta
28 if passengers > capacity:
29 return False
30 return True
31
32 # Number of Flowers in Full Bloom (LC 2251). For each query day, count flowers in bloom.
33 def full_bloom_flowers(flowers, people):
34 events = []
35 for start, end in flowers:
36 events.append((start, 1))
37 events.append((end + 1, -1)) # +1 because end is inclusive
38 events.sort()
39 # Precompute (time, count_after_processing_all_events_up_to_time)
40 times = []
41 counts = []
42 running = 0
43 i = 0
44 while i < len(events):
45 t = events[i][0]
46 while i < len(events) and events[i][0] == t:
47 running += events[i][1]
48 i += 1
49 times.append(t)
50 counts.append(running)
51 # Binary search each query into times to find the latest event <= query day.
52 from bisect import bisect_right
53 result = []
54 for p in people:
55 idx = bisect_right(times, p) - 1
56 result.append(counts[idx] if idx >= 0 else 0)
57 return resultCommon Mistakes
- Wrong tie-break order at shared endpoints (treating "ends at 5, starts at 5" as overlap)
- Forgetting that the running counter is what matters, not the event count
- Sorting only by time and ignoring the delta when ties exist
- Storing intervals instead of events and trying to scan with two pointers (more complex than sweep)