Greedy (Intervals & Scheduling)
The Intuition
Imagine you have a full day of back-to-back conference talks, and you want to attend as many as possible. Each talk has a start time and an end time, and some of them overlap. You can't be in two rooms at once, so you need a strategy for picking which talks to attend.
Here's the trick: sort all the talks by when they end, earliest first. Pick the talk that ends soonest. Once it's done, scan forward and grab the next talk whose start time is at or after the previous talk's end time. Repeat. You're always choosing the talk that finishes earliest and frees you up as soon as possible for whatever comes next.
Why does finishing early beat every other strategy? Suppose you picked a longer talk instead. It would block out more of your afternoon, potentially covering two or three shorter talks you could have attended. By always freeing up your schedule as early as possible, you leave maximum room for future talks. No alternative choice can squeeze in more talks than this greedy approach. You can prove it formally with an "exchange argument": swapping any talk in an optimal solution for an earlier-ending one never makes things worse.
For the "minimum meeting rooms" variant, you flip the question. Instead of maximizing how many you attend, you're asking how many rooms you need to host all meetings. Sort by start time, and use a heap to track when each room becomes free. When a new meeting starts, check if any room is available (its current meeting ended). If yes, reuse it. If not, open a new room. The heap size at the end is your answer.
Activity selection, meeting rooms, job scheduling, interval partitioning, minimum platforms, task assignment, or any problem where a locally optimal choice provably leads to a global optimum.
- Intervals have dependencies/weights requiring optimization (use DP)
- Need to find all possible valid schedules (use backtracking)
- Intervals are not sortable by a clear criterion
Variations:
- Activity selection: Sort by end time, pick non-overlapping -- maximizes count.
- Meeting rooms: Sort by start time, use a min-heap of end times -- minimizes rooms.
- Erase overlaps: Sort by end time, count overlapping intervals to remove.
- Weighted scheduling: Combine greedy with DP when intervals have different values.
- No intervals overlap at all
- All intervals overlap with each other
- Single interval in the input
Key Points
- •Sort by end time for maximum non-overlapping intervals
- •Sort by start time for minimum meeting rooms
- •Greedy choice: always pick the option that leaves the most room for future choices
- •Proving greedy correctness: exchange argument or stays-ahead argument
- •Often combined with sorting + heap for scheduling problems
Code Template
1 def max_non_overlapping(intervals):
2 """Maximum number of non-overlapping intervals (activity selection)."""
3 intervals.sort(key=lambda x: x[1]) # sort by end time
4 count = 0
5 end = float('-inf')
6
7 for start, finish in intervals:
8 if start >= end:
9 count += 1
10 end = finish
11
12 return count
13
14 def min_meeting_rooms(intervals):
15 """Minimum meeting rooms needed (= max overlap)."""
16 import heapq
17 intervals.sort(key=lambda x: x[0]) # sort by start time
18 heap = [] # tracks end times of active meetings
19
20 for start, end in intervals:
21 if heap and heap[0] <= start:
22 heapq.heappop(heap) # reuse room
23 heapq.heappush(heap, end)
24
25 return len(heap)
26
27 def erase_overlap_intervals(intervals):
28 """Minimum intervals to remove to make rest non-overlapping."""
29 intervals.sort(key=lambda x: x[1])
30 count = 0
31 end = float('-inf')
32
33 for start, finish in intervals:
34 if start >= end:
35 end = finish
36 else:
37 count += 1 # remove this interval
38
39 return countCommon Mistakes
- Sorting by start time when you should sort by end time (or vice versa)
- Not proving the greedy choice is optimal -- greedy doesn't always work
- Confusing activity selection (max count) with interval scheduling (min rooms)