Hash Map & Frequency Counter
The Intuition
A hash map stores key-value pairs and answers "do you have this key" in constant time on average. That single property converts a large class of problems from O(n^2) to O(n). The trick is identifying what to use as the key and what to store as the value.
Three patterns cover most interview problems. The first is complement lookup. For two-sum, instead of checking every pair you scan once and ask "have I already seen the value that completes the target?" If yes, return the stored index alongside the current one. If no, record the current value and keep scanning.
The second is frequency counting. Walk the input once and tally how many times each element appears. Once you have counts, downstream questions like "first non-repeating character" or "most common element" become a second linear scan.
The third is grouping by canonical form. If the question asks you to bucket items that are equivalent under some transformation (anagrams under letter sorting, points on the same line under slope, words under stem), compute the canonical form once per item and use it as a hash key.
The space cost is O(n) and the trade is almost always worth it. The exceptions are when the universe of keys is very large and sparse (use a hash set instead of a map), or when memory is tight and a sorted array with binary search would do.
Pair or triple sum, frequency counting, deduplication, grouping by equivalence, sliding-window character counts, prefix-sum lookups, and any problem where O(1) "have I seen X" turns a quadratic loop into a linear one.
- Order of insertion or sorting matters and you cannot afford to track it separately
- Keys are floating point and equality is fuzzy
- Memory is constrained and the input is already sorted (binary search wins)
Variations:
- Hash set: When you only need membership, not a stored value. Smaller memory footprint.
- Hash map of arrays: Group multiple items under one key (graph adjacency, anagram buckets).
- Hash map with prefix sum: Map a running sum to its earliest index for "subarray with sum K" problems.
- empty input
- duplicates that should map to the same key
- single element
Key Points
- •Hash maps give O(1) average lookup, insert, and delete
- •Convert nested loops to single pass by remembering what you have already seen
- •For pair-sum problems, store complement values keyed by index
- •For frequency problems, key by element and store the count
- •For grouping problems (anagrams, equivalents), key by a canonical form
Code Template
1 def two_sum(nums, target):
2 """Return indices of the two numbers that sum to target."""
3 seen = {}
4 for i, n in enumerate(nums):
5 complement = target - n
6 if complement in seen:
7 return [seen[complement], i]
8 seen[n] = i
9 return []
10
11 def char_frequency(s):
12 """Build a character frequency map."""
13 freq = {}
14 for c in s:
15 freq[c] = freq.get(c, 0) + 1
16 return freq
17
18 def group_anagrams(words):
19 """Group words by sorted-letter signature."""
20 groups = {}
21 for w in words:
22 key = "".join(sorted(w))
23 groups.setdefault(key, []).append(w)
24 return list(groups.values())Common Mistakes
- Iterating the map while mutating it (use a copy or collect keys first)
- Hashing mutable objects whose hash changes after insertion
- Returning the wrong key vs value when the answer expects indices
- Forgetting that dict ordering is insertion order in Python 3.7+ but not relied on across languages