Cuckoo Hashing & Cuckoo Filters
The Bloom Filter Gap
Bloom filters are one of the most widely deployed data structures in distributed systems. But they have a hard limitation: you cannot delete elements. If you clear the bits for element A, you might corrupt the membership information for element B that shares those bit positions. You get false negatives, which breaks the core guarantee.
Counting Bloom filters work around this by using counters instead of bits, but they use 3-4x more memory. And counter overflow is a real risk under adversarial or high-cardinality workloads.
Cuckoo filters, proposed by Fan, Andersen, Kaminsky, and Mitzenmacher in 2014, solve deletion cleanly. They support insert, lookup, and delete with better space efficiency than counting Bloom filters and comparable performance to standard Bloom filters.
Cuckoo Hashing: The Foundation
Before understanding cuckoo filters, you need cuckoo hashing.
Standard hash tables resolve collisions by chaining (linked lists in each bucket) or open addressing (probe to the next empty slot). Both degrade as the table fills up.
Cuckoo hashing uses two hash functions and gives each element two possible homes. To insert element X:
- Compute both positions: pos1 = h1(x) and pos2 = h2(x).
- If either position is empty, put X there.
- If both are occupied, pick one (say pos1), evict the element Y that is currently there, and insert X at pos1.
- Now Y needs a new home. Compute Y's other position and try to place it there.
- If that position is occupied, evict again.
- Repeat until everyone has a home or you hit a maximum displacement count (indicating the table is too full and needs resizing).
Lookup is O(1) in the worst case: check both positions. If the element is not in either, it is not in the table. No chains to walk, no probing sequences.
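The insert and lookup loops above fit in a short Python sketch. Everything here is illustrative, not any particular library's implementation: class and helper names are invented, and salted calls to the built-in hash stand in for two independent hash functions.

```python
class CuckooHashTable:
    """Two hash functions, one entry per slot, relocation on collision."""

    MAX_KICKS = 500  # displacement budget before declaring the table too full

    def __init__(self, size=16):
        self.size = size
        self.slots = [None] * size

    def _h1(self, key):
        return hash(("h1", key)) % self.size  # salted stand-in for a real hash

    def _h2(self, key):
        return hash(("h2", key)) % self.size

    def insert(self, key):
        p1, p2 = self._h1(key), self._h2(key)
        # If either position is empty, place the key there.
        if self.slots[p1] is None:
            self.slots[p1] = key
            return True
        if self.slots[p2] is None:
            self.slots[p2] = key
            return True
        # Both occupied: evict from p1 and chase the displaced key around.
        pos = p1
        for _ in range(self.MAX_KICKS):
            key, self.slots[pos] = self.slots[pos], key
            q1, q2 = self._h1(key), self._h2(key)
            pos = q2 if pos == q1 else q1  # the evicted key's other home
            if self.slots[pos] is None:
                self.slots[pos] = key
                return True
        return False  # too many displacements: table is too full

    def lookup(self, key):
        # Worst case is exactly two probes: no chains, no probe sequences.
        return self.slots[self._h1(key)] == key or self.slots[self._h2(key)] == key
```

A production table would grow and rehash when `insert` returns `False`.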
The name "cuckoo" comes from the cuckoo bird, which lays its eggs in other birds' nests and evicts the existing eggs. Same energy.
From Cuckoo Hash Table to Cuckoo Filter
A cuckoo hash table stores the full elements. A cuckoo filter stores only fingerprints (short hashes) of elements. This is the same conceptual leap as going from a hash set to a Bloom filter: trade exact membership for space-efficient probabilistic membership.
The trick is computing the second bucket location from just the fingerprint, without storing the original element. Cuckoo filters use partial-key cuckoo hashing:
bucket1 = hash(element)
bucket2 = bucket1 XOR hash(fingerprint)
The XOR trick makes the operation reversible. Given an element in bucket1 with fingerprint fp, you can compute bucket2. Given the same element in bucket2, you can compute bucket1 by XORing with the same hash(fingerprint). This means you can relocate a fingerprint during eviction without knowing the original element.
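A few lines make the reversibility concrete. All parameters and names here are illustrative (8-bit fingerprints, 256 buckets, salted built-in hash as a stand-in for a real hash function); the bucket count is a power of two so masking keeps the XOR result in range.

```python
NUM_BUCKETS = 256  # power of two, so masking keeps the XOR in range

def fingerprint(element):
    return (hash(("fp", element)) % 255) + 1  # 8-bit and nonzero by convention

def bucket1(element):
    return hash(("bucket", element)) % NUM_BUCKETS

def alt_bucket(bucket, fp):
    # XOR is its own inverse, so this maps bucket1 -> bucket2 AND
    # bucket2 -> bucket1, without ever consulting the original element.
    return (bucket ^ hash(("fp-hash", fp))) & (NUM_BUCKETS - 1)

b1 = bucket1("example-key")
fp = fingerprint("example-key")
b2 = alt_bucket(b1, fp)
assert alt_bucket(b2, fp) == b1  # the relocation round-trips
```

This is exactly what eviction relies on: given only a stored fingerprint and the bucket it currently sits in, the alternative bucket is computable.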
Insert
- Compute the fingerprint f = fp(element), a short hash of the element (distinct from the hash used for bucket placement).
- Compute bucket1 = hash(element) and bucket2 = bucket1 XOR hash(f).
- If bucket1 or bucket2 has an empty slot, store f there.
- If both are full, pick one, evict a random existing fingerprint, and relocate it to its alternative bucket (computed via XOR).
- Repeat relocations until an empty slot is found or a maximum number of kicks is reached (typically 500).
Lookup
- Compute the fingerprint f = fp(element).
- Compute bucket1 = hash(element) and bucket2 = bucket1 XOR hash(f).
- Check both buckets for a matching fingerprint. If found, return "probably present." If not, return "definitely not present."
Delete
- Compute fingerprint and both bucket locations (same as lookup).
- Find a matching fingerprint in either bucket and remove it.
This is clean and correct as long as the element was actually inserted. Deleting an element that was never inserted is dangerous: you might remove a fingerprint that belongs to a different element that happened to have the same fingerprint. This creates a false negative.
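The insert, lookup, and delete procedures above can be sketched together. This is a minimal illustration, not a production filter: the parameters and helper names are assumptions (4-way buckets, 8-bit fingerprints, a power-of-two bucket count so the XOR stays in range, salted built-in hashes standing in for real hash functions).

```python
import random

class CuckooFilter:
    BUCKET_SIZE = 4   # entries per bucket
    MAX_KICKS = 500   # relocation budget before declaring the filter full

    def __init__(self, num_buckets=1024):  # must be a power of two
        self.num_buckets = num_buckets
        self.buckets = [[] for _ in range(num_buckets)]

    def _fp(self, element):
        return (hash(("fp", element)) % 255) + 1  # 8-bit, nonzero (empty != 0)

    def _index(self, element):
        return hash(("idx", element)) % self.num_buckets

    def _alt_index(self, index, fp):
        # Partial-key cuckoo hashing: reversible via XOR.
        return (index ^ hash(("fph", fp))) & (self.num_buckets - 1)

    def insert(self, element):
        fp = self._fp(element)
        i1 = self._index(element)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.BUCKET_SIZE:
                self.buckets[i].append(fp)
                return True
        # Both full: evict a random resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.MAX_KICKS):
            victim = random.randrange(self.BUCKET_SIZE)
            fp, self.buckets[i][victim] = self.buckets[i][victim], fp
            i = self._alt_index(i, fp)  # the evicted fingerprint's other bucket
            if len(self.buckets[i]) < self.BUCKET_SIZE:
                self.buckets[i].append(fp)
                return True
        return False  # filter is effectively full

    def lookup(self, element):
        fp = self._fp(element)
        i1 = self._index(element)
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, element):
        # Only safe if the element was actually inserted (see the warning above).
        fp = self._fp(element)
        i1 = self._index(element)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Note that lookup never false-negatives on an inserted element: a relocated fingerprint can only move between the two buckets that lookup checks.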
Space Efficiency
The false positive rate of a cuckoo filter depends on the fingerprint size f and the bucket size b (number of entries per bucket):
A lookup compares the query's fingerprint against the contents of two buckets, up to 2b stored fingerprints, and each comparison matches by accident with probability about 1/2^f:
FPR ≈ 2b / 2^f
With 4-way buckets (b=4) and 12-bit fingerprints:
- FPR ≈ 8 / 4096 ≈ 0.2%
With 8-bit fingerprints and 4-way buckets:
- FPR ≈ 8 / 256 ≈ 3.1%
Space per element = f / occupancy. With 95% occupancy and 12-bit fingerprints: 12 / 0.95 ≈ 12.6 bits per element, at a false positive rate around 0.2%. A Bloom filter at the same FPR uses about 1.44 × log2(1/0.002) ≈ 13 bits per element. Cuckoo filters win.
But this comparison flips at high FPRs. Each fingerprint must carry about log2(2b) = 3 extra bits (for b=4) just to keep accidental matches across the 2b candidate slots rare, and that fixed overhead dominates when the target FPR is loose and fingerprints are short, so above a few percent FPR Bloom filters use less space. In practice, cuckoo filters are most attractive at target FPRs of roughly 1% and below, and their advantage grows as the target shrinks: their cost rises by about 1.05 bits per halving of the FPR (one extra fingerprint bit divided by 95% occupancy), versus 1.44 bits for Bloom filters.
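The arithmetic is easy to check directly. This snippet assumes the standard approximations from the cuckoo filter paper (FPR ≈ 2b/2^f) and the classic Bloom filter optimum (1.44 × log2(1/FPR) bits per element); the function names are illustrative.

```python
from math import log2

def cuckoo_fpr(f_bits, b=4):
    # Up to 2b fingerprint comparisons, each a false match w.p. ~1/2^f.
    return 2 * b / 2 ** f_bits

def cuckoo_bits_per_element(f_bits, occupancy=0.95):
    return f_bits / occupancy

def bloom_bits_per_element(fpr):
    # Classic optimum for a Bloom filter: 1.44 * log2(1/FPR) bits per element.
    return 1.44 * log2(1 / fpr)

print(f"{cuckoo_fpr(12):.4%}")   # 12-bit fingerprints, 4-way buckets: ~0.2%
print(f"{cuckoo_fpr(8):.2%}")    # 8-bit fingerprints: ~3%
print(f"{cuckoo_bits_per_element(12):.1f}")             # ~12.6 bits/element
print(f"{bloom_bits_per_element(cuckoo_fpr(12)):.1f}")  # ~13.0 bits at same FPR
```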
4-Way Buckets: Why More Entries Per Bucket Helps
A cuckoo hash table with 1 entry per bucket can only reach about 50% occupancy before insertions start failing. With 2 entries per bucket, occupancy reaches about 84%. With 4 entries per bucket, it reaches about 95%.
Why? More entries per bucket means more options during eviction. When you kick an element from a bucket, its alternative bucket has 4 slots instead of 1. The probability of finding an empty slot during the displacement chain increases dramatically.
Four entries per bucket is the sweet spot. Eight entries give marginal improvement (about 98% occupancy) but hurt cache performance because each lookup reads a larger memory region.
Production Use: DPDK Packet Classification
DPDK (Data Plane Development Kit) uses cuckoo hash tables for packet classification in network functions. When a packet arrives, the system looks up the flow (source IP, destination IP, ports) in a cuckoo hash table to determine routing rules.
Why cuckoo hashing? Worst-case O(1) lookup. Network packet processing budgets are measured in nanoseconds. A hash table with chaining or linear probing can degrade to O(n) under adversarial traffic patterns (hash collision attacks). Cuckoo hashing guarantees two memory accesses, regardless of the data.
DPDK also exploits the two-bucket structure for prefetching. When processing a batch of packets, compute both bucket addresses for all packets first, issue prefetch instructions for all of them, then check the buckets. The prefetches hide memory latency by overlapping computation and memory access.
Ribbon Filters: The Next Generation
Ribbon filters, proposed by Dillinger and Walzer in 2021, are a successor to both Bloom and cuckoo filters for storage engine use. A Ribbon filter encodes membership by solving a system of linear equations over a narrow band matrix (the "ribbon"), achieving near-optimal space efficiency at any target FPR.
RocksDB added Ribbon filters as an option alongside Bloom filters. The space savings are about 30% compared to Bloom filters at the same FPR, with similar query performance but slower construction. For storage engines where filters are built once (during SSTable creation) and queried many times, the construction cost is an acceptable trade-off.
When to Choose What
Bloom filter. Append-only workloads, well-understood requirements, maximum implementation simplicity. You never delete, and you want the most battle-tested option with the largest body of production experience.
Cuckoo filter. Workloads that need deletion, low FPR targets (roughly 1% and below), or situations where worst-case lookup performance matters (always at most two bucket accesses).
Counting Bloom filter. When you need deletion but your code already uses Bloom filters extensively and you do not want to switch data structures. Accept the 4x memory penalty.
Ribbon filter. Storage engine internals where filters are built once and queried billions of times. The space savings compound over millions of SSTables.
Key Points
- Cuckoo filters store fingerprints of elements in a cuckoo hash table, enabling both membership queries and deletion. This is the key advantage over Bloom filters, which cannot delete elements without switching to a counting variant that uses 3-4x more memory
- Each element maps to two candidate buckets via two hash functions. Lookup checks both buckets for a matching fingerprint. If found in either, the element is probably present. If not found in either, it is definitely not present
- At low false positive rates, cuckoo filters use less space per element than Bloom filters. At a 1% FPR the two are within a couple of bits of each other (roughly 10-12 bits per element), and the gap shifts in the cuckoo filter's favor as the target FPR drops
- The maximum load factor is about 95% with 4-way buckets. Beyond that, insertions start failing because displacement chains cannot find empty slots. Bloom filters degrade gracefully when overfull; cuckoo filters hit a wall
- DPDK uses cuckoo hashing for high-performance packet lookup tables. RocksDB explored cuckoo table format as an alternative to block-based tables for in-memory workloads with pure point lookups
Common Mistakes
- ✗ Not planning for the capacity limit. Once a cuckoo filter is about 95% full, insertions fail. You cannot just keep adding elements like you can with a Bloom filter (where the false positive rate rises but inserts never fail). Size your filter for the maximum expected element count, not the initial count
- ✗ Using cuckoo filters for append-only workloads where deletion is not needed. If you never delete, Bloom filters are simpler, well-understood, and have extensive production track records. The whole point of cuckoo filters is deletion support
- ✗ Storing too-short fingerprints. Shorter fingerprints save space but increase false positive rates. With f-bit fingerprints and b entries per bucket, the FPR is approximately 2b / 2^f. Use at least 8-bit fingerprints for reasonable FPRs
- ✗ Inserting the same element multiple times. Each insertion stores a new fingerprint. If you insert element X three times, it occupies three slots. The third insertion might even evict unrelated elements. Always check for membership before inserting if duplicates are possible