Cuckoo Hashing & Cuckoo Filters
The Bloom Filter Gap
Bloom filters are one of the most widely deployed data structures in distributed systems. But they have a hard limitation: you cannot delete elements. If you clear the bits for element A, you might corrupt the membership information for element B that shares those bit positions. You get false negatives, which breaks the core guarantee.
Counting Bloom filters work around this by using counters instead of bits, but they use 3-4x more memory. And counter overflow is a real risk under adversarial or high-cardinality workloads.
Cuckoo filters, proposed by Fan, Andersen, Kaminsky, and Mitzenmacher in 2014, solve deletion cleanly. They support insert, lookup, and delete with better space efficiency than counting Bloom filters and comparable performance to standard Bloom filters.
Cuckoo Hashing: The Foundation
Before understanding cuckoo filters, you need cuckoo hashing.
Standard hash tables resolve collisions by chaining (linked lists in each bucket) or open addressing (probe to the next empty slot). Both degrade as the table fills up.
Cuckoo hashing uses two hash functions and gives each element two possible homes. To insert element X:
- Compute both positions: pos1 = h1(x) and pos2 = h2(x).
- If either position is empty, put X there.
- If both are occupied, pick one (say pos1), evict the element Y that is currently there, and insert X at pos1.
- Now Y needs a new home. Compute Y's other position and try to place it there.
- If that position is occupied, evict again.
- Repeat until everyone has a home or you hit a maximum displacement count (indicating the table is too full and needs resizing).
Lookup is O(1) in the worst case: check both positions. If the element is not in either, it is not in the table. No chains to walk, no probing sequences.
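The insert and lookup loops above fit in a short Python sketch. Everything here is illustrative, not any particular library's implementation: class and helper names are invented, and salted calls to the built-in hash stand in for two independent hash functions.

```python
class CuckooHashTable:
    """Two hash functions, one entry per slot, relocation on collision."""

    MAX_KICKS = 500  # displacement budget before declaring the table too full

    def __init__(self, size=16):
        self.size = size
        self.slots = [None] * size

    def _h1(self, key):
        return hash(("h1", key)) % self.size  # salted stand-in for a real hash

    def _h2(self, key):
        return hash(("h2", key)) % self.size

    def insert(self, key):
        p1, p2 = self._h1(key), self._h2(key)
        # If either position is empty, place the key there.
        if self.slots[p1] is None:
            self.slots[p1] = key
            return True
        if self.slots[p2] is None:
            self.slots[p2] = key
            return True
        # Both occupied: evict from p1 and chase the displaced key around.
        pos = p1
        for _ in range(self.MAX_KICKS):
            key, self.slots[pos] = self.slots[pos], key
            q1, q2 = self._h1(key), self._h2(key)
            pos = q2 if pos == q1 else q1  # the evicted key's other home
            if self.slots[pos] is None:
                self.slots[pos] = key
                return True
        return False  # too many displacements: table is too full

    def lookup(self, key):
        # Worst case is exactly two probes: no chains, no probe sequences.
        return self.slots[self._h1(key)] == key or self.slots[self._h2(key)] == key
```

A production table would grow and rehash when `insert` returns `False`.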
The name "cuckoo" comes from the cuckoo bird, which lays its eggs in other birds' nests and evicts the existing eggs. Same energy.
From Cuckoo Hash Table to Cuckoo Filter
A cuckoo hash table stores the full elements. A cuckoo filter stores only fingerprints (short hashes) of elements. This is the same conceptual leap as going from a hash set to a Bloom filter: trade exact membership for space-efficient probabilistic membership.
The trick is computing the second bucket location from just the fingerprint, without storing the original element. Cuckoo filters use partial-key cuckoo hashing:
bucket1 = hash(element)
bucket2 = bucket1 XOR hash(fingerprint)
The XOR trick makes the operation reversible. Given an element in bucket1 with fingerprint fp, you can compute bucket2. Given the same element in bucket2, you can compute bucket1 by XORing with the same hash(fingerprint). This means you can relocate a fingerprint during eviction without knowing the original element.
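A few lines make the reversibility concrete. All parameters and names here are illustrative (8-bit fingerprints, 256 buckets, salted built-in hash as a stand-in for a real hash function); the bucket count is a power of two so masking keeps the XOR result in range.

```python
NUM_BUCKETS = 256  # power of two, so masking keeps the XOR in range

def fingerprint(element):
    return (hash(("fp", element)) % 255) + 1  # 8-bit and nonzero by convention

def bucket1(element):
    return hash(("bucket", element)) % NUM_BUCKETS

def alt_bucket(bucket, fp):
    # XOR is its own inverse, so this maps bucket1 -> bucket2 AND
    # bucket2 -> bucket1, without ever consulting the original element.
    return (bucket ^ hash(("fp-hash", fp))) & (NUM_BUCKETS - 1)

b1 = bucket1("example-key")
fp = fingerprint("example-key")
b2 = alt_bucket(b1, fp)
assert alt_bucket(b2, fp) == b1  # the relocation round-trips
```

This is exactly what eviction relies on: given only a stored fingerprint and the bucket it currently sits in, the alternative bucket is computable.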
Insert
- Compute the fingerprint f = fp(element), a short hash of the element (distinct from the hash used for bucket placement).
- Compute bucket1 = hash(element) and bucket2 = bucket1 XOR hash(f).
- If bucket1 or bucket2 has an empty slot, store f there.
- If both are full, pick one, evict a random existing fingerprint, and relocate it to its alternative bucket (computed via XOR).
- Repeat relocations until an empty slot is found or a maximum number of kicks is reached (typically 500).
Lookup
- Compute the fingerprint f = fp(element).
- Compute bucket1 = hash(element) and bucket2 = bucket1 XOR hash(f).
- Check both buckets for a matching fingerprint. If found, return "probably present." If not, return "definitely not present."
Delete
- Compute fingerprint and both bucket locations (same as lookup).
- Find a matching fingerprint in either bucket and remove it.
This is clean and correct as long as the element was actually inserted. Deleting an element that was never inserted is dangerous: you might remove a fingerprint that belongs to a different element that happened to have the same fingerprint. This creates a false negative.
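The insert, lookup, and delete procedures above can be sketched together. This is a minimal illustration, not a production filter: the parameters and helper names are assumptions (4-way buckets, 8-bit fingerprints, a power-of-two bucket count so the XOR stays in range, salted built-in hashes standing in for real hash functions).

```python
import random

class CuckooFilter:
    BUCKET_SIZE = 4   # entries per bucket
    MAX_KICKS = 500   # relocation budget before declaring the filter full

    def __init__(self, num_buckets=1024):  # must be a power of two
        self.num_buckets = num_buckets
        self.buckets = [[] for _ in range(num_buckets)]

    def _fp(self, element):
        return (hash(("fp", element)) % 255) + 1  # 8-bit, nonzero (empty != 0)

    def _index(self, element):
        return hash(("idx", element)) % self.num_buckets

    def _alt_index(self, index, fp):
        # Partial-key cuckoo hashing: reversible via XOR.
        return (index ^ hash(("fph", fp))) & (self.num_buckets - 1)

    def insert(self, element):
        fp = self._fp(element)
        i1 = self._index(element)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.BUCKET_SIZE:
                self.buckets[i].append(fp)
                return True
        # Both full: evict a random resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.MAX_KICKS):
            victim = random.randrange(self.BUCKET_SIZE)
            fp, self.buckets[i][victim] = self.buckets[i][victim], fp
            i = self._alt_index(i, fp)  # the evicted fingerprint's other bucket
            if len(self.buckets[i]) < self.BUCKET_SIZE:
                self.buckets[i].append(fp)
                return True
        return False  # filter is effectively full

    def lookup(self, element):
        fp = self._fp(element)
        i1 = self._index(element)
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, element):
        # Only safe if the element was actually inserted (see the warning above).
        fp = self._fp(element)
        i1 = self._index(element)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Note that lookup never false-negatives on an inserted element: a relocated fingerprint can only move between the two buckets that lookup checks.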
Space Efficiency
The false positive rate of a cuckoo filter depends on the fingerprint size f and the bucket size b (number of entries per bucket):
A lookup compares the query's fingerprint against the contents of two buckets, up to 2b stored fingerprints, and each comparison matches by accident with probability about 1/2^f:
FPR ≈ 2b / 2^f
With 4-way buckets (b=4) and 12-bit fingerprints:
- FPR ≈ 8 / 4096 ≈ 0.2%
With 8-bit fingerprints and 4-way buckets:
- FPR ≈ 8 / 256 ≈ 3.1%
Space per element = f / occupancy. With 95% occupancy and 12-bit fingerprints: 12 / 0.95 ≈ 12.6 bits per element, at a false positive rate around 0.2%. A Bloom filter at the same FPR uses about 1.44 × log2(1/0.002) ≈ 13 bits per element. Cuckoo filters win.
But this comparison flips at high FPRs. Each fingerprint must carry about log2(2b) = 3 extra bits (for b=4) just to keep accidental matches across the 2b candidate slots rare, and that fixed overhead dominates when the target FPR is loose and fingerprints are short, so above a few percent FPR Bloom filters use less space. In practice, cuckoo filters are most attractive at target FPRs of roughly 1% and below, and their advantage grows as the target shrinks: their cost rises by about 1.05 bits per halving of the FPR (one extra fingerprint bit divided by 95% occupancy), versus 1.44 bits for Bloom filters.
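The arithmetic is easy to check directly. This snippet assumes the standard approximations from the cuckoo filter paper (FPR ≈ 2b/2^f) and the classic Bloom filter optimum (1.44 × log2(1/FPR) bits per element); the function names are illustrative.

```python
from math import log2

def cuckoo_fpr(f_bits, b=4):
    # Up to 2b fingerprint comparisons, each a false match w.p. ~1/2^f.
    return 2 * b / 2 ** f_bits

def cuckoo_bits_per_element(f_bits, occupancy=0.95):
    return f_bits / occupancy

def bloom_bits_per_element(fpr):
    # Classic optimum for a Bloom filter: 1.44 * log2(1/FPR) bits per element.
    return 1.44 * log2(1 / fpr)

print(f"{cuckoo_fpr(12):.4%}")   # 12-bit fingerprints, 4-way buckets: ~0.2%
print(f"{cuckoo_fpr(8):.2%}")    # 8-bit fingerprints: ~3%
print(f"{cuckoo_bits_per_element(12):.1f}")             # ~12.6 bits/element
print(f"{bloom_bits_per_element(cuckoo_fpr(12)):.1f}")  # ~13.0 bits at same FPR
```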
4-Way Buckets: Why More Entries Per Bucket Helps
A cuckoo hash table with 1 entry per bucket can only reach about 50% occupancy before insertions start failing. With 2 entries per bucket, occupancy reaches about 84%. With 4 entries per bucket, it reaches about 95%.
Why? More entries per bucket means more options during eviction. When you kick an element from a bucket, its alternative bucket has 4 slots instead of 1. The probability of finding an empty slot during the displacement chain increases dramatically.
Four entries per bucket is the sweet spot. Eight entries give marginal improvement (about 98% occupancy) but hurt cache performance because each lookup reads a larger memory region.
Production Use: DPDK Packet Classification
DPDK (Data Plane Development Kit) uses cuckoo hash tables for packet classification in network functions. When a packet arrives, the system looks up the flow (source IP, destination IP, ports) in a cuckoo hash table to determine routing rules.
Why cuckoo hashing? Worst-case O(1) lookup. Network packet processing budgets are measured in nanoseconds. A hash table with chaining or linear probing can degrade to O(n) under adversarial traffic patterns (hash collision attacks). Cuckoo hashing guarantees two memory accesses, regardless of the data.
DPDK also exploits the two-bucket structure for prefetching. When processing a batch of packets, compute both bucket addresses for all packets first, issue prefetch instructions for all of them, then check the buckets. The prefetches hide memory latency by overlapping computation and memory access.
Ribbon Filters: The Next Generation
Ribbon filters, proposed by Dillinger and Walzer in 2021, are a successor to both Bloom and cuckoo filters for storage engine use. A Ribbon filter encodes membership by solving a system of linear equations over a narrow band matrix (the "ribbon"), achieving near-optimal space efficiency at any target FPR.
RocksDB added Ribbon filters as an option alongside Bloom filters. The space savings are about 30% compared to Bloom filters at the same FPR, with similar query performance but slower construction. For storage engines where filters are built once (during SSTable creation) and queried many times, the construction cost is an acceptable trade-off.
When to Choose What
Bloom filter. Append-only workloads, well-understood requirements, maximum implementation simplicity. You never delete, and you want the most battle-tested option with the largest body of production experience.
Cuckoo filter. Workloads that need deletion, low FPR targets (roughly 1% and below), or situations where worst-case lookup performance matters (always at most two bucket accesses).
Counting Bloom filter. When you need deletion but your code already uses Bloom filters extensively and you do not want to switch data structures. Accept the 4x memory penalty.
Ribbon filter. Storage engine internals where filters are built once and queried billions of times. The space savings compound over millions of SSTables.
Key Points
- Cuckoo filters store fingerprints of elements in a cuckoo hash table, enabling both membership queries and deletion. This is the key advantage over Bloom filters, which cannot delete elements without switching to a counting variant that uses 3-4x more memory
- Each element maps to two candidate buckets via two hash functions. Lookup checks both buckets for a matching fingerprint. If found in either, the element is probably present. If not found in either, it is definitely not present
- At low false positive rates, cuckoo filters use less space per element than Bloom filters. At a 1% FPR the two are within a couple of bits of each other (roughly 10-12 bits per element), and the gap shifts in the cuckoo filter's favor as the target FPR drops
- The maximum load factor is about 95% with 4-way buckets. Beyond that, insertions start failing because displacement chains cannot find empty slots. Bloom filters degrade gracefully when overfull; cuckoo filters hit a wall
- DPDK uses cuckoo hashing for high-performance packet lookup tables. RocksDB explored cuckoo table format as an alternative to block-based tables for in-memory workloads with pure point lookups
Common Mistakes
- ✗ Not planning for the capacity limit. Once a cuckoo filter is about 95% full, insertions fail. You cannot just keep adding elements like you can with a Bloom filter (where the false positive rate rises but inserts never fail). Size your filter for the maximum expected element count, not the initial count
- ✗ Using cuckoo filters for append-only workloads where deletion is not needed. If you never delete, Bloom filters are simpler, well-understood, and have extensive production track records. The whole point of cuckoo filters is deletion support
- ✗ Storing too-short fingerprints. Shorter fingerprints save space but increase false positive rates. With f-bit fingerprints and b entries per bucket, the FPR is approximately 2b / 2^f. Use at least 8-bit fingerprints for reasonable FPRs
- ✗ Inserting the same element multiple times. Each insertion stores a new fingerprint. If you insert element X three times, it occupies three slots. The third insertion might even evict unrelated elements. Always check for membership before inserting if duplicates are possible