ZSTD Compression (Zstandard)
Architecture
The Problem: Not Everything Is Time-Series Data
Gorilla compression achieves 12x on time-series samples because it exploits two domain-specific properties: regular timestamp intervals and slowly-changing float values. But not all data in an observability platform is sequential time-series samples.
Downsampled metric aggregates (min, max, avg, sum, count) are raw float64 values with no sequential relationship. Parquet column blocks contain mixed data types. Kafka message payloads are protobuf-encoded structs. Flink checkpoint state is serialized RocksDB entries. WAL segments are append-only binary logs. None of these have the structure that Gorilla needs.
These workloads need a general-purpose compressor that works on arbitrary bytes, compresses well, and decompresses fast. ZSTD fills that gap.
How ZSTD Works: Three Stages
ZSTD processes input data through three stages: find repeated patterns, encode them as compact sequences, then compress those sequences with entropy coding. Once you see how each stage works, the compression level tradeoffs start to make sense.
Stage 1: LZ77-Style Dictionary Matching
The first stage slides a window over the input, looking for byte sequences that appeared earlier. When it finds a match, it replaces the repeated bytes with a back-reference: "copy N bytes from M positions back."
Input string (78 bytes):
"service=checkout,method=POST,status=200|service=checkout,method=GET,status=200"
Stage 1 finds matches:
Position 0-39: "service=checkout,method=POST,status=200|" → stored as literals (first occurrence)
Position 40-56: "service=checkout," → match! (offset=40, length=17)
Position 57-67: "method=GET," → partial: "method=" matches (offset=40, length=7), "GET," is literal
Position 68-77: "status=200" → match! (offset=39, length=10)
Result: 78 bytes → ~53 bytes of literals + back-references
The compression level controls how hard this stage searches for matches. Level 1 uses a fast hash table with limited look-back. Level 19 uses a full optimal parser that considers every possible match combination. More searching = better matches = smaller output = more CPU time.
Stage 2: Sequence Encoding
Stage 1 produces a stream of interleaved literals and matches. Stage 2 structures this into a compact format:
Each "sequence" is a triple:
(literal_length, offset, match_length)
From the example above (each sequence emits its literals first, then its match):
Sequence 1: (40 literals, offset=40, match_length=17) → the first 40 bytes verbatim, then copy "service=checkout,"
Sequence 2: (0 literals, offset=40, match_length=7) → copy "method="
Sequence 3: (4 literals "GET,", offset=39, match_length=10) → emit "GET,", then copy "status=200"
This sequence representation is more compressible than the raw back-references because the literal lengths, offsets, and match lengths each follow predictable distributions (short matches are common, large offsets are rare).
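To make the triple representation concrete, here is a toy decoder in Python that replays (literals, offset, match_length) sequences to rebuild the example string. This is a simplified sketch of the idea, not zstd's actual block format; the names `Sequence` and `replay` are made up for illustration.

```python
from typing import NamedTuple

class Sequence(NamedTuple):
    literals: bytes      # bytes emitted to the output verbatim
    offset: int          # how far back in the output to copy from
    match_length: int    # how many bytes to copy

def replay(sequences):
    """Rebuild the original data by replaying (literals, offset, match_length) triples."""
    out = bytearray()
    for seq in sequences:
        out += seq.literals
        for _ in range(seq.match_length):
            out.append(out[-seq.offset])   # byte-at-a-time, so a match may overlap its own output
    return bytes(out)

original = b"service=checkout,method=POST,status=200|service=checkout,method=GET,status=200"
seqs = [
    Sequence(b"service=checkout,method=POST,status=200|", 40, 17),  # literals, then copy "service=checkout,"
    Sequence(b"", 40, 7),                                            # copy "method="
    Sequence(b"GET,", 39, 10),                                       # emit "GET,", then copy "status=200"
]
assert replay(seqs) == original
```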
Stage 3: Entropy Coding (FSE + Huffman)
The final stage compresses the sequence stream using two entropy coders:
Huffman coding compresses the literal bytes. In the example, the characters s, e, =, , appear far more often than P, O, G. Huffman assigns shorter bit codes to frequent characters and longer codes to rare ones. If e appears 10% of the time, it gets ~3.3 bits instead of 8.
Finite State Entropy (FSE) compresses the literal lengths, match lengths, and offsets. This is the part that really sets ZSTD apart from older algorithms. FSE gets essentially the same compression as arithmetic coding (the theoretical optimum) but decodes with a single table lookup per symbol instead of a division. That difference is a large part of why ZSTD decompresses at ~1500 MB/s while gzip (which uses Huffman for everything) tops out at ~300 MB/s.
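To put rough numbers on the Huffman half of Stage 3, the sketch below computes the Shannon entropy of the literal bytes left over from Stage 1 in the worked example. Entropy is the floor every entropy coder chases; Huffman approximates it in whole bits, FSE gets closer. Plain Python, nothing zstd-specific assumed.

```python
import math
from collections import Counter

# The literal bytes Stage 1 left behind in the worked example
# (the first 40 bytes plus the un-matched "GET,").
literals = b"service=checkout,method=POST,status=200|GET,"

counts = Counter(literals)
total = len(literals)

# Shannon entropy: the lower bound any entropy coder is chasing.
entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
print(f"{entropy:.2f} bits/byte instead of 8 bits/byte raw")

# A frequent byte like 'e' deserves a short code...
print(f"'e': ~{-math.log2(counts[ord('e')] / total):.1f} bits")
# ...while a rare byte like 'P' deserves a long one.
print(f"'P': ~{-math.log2(counts[ord('P')] / total):.1f} bits")
```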
The full pipeline on our 78-byte example:
Input: 78 bytes (raw string)
After Stage 1 (matching): ~53 bytes (literals + back-references)
After Stage 2 (sequences): ~45 bytes (structured triples)
After Stage 3 (entropy): ~28 bytes (bit-optimal encoding)
Compression ratio: 78 / 28 ≈ 2.8x
On larger inputs, ZSTD achieves better ratios because the matching window is bigger (up to 128 KB at level 1, up to 8 MB at level 19) and the entropy coder has more data to learn symbol frequencies from.
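A minimal round trip shows both effects. The sketch assumes the `zstandard` Python bindings (any other binding or the CLI behaves the same way): the 78-byte example barely shrinks once frame and block headers are added, while a few thousand similar lines give Stage 1 enough history to work with.

```python
import zstandard as zstd  # assumes the python-zstandard bindings (pip install zstandard)

cctx = zstd.ZstdCompressor(level=3)
dctx = zstd.ZstdDecompressor()

# The 78-byte string from the worked example: too small for a good ratio
# once frame and block headers are added to the output.
tiny = b"service=checkout,method=POST,status=200|service=checkout,method=GET,status=200"
print(len(tiny), "->", len(cctx.compress(tiny)), "bytes")

# The same kind of content repeated across a few thousand lines: plenty of
# history for the matching stage, so the ratio climbs well past the tiny case.
big = b"".join(
    f"service=checkout,method={'POST' if i % 2 else 'GET'},status=200\n".encode()
    for i in range(5000)
)
packed = cctx.compress(big)
assert dctx.decompress(packed) == big
print(f"{len(big)} -> {len(packed)} bytes ({len(big) / len(packed):.0f}x)")
```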
Compression Levels: The Speed-Ratio Tradeoff
ZSTD's 22 compression levels control how hard Stage 1 searches for matches. Decompression speed stays constant regardless of level because the compressed format is the same. Only the search effort during compression changes.
| Level | Compression Speed | Decompression Speed | Ratio (Silesia benchmark) | Use Case |
|---|---|---|---|---|
| 1 | ~500 MB/s | ~1500 MB/s | ~2.9x | Real-time: Kafka messages, network transfer |
| 3 (default) | ~400 MB/s | ~1500 MB/s | ~3.3x | General purpose: Parquet blocks, log files |
| 6 | ~150 MB/s | ~1500 MB/s | ~3.6x | Warm-tier storage: balanced speed and ratio |
| 9 | ~80 MB/s | ~1500 MB/s | ~3.8x | Cold-tier: write-once archival |
| 15 | ~15 MB/s | ~1500 MB/s | ~4.1x | Deep archival: maximize compression |
| 19 | ~5 MB/s | ~1500 MB/s | ~4.3x | Maximum compression (rarely worth the CPU) |
| 22 | ~2 MB/s | ~1500 MB/s | ~4.4x | Extreme: diminishing returns beyond level 19 |
Notice that decompression is always ~1500 MB/s. You pay the CPU cost of high compression levels only once (on write), but every read benefits from fast decompression. For write-once-read-many workloads like cold-tier metric archives and Parquet files, that tradeoff is well worth it.
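A quick way to see the tradeoff on your own data is to sweep levels against a single decompressor. The sketch below assumes the `zstandard` Python bindings and uses deliberately repetitive synthetic input, so the absolute ratios come out far higher than the Silesia numbers above; the point is the relative CPU cost per level and that one decompressor reads every level's output.

```python
import time
import zstandard as zstd  # assumes the python-zstandard bindings

# A few MB of synthetic log-like data; swap in a real Parquet block or WAL
# segment to benchmark your own workload.
raw = b"".join(
    f"service=checkout,method=POST,status=200,duration_ms={i % 997}\n".encode()
    for i in range(200_000)
)

dctx = zstd.ZstdDecompressor()   # one decompressor reads every level's output

for level in (1, 3, 6, 9, 19):
    cctx = zstd.ZstdCompressor(level=level)
    start = time.perf_counter()
    compressed = cctx.compress(raw)
    elapsed = time.perf_counter() - start
    assert dctx.decompress(compressed) == raw          # same format regardless of level
    print(f"level {level:>2}: {len(raw) / len(compressed):6.1f}x in {elapsed:.3f}s")
```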
Dictionary Mode: Compressing Small Data
Standard ZSTD needs enough input bytes to find patterns. On a 1 KB Kafka message, there simply are not enough repeated sequences for the matching stage to work with. You end up with a poor ~1.5x ratio, barely worth the CPU.
Dictionary mode solves this. You train a dictionary on a representative sample of your data:
Training (offline, one-time):
1. Collect 1000 sample Kafka messages (e.g., metric protobuf payloads)
2. zstd --train samples/ -o metric_dict --maxdict=100KB
3. ZSTD analyzes common byte patterns across all samples
4. Produces a 100 KB dictionary of shared context
Compression (runtime):
Without dictionary: 2 KB message → 1.3 KB (1.5x) (not worth it)
With dictionary: 2 KB message → 500 bytes (4.0x) (significant savings)
The dictionary provides the "history" that a small message lacks.
ZSTD matches input bytes against the dictionary as if they were
preceded by 100 KB of representative data.
Kafka supports ZSTD compression natively (producer config: compression.type=zstd). For dictionary mode, the application must manage dictionary distribution (ship the same dictionary to all producers and consumers). This complexity is worth it when compressing millions of small messages per second.
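A sketch of the training and runtime flow, assuming the `zstandard` Python bindings and a synthetic stand-in corpus (real training should use a large, genuinely representative sample set; on a toy corpus zstd may return a smaller dictionary than requested):

```python
import zstandard as zstd  # assumes the python-zstandard bindings (pip install zstandard)

# Stand-in corpus: in production you would capture 1000+ real messages offline.
samples = [
    (f'{{"service":"svc-{i % 40}","metric":"http_request_duration_seconds",'
     f'"host":"node-{i % 64}","trace":"{i:08x}","value":{(i * 37) % 1000}}}').encode()
    for i in range(5000)
]

# Train a shared dictionary (CLI equivalent: zstd --train ... --maxdict=100KB).
dictionary = zstd.train_dictionary(100 * 1024, samples)

# Producers and consumers must load the identical dictionary bytes (dictionary.as_bytes()).
cctx = zstd.ZstdCompressor(level=1, dict_data=dictionary)
dctx = zstd.ZstdDecompressor(dict_data=dictionary)

msg = samples[0]
packed = cctx.compress(msg)
assert dctx.decompress(packed) == msg
print(f"{len(msg)} -> {len(packed)} bytes with the shared dictionary")
```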
ZSTD vs Other Compressors
| Algorithm | Compression Speed | Decompression Speed | Ratio (Silesia) | Year | Best For |
|---|---|---|---|---|---|
| LZ4 | ~750 MB/s | ~4000 MB/s | ~2.1x | 2011 | Hot path where speed is paramount (WAL, memtable flush) |
| Snappy | ~500 MB/s | ~1500 MB/s | ~2.1x | 2011 | Legacy systems, Kafka default before ZSTD |
| gzip | ~30 MB/s | ~300 MB/s | ~3.2x | 1992 | HTTP content encoding, backward compatibility |
| ZSTD | ~400 MB/s | ~1500 MB/s | ~3.3x | 2016 | General purpose (replaced gzip in most new systems) |
| brotli | ~20 MB/s | ~400 MB/s | ~3.6x | 2015 | Static web assets (HTML, CSS, JS) |
Why did ZSTD win? It matches gzip's compression ratio while decompressing 5x faster and compressing 10x faster. That made it the natural choice for Kafka (replacing Snappy), Parquet (replacing gzip), and RocksDB (replacing LZ4/Snappy depending on tier).
LZ4 still wins when decompression speed is the only priority (4000 MB/s vs ZSTD's 1500 MB/s). RocksDB uses LZ4 for its hot-tier SST files and ZSTD for cold-tier compacted files, a common hot/cold tiering pattern.
Where ZSTD Appears in Data Infrastructure
ZSTD shows up at every layer below the hot path in modern data systems:
Columnar storage (Parquet, ORC). Parquet compresses each column block with ZSTD. Aggregated data (pre-computed doubles like min, max, avg, sum, count) has no sequential relationship between rows, so domain-specific compressors like Gorilla cannot help. ZSTD gives ~3x on these float64 arrays and is now the standard compression codec for data lake files on S3/GCS.
Message brokers (Kafka, Pulsar). Producers compress message batches with ZSTD level 1 before sending. Brokers store compressed batches on disk. Consumers decompress on read. At high throughput, ZSTD reduces network bandwidth by ~4x with minimal CPU overhead at level 1.
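At the producer, that is a configuration change rather than code. A minimal sketch, assuming the confluent-kafka Python client (librdkafka underneath) and a hypothetical broker address and topic:

```python
from confluent_kafka import Producer  # assumes the confluent-kafka client is installed

producer = Producer({
    "bootstrap.servers": "broker-1:9092",  # hypothetical broker address
    "compression.type": "zstd",            # batches are compressed before they hit the wire
    "compression.level": 1,                # stay at the fast end of the scale on the hot path
    "linger.ms": 20,                       # small batching window so there is something to compress
})

producer.produce("metrics", value=b'{"service":"checkout","value":0.23}')
producer.flush()  # brokers store the compressed batch as-is; consumers decompress on read
```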
Embedded storage engines (RocksDB, LevelDB). RocksDB compresses SST files at the block level. Hot tiers (L0/L1) typically use LZ4 for fast read/write. Cold tiers (L2+) use ZSTD for better compression on compacted data read less frequently. Any system built on RocksDB follows this tiered pattern: Flink state backends, CockroachDB, TiKV.
Trace and log storage. Systems that store traces or logs as Parquet files on object storage use ZSTD for column block compression. Variable-length strings (service names, operation names, attribute keys) compress well because repeated values create strong matching opportunities for Stage 1.
ZSTD vs Gorilla: Different Tools for Different Data
Gorilla and ZSTD are not competing. They handle different data at different tiers.
| | Gorilla | ZSTD |
|---|---|---|
| Input | Sequential time-series samples | Arbitrary bytes |
| Exploits | Regular intervals, slow changes | Repeated byte patterns |
| Ratio | ~12x on regular metrics | ~3x on float64 arrays |
| Speed | O(1) per sample encode/decode | ~400 MB/s compress, ~1500 MB/s decompress |
| Works on Parquet | No (aggregated doubles have no sequential relationship) | Yes (general-purpose, works on any byte stream) |
| Hot tier | Yes (vmstorage raw samples) | No (too slow for ingest path) |
| Cold tier | No (data is pre-aggregated) | Yes (Parquet blocks on S3) |
| Can stack | ZSTD on top of Gorilla gives another 2-3x for cold export (25-35x total) | N/A |
VictoriaMetrics uses exactly this layered approach: Gorilla in the hot path (vmstorage), and ZSTD-compressed Parquet for cold-tier S3 exports.
Limitations
CPU cost scales non-linearly with compression level. Going from level 1 to level 3 costs ~20% more CPU for ~15% better ratio. Going from level 3 to level 19 costs ~80x more CPU for only ~30% better ratio. Beyond level 9, diminishing returns hit hard. Most production systems stay at level 1-6.
No domain-specific optimization. ZSTD treats input as opaque bytes. If your data has exploitable structure (like time-series regularity), a domain-specific compressor like Gorilla will always win on that dimension. ZSTD is the best general-purpose option, not the best option for any specific data type.
Dictionary mode requires coordination. All producers and consumers must share the same dictionary. Dictionary changes require a rolling deployment. If a consumer receives data compressed with an unknown dictionary, decompression fails. This operational complexity means dictionary mode is only worth it for high-volume small-message workloads.
Memory usage scales with window size and compression level. Level 1 uses ~1 MB of memory per compression context. Level 19 uses ~100+ MB. For workloads compressing thousands of streams in parallel (like Kafka brokers compressing per-partition), memory usage at high levels is non-trivial.
Key Points
- •ZSTD combines LZ77-style dictionary matching (find repeated byte sequences, replace with back-references) with two entropy coders: Finite State Entropy (FSE) for match lengths and offsets, and Huffman coding for literal bytes. This three-stage pipeline achieves gzip-level compression ratios at LZ4-level speeds
- •Compression levels 1-22 let you trade CPU time for compression ratio on the same data. Level 1 compresses at ~500 MB/s with ~2.9x ratio. Level 3 (default) hits ~400 MB/s at ~3.3x. Level 9 drops to ~80 MB/s but reaches ~3.8x. Decompression speed stays at ~1500 MB/s no matter which level was used to compress. That asymmetry matters for write-once-read-many workloads like cold storage, where you pay the compression cost once but decompress on every read
- •Dictionary mode is ZSTD's secret weapon for small data. Standard compression needs enough bytes to learn patterns within each input. For small messages (1-10 KB, like Kafka records or log lines), there is not enough data to find patterns. Dictionary mode pre-trains a 100 KB dictionary on representative samples, then every message compresses against that shared context. Compression ratio improves 3-5x over standard mode for small inputs
- •ZSTD is a general-purpose byte-level compressor. It sees no structure in the input, just bytes. Gorilla, by contrast, exploits time-series-specific patterns like regular timestamp intervals and slowly-changing float values. So Gorilla gets ~12x on sequential metric samples, while ZSTD gets ~3x on the same float64 arrays. Different tools for different tiers: Gorilla for hot-tier sequential samples, ZSTD for cold-tier aggregated data and Parquet blocks
- •ZSTD replaced gzip as the default compressor in most modern data infrastructure: Apache Kafka (producer compression), Apache Parquet (column block compression), RocksDB (SST file compression), ClickHouse (column compression), Linux kernel (initramfs, btrfs). It compresses as well as gzip but decompresses 5x faster, and dictionary mode handles small messages that gzip struggles with. That is a hard combination to beat
Common Mistakes
- ✗Using high compression levels (15-22) in latency-sensitive paths. Level 19 compresses at ~5 MB/s, which is 100x slower than level 1. High levels only make sense for offline batch jobs like cold-tier compaction or archival. For real-time ingestion and Kafka messages, stick to levels 1-3
- ✗Not using dictionary mode for small messages. Compressing 2 KB Kafka records with standard ZSTD gives ~1.5x ratio (barely worth the CPU). Training a dictionary on 1000 sample records and compressing with that dictionary gives ~4x. The dictionary is 100 KB of shared context that makes small-message compression viable
- ✗Comparing ZSTD ratio to Gorilla ratio as if they do the same thing. Gorilla achieves 12x on time-series data because it exploits domain-specific structure (delta-of-delta timestamps, XOR-encoded values). ZSTD achieves ~3x on the same data because it only sees bytes. They solve different problems: Gorilla for hot-tier raw samples, ZSTD for cold-tier aggregated doubles in Parquet
- ✗Ignoring decompression speed when choosing a compressor. ZSTD decompresses at ~1500 MB/s regardless of the compression level. gzip tops out at ~300 MB/s. On read-heavy workloads like cold-tier metric queries (write once, read many times), decompression speed matters more than compression speed, and that 5x gap is why ZSTD replaced gzip in Parquet and RocksDB
- ✗Setting the same compression level across all data tiers. Hot-path Kafka messages need level 1 (fast, ~2.9x). Warm-tier Parquet blocks benefit from level 3-6 (~3.3-3.6x). Cold-tier archival can use level 9+ (~3.8x+). Each tier has a different latency budget, so each should use a different level