HTTP/2 — Multiplexing Revolution
HTTP/2 multiplexes many requests over one TCP connection using binary framing, slashing latency and connection overhead.
The Problem
HTTP/1.1's one-request-at-a-time model forces browsers to open 6+ TCP connections per origin, wasting resources and limiting performance. How is this fixed without breaking existing HTTP semantics?
Mental Model
Like a multi-lane highway through a single tunnel: many conversations, one connection.
How It Works
HTTP/2 solves HTTP/1.1's fundamental performance problem: head-of-line blocking at the application layer. Instead of sending one request and waiting for its response before sending the next, HTTP/2 introduces a binary framing layer that breaks all HTTP messages into small frames, tags them with a stream ID, and interleaves them over a single TCP connection.
The key abstraction is the stream — a bidirectional sequence of frames between client and server. Each request-response pair gets its own stream. Streams are independent: if stream 3 is waiting for a database query, streams 5, 7, and 9 can still send and receive data. This is multiplexing, and it's the single biggest improvement over HTTP/1.1.
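The interleaving can be sketched in a few lines. This is a toy model, not a real HTTP/2 stack: stream bodies are chopped into frames tagged with their stream ID, then scheduled round-robin over one "wire" (frame size and scheduling policy are illustrative).

```python
# Toy sketch of stream multiplexing: frames from independent streams
# are interleaved over a single connection. Illustrative only.
from itertools import zip_longest

def frames(stream_id, body, frame_size=4):
    """Split one stream's body into (stream_id, chunk) frames."""
    return [(stream_id, body[i:i + frame_size])
            for i in range(0, len(body), frame_size)]

def multiplex(streams):
    """Round-robin interleave frames from independent streams."""
    wire = []
    for batch in zip_longest(*(frames(sid, body) for sid, body in streams)):
        wire.extend(f for f in batch if f is not None)
    return wire

wire = multiplex([(3, b"AAAAAAAA"), (5, b"BBBB"), (7, b"CCCCCCCC")])
# Stream 5 finishes early; streams 3 and 7 keep going without waiting.
print([sid for sid, _ in wire])  # → [3, 5, 7, 3, 7]
```

No stream ever waits on another: that is the whole point of the abstraction.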
Binary Framing — The Foundation
Every HTTP/2 message is split into frames:
+-----------------------------------------------+
|                 Length (24)                   |
+---------------+---------------+---------------+
|   Type (8)    |   Flags (8)   |
+-+-------------+---------------+---------------+
|R|          Stream Identifier (31)             |
+=+=============================================+
|           Frame Payload (0 ...)               |
+-----------------------------------------------+
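The fixed 9-byte header above can be decoded with a few byte operations. A minimal sketch (type-code names per RFC 7540; only a handful of types mapped here):

```python
# Minimal decoder for the 9-byte HTTP/2 frame header (RFC 7540 §4.1).
# A sketch, not a full frame parser.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM",
               0x4: "SETTINGS", 0x7: "GOAWAY", 0x8: "WINDOW_UPDATE"}

def parse_frame_header(header: bytes):
    assert len(header) == 9
    length = int.from_bytes(header[0:3], "big")       # 24-bit payload length
    frame_type, flags = header[3], header[4]          # 8 bits each
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF  # clear R bit
    return length, FRAME_TYPES.get(frame_type, frame_type), flags, stream_id

# A HEADERS frame: length=13, type=0x1, flags=0x4 (END_HEADERS), stream 3
hdr = b"\x00\x00\x0d\x01\x04\x00\x00\x00\x03"
print(parse_frame_header(hdr))  # → (13, 'HEADERS', 4, 3)
```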
Frame types include:
- HEADERS — carries HTTP headers (compressed via HPACK)
- DATA — carries the response/request body
- SETTINGS — connection-level configuration
- WINDOW_UPDATE — flow control adjustments
- PUSH_PROMISE — server push announcements
- RST_STREAM — cancel a single stream without killing the connection
- GOAWAY — graceful connection shutdown
The binary format is both a blessing and a curse. It can't be debugged with telnet anymore, but it's dramatically more efficient to parse than HTTP/1.1's text format.
HPACK Header Compression
HTTP/1.1 headers are sent as plain text on every request. For a typical API call, headers can be 500-800 bytes — often larger than the actual payload. Multiply that by hundreds of requests per page load, and the waste is significant.
HPACK fixes this with two mechanisms:
- Static Table — 61 pre-defined header entries (:method: GET, :status: 200, etc.) referenced by index
- Dynamic Table — connection-specific table that grows as new headers are seen
# First request — full headers
:method: GET
:path: /api/users
authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
accept: application/json
x-request-id: abc-123
# Second request — most headers indexed
:method: GET → index 2 (static table)
:path: /api/orders → literal (new path)
authorization: ... → indexed (dynamic table, same token)
accept: ... → indexed (dynamic table)
x-request-id: def-456 → literal (new value)
The result: 85-90% reduction in header bytes after the first few requests on a connection. This is particularly impactful for mobile clients on bandwidth-constrained networks.
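The two-table lookup can be sketched as follows. This toy model skips the real encoding details (Huffman coding, variable-length integers, table size eviction); note that RFC 7541 inserts new dynamic entries at the front, so the newest entry always has index 62:

```python
# Toy sketch of HPACK's static/dynamic table lookup, not the real codec.
# Static entries 1-61 are fixed by RFC 7541; dynamic entries start at 62.
STATIC = {1: (":authority", ""), 2: (":method", "GET"), 3: (":method", "POST")}
# ... entries 4-61 omitted for brevity

class HpackTables:
    def __init__(self):
        self.dynamic = []  # newest entry first, per RFC 7541

    def add(self, name, value):
        self.dynamic.insert(0, (name, value))

    def lookup(self, index):
        if index <= 61:
            return STATIC[index]           # static table
        return self.dynamic[index - 62]    # dynamic table

t = HpackTables()
t.add("authorization", "Bearer eyJ...")  # dynamic index 62
t.add("accept", "application/json")      # now 62; authorization shifts to 63
print(t.lookup(2))   # → (':method', 'GET')
print(t.lookup(62))  # → ('accept', 'application/json')
```

On the wire, a fully indexed header costs one or two bytes instead of its full text, which is where the savings come from.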
Server Push — The Promise That Didn't Deliver
Server push allows the server to send resources before the client asks for them. When the server knows that /index.html will need /style.css and /app.js, it can push them immediately.
The idea was compelling. The reality was disappointing:
- Cache invalidation — the server doesn't know what the client already has cached. It pushes resources the client doesn't need.
- Bandwidth waste — pushed resources compete with resources the client actually requested.
- Complexity — getting push right requires intimate knowledge of the client's cache state.
Chrome removed server push support in 2022. The industry moved to 103 Early Hints instead — the server sends Link headers suggesting resources to preload, and the client decides whether to fetch them.
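For comparison, an Early Hints exchange looks like this: an informational 1xx response carrying Link headers, sent ahead of the final response (paths here are illustrative):

```
HTTP/1.1 103 Early Hints
Link: </style.css>; rel=preload; as=style
Link: </app.js>; rel=preload; as=script

HTTP/1.1 200 OK
Content-Type: text/html
...
```

The client can start fetching the hinted resources while the server is still generating the page, and it simply ignores hints for resources it already has cached.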
Stream Priorities and Flow Control
Not all resources are equal. The CSS needed for rendering is more important than a tracking pixel. HTTP/2 includes a priority system where clients can assign:
- Weight (1-256) — relative importance among sibling streams
- Dependency — parent-child relationships creating a priority tree
            Stream 1 (HTML)
            /            \
  Stream 3 (CSS)      Stream 5 (JS)
   weight: 256         weight: 128
        |
  Stream 7 (Font)
   weight: 200
In theory, the server uses this tree to allocate bandwidth. In practice, server implementations range from "fully respects priorities" (H2O) to "completely ignores them" (many Nginx versions). Chrome switched from the tree model to a simpler scheme, and HTTP/3 uses a different priority system entirely.
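The intended allocation is simple to sketch: bandwidth is split among sibling streams in proportion to weight (a minimal model of the RFC 7540 scheme; as noted above, real servers vary widely in how faithfully they implement it):

```python
# Sketch of weight-proportional bandwidth sharing among sibling streams,
# per the RFC 7540 priority model. Illustrative only.
def allocate(bandwidth, siblings):
    """Split bandwidth among (stream_id, weight) siblings by weight."""
    total = sum(weight for _, weight in siblings)
    return {sid: bandwidth * weight // total for sid, weight in siblings}

# CSS (weight 256) vs JS (weight 128) competing under the HTML stream:
print(allocate(9000, [(3, 256), (5, 128)]))  # → {3: 6000, 5: 3000}
```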
Flow control operates at two levels: per-stream and per-connection. Each has a window size (default 65,535 bytes) that the receiver adjusts with WINDOW_UPDATE frames. This prevents a fast sender from overwhelming a slow receiver and ensures one stream can't monopolize the connection.
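The sender-side accounting can be sketched like this: DATA frames consume both windows, WINDOW_UPDATE frames replenish them, and stream ID 0 addresses the connection-level window (a simplified model of RFC 7540 §6.9, not a real implementation):

```python
# Sketch of HTTP/2 flow-control accounting on the sender side. Both the
# per-stream and the per-connection window must have room to send DATA.
DEFAULT_WINDOW = 65_535

class FlowControl:
    def __init__(self):
        self.connection = DEFAULT_WINDOW
        self.streams = {}

    def can_send(self, stream_id, size):
        window = self.streams.setdefault(stream_id, DEFAULT_WINDOW)
        return size <= window and size <= self.connection

    def send_data(self, stream_id, size):
        assert self.can_send(stream_id, size)
        self.streams[stream_id] -= size
        self.connection -= size

    def window_update(self, stream_id, increment):
        if stream_id == 0:          # stream 0 = the whole connection
            self.connection += increment
        else:
            self.streams[stream_id] += increment

fc = FlowControl()
fc.send_data(1, 65_535)      # stream 1 exhausts its window
print(fc.can_send(3, 1))     # → False: the connection window is empty too
fc.window_update(0, 65_535)  # receiver replenishes the connection window
print(fc.can_send(3, 1))     # → True
```

Note how one greedy stream can still stall others by draining the connection-level window, which is exactly what the per-connection limit is meant to let the receiver control.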
The TCP Head-of-Line Blocking Irony
Here's the irony that led to HTTP/3: HTTP/2 solved application-layer head-of-line blocking but made TCP-layer blocking worse.
With HTTP/1.1's six connections, a packet loss on one connection only blocks requests on that connection. With HTTP/2's single connection, a single lost packet blocks all streams until TCP retransmits it. Under packet loss rates above 2%, HTTP/2 can actually perform worse than HTTP/1.1.
HTTP/1.1: 6 connections → packet loss affects 1/6 of requests
HTTP/2: 1 connection → packet loss affects ALL requests
This is why QUIC (HTTP/3) moved to UDP — it implements its own reliability per-stream, so a lost packet in one stream doesn't block others.
Migration Checklist
Moving to HTTP/2 is mostly transparent because the semantics (methods, headers, status codes) are identical to HTTP/1.1. Here's what to watch:
- Enable TLS — browsers require it for HTTP/2 (ALPN negotiation happens during the TLS handshake)
- Remove domain sharding — consolidate assets to a single origin to maximize multiplexing
- Remove concatenation hacks — sprite sheets and JS bundles can be split into individual files
- Stop inlining small resources — they can be served as separate streams now
- Verify the CDN/load balancer supports end-to-end HTTP/2 — many only terminate it at the edge
# Verify HTTP/2 support
curl -v --http2 https://example.com 2>&1 | grep "< HTTP/"
# Should show: < HTTP/2 200
# Check ALPN negotiation (expect "ALPN protocol: h2" in the output)
openssl s_client -connect example.com:443 -alpn h2 </dev/null
HTTP/2 was a massive leap forward, but its reliance on TCP left one critical problem unsolved. That's where HTTP/3 and QUIC enter the picture.
Key Points
- HTTP/2 multiplexes all requests over a single TCP connection, eliminating the need for domain sharding
- HPACK header compression reduces header overhead by 85-90% compared to HTTP/1.1's repeated text headers
- Server push sounded great in theory but has been removed from major browsers due to poor real-world performance
- Stream prioritization lets clients hint which resources matter most, but server implementations vary wildly
- TCP-level head-of-line blocking still exists — a single lost packet blocks ALL streams on the connection
Key Components
| Component | Role |
|---|---|
| Binary Framing Layer | Encodes all HTTP messages into binary frames with type, stream ID, and flags |
| Streams | Independent bidirectional sequences of frames within a single TCP connection |
| HPACK Compression | Stateful header compression using static/dynamic tables to eliminate redundant header bytes |
| Server Push | Allows the server to proactively send resources before the client requests them |
| Flow Control | Per-stream and per-connection windowing to prevent fast senders from overwhelming slow receivers |
When to Use
Use HTTP/2 for any public-facing web service. The multiplexing benefit is most pronounced when serving many small resources (APIs, SPAs with dozens of assets). For internal services, it shines with gRPC or high-fanout request patterns.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| Nginx | Open Source | HTTP/2 termination and reverse proxying with battle-tested performance | Millions of concurrent connections |
| Envoy Proxy | Open Source | HTTP/2 in service mesh environments with advanced observability | Cloud-native microservice architectures |
| Cloudflare | Managed | Automatic HTTP/2 at the edge with zero server-side config | Global CDN scale |
| HAProxy | Open Source | High-performance HTTP/2 load balancing with fine-grained control | Enterprise load balancing |
Debug Checklist
- Verify HTTP/2 is negotiated — check for h2 in the ALPN extension using curl --http2 -v
- Inspect stream IDs in Wireshark or Chrome DevTools Network tab (Protocol column shows h2)
- Check if HPACK dynamic table is being populated — initial requests will be larger
- Look for GOAWAY frames indicating the server is shutting down the connection
- Monitor stream reset (RST_STREAM) frames — they indicate per-stream errors without killing the connection
Common Mistakes
- Still using domain sharding with HTTP/2 — this hurts performance by splitting the single-connection advantage
- Assuming server push will speed up page loads — in practice it often pushes resources the client already has cached
- Not enabling HTTP/2 on the backend — many teams only enable it at the CDN edge, missing internal benefits
- Ignoring stream priorities — unoptimized servers treat all streams equally, defeating the purpose
- Thinking HTTP/2 requires TLS — the spec allows plaintext (h2c), though browsers mandate TLS in practice
Real World Usage
- Major websites saw 15-30% page load improvement by switching from HTTP/1.1 to HTTP/2
- gRPC uses HTTP/2 as its transport layer, leveraging multiplexing for bidirectional streaming
- CDNs like Cloudflare and Fastly terminate HTTP/2 from clients and can use it to communicate with origins
- Internal microservice communication benefits from HTTP/2's multiplexing to reduce connection overhead
- Mobile apps benefit significantly from HTTP/2's single connection, reducing battery drain from connection management