Connection Pooling & Keep-Alive
Connection pooling amortizes the cost of TCP and TLS setup across thousands of requests by reusing pre-established connections.
The Problem
How can an application avoid paying the 200-300ms cost of TCP + TLS connection establishment for every single request when handling thousands of requests per second?
Mental Model
Like a car rental fleet — pre-warmed cars ready to go instead of manufacturing a new one each time. Pick one up, drive it, return it, and someone else uses it next.
How It Works
Creating a TCP connection is expensive. Here's the real cost breakdown for a typical HTTPS request to a server 50ms away:
| Step | Time | Cumulative |
|---|---|---|
| DNS resolution | 20ms | 20ms |
| TCP 3-way handshake | 50ms (1 RTT) | 70ms |
| TLS 1.3 handshake | 50ms (1 RTT) | 120ms |
| Slow start ramp-up | ~100ms to reach full speed | 220ms |
| Actual request + response | 10ms | 230ms |
That 10ms database query took 230ms because of connection overhead. At 100 queries per second, that's 22 seconds of cumulative setup time wasted every second. This is why connection pooling exists.
A connection pool pre-establishes a set of TCP connections and keeps them alive. When the application needs a connection, it borrows one from the pool (sub-millisecond). When done, it returns the connection to the pool. The TCP + TLS handshake cost is paid once, amortized across thousands of requests.
Pool Lifecycle
Every connection in a pool goes through these states:
[Created] → [Idle in Pool] → [Borrowed by App] → [In Use] → [Returned to Pool] → [Idle in Pool]

From [Idle in Pool], a connection can also leave the cycle:
  idle too long      → [Evicted] → [TCP FIN → Closed]
  lifetime exceeded  → [Evicted] → [TCP FIN → Closed]
Creation: The pool establishes a TCP connection (and TLS if needed). Some pools pre-warm a minimum number of connections at startup; others create lazily on first demand.
Borrowing: The application requests a connection. The pool returns an idle one (O(1) operation). If none are available and the pool is at max size, the request blocks until one is returned or a timeout fires.
Validation: Before handing over the connection, the pool can optionally run a health check (SELECT 1 for databases, or just check the socket state). This catches dead connections that look alive because the TCP FIN was never received.
Return: The application returns the connection. The pool resets any state (transaction rollback for databases) and marks it idle.
Eviction: Connections that exceed their idle timeout or maximum lifetime are closed. This prevents issues with firewalls, NATs, and load balancers that silently drop idle connections.
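The whole lifecycle fits in a short sketch. Below is a toy pool in Java built on a generic connection factory, meant only to illustrate the states above; it is not how HikariCP or PgBouncer is implemented.

// Illustrative toy pool: borrow / validate / return / evict (not a real pool implementation)
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class SimplePool<C> {
    // Wrapper that remembers when the underlying connection was created
    record Pooled<T>(T conn, long createdAt) {}

    private final BlockingDeque<Pooled<C>> idle = new LinkedBlockingDeque<>();
    private final Supplier<C> factory;      // performs the expensive TCP/TLS/auth setup
    private final long maxLifetimeMillis;   // connections older than this are recycled

    SimplePool(Supplier<C> factory, int minIdle, long maxLifetimeMillis) {
        this.factory = factory;
        this.maxLifetimeMillis = maxLifetimeMillis;
        for (int i = 0; i < minIdle; i++) {  // pre-warm a minimum number of connections
            idle.add(new Pooled<>(factory.get(), System.currentTimeMillis()));
        }
    }

    // Borrow: O(1) when an idle connection exists; block up to timeoutMillis otherwise
    Pooled<C> borrow(long timeoutMillis) throws InterruptedException {
        Pooled<C> p = idle.pollFirst(timeoutMillis, TimeUnit.MILLISECONDS);
        if (p == null) throw new IllegalStateException("pool exhausted: timed out waiting");
        if (System.currentTimeMillis() - p.createdAt() > maxLifetimeMillis) {
            // Eviction: lifetime exceeded, so discard and hand out a fresh connection instead
            return new Pooled<>(factory.get(), System.currentTimeMillis());
        }
        return p;  // a real pool would also validate here (e.g. SELECT 1 or a socket check)
    }

    // Return: a real pool resets session state (rollback, clear settings) before marking idle
    void giveBack(Pooled<C> p) {
        idle.addFirst(p);
    }
}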
HTTP Keep-Alive
HTTP/1.0 closed the TCP connection after every request-response pair. HTTP/1.1 changed this with persistent connections (keep-alive): the connection stays open for subsequent requests.
HTTP/1.0 (no keep-alive):
[TCP Handshake] → GET /page → Response → [TCP Close]
[TCP Handshake] → GET /style.css → Response → [TCP Close]
[TCP Handshake] → GET /script.js → Response → [TCP Close]
Cost: 3 handshakes
HTTP/1.1 (keep-alive):
[TCP Handshake] → GET /page → Response → GET /style.css → Response → GET /script.js → Response
Cost: 1 handshake
In HTTP/1.1, keep-alive is the default. Disabling it requires explicitly sending Connection: close. But HTTP/1.1 still has a limitation: requests are serialized on each connection. A second request cannot be sent until the first response is complete (head-of-line blocking). Browsers work around this by opening 6 parallel connections per domain.
HTTP/2 solved this with multiplexing: multiple concurrent request-response streams on a single connection. One TCP connection handles everything, making connection pooling even more efficient.
# Check if keep-alive is working with curl
curl -v --http1.1 https://example.com 2>&1 | grep -i "connection"
# Look for: Connection: keep-alive
# NOT: Connection: close
# nginx keep-alive configuration
# upstream keepalive — pool of persistent connections to backends
upstream backend {
    server 10.0.1.50:8080;
    keepalive 32;              # Keep up to 32 idle connections to this upstream
}

# For upstream keepalive to take effect, the proxied location also needs:
#   proxy_http_version 1.1;
#   proxy_set_header Connection "";
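The same reuse applies inside application code: as long as one HTTP client instance is shared, its internal pool handles keep-alive and HTTP/2 multiplexing. A minimal sketch with Java's built-in java.net.http.HttpClient (example.com is just a placeholder host):

// One shared HttpClient keeps connections open and reuses them across requests
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepAliveDemo {
    // Reuse this instance; creating a new client per request defeats pooling
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_2)   // negotiates HTTP/2, falls back to HTTP/1.1 keep-alive
            .build();

    public static void main(String[] args) throws Exception {
        for (String path : new String[]{"/page", "/style.css", "/script.js"}) {
            HttpRequest req = HttpRequest.newBuilder(URI.create("https://example.com" + path)).GET().build();
            HttpResponse<String> resp = CLIENT.send(req, HttpResponse.BodyHandlers.ofString());
            // Only the first request pays the TCP + TLS handshake; the rest reuse the connection
            System.out.println(path + " -> " + resp.statusCode());
        }
    }
}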
Database Connection Pooling
Database connections are even more expensive than HTTP connections. A PostgreSQL connection, for example:
- TCP handshake (1 RTT)
- TLS handshake if using SSL (1 RTT)
- PostgreSQL authentication handshake (1-2 RTTs)
- PostgreSQL forks a new backend process (~10-50ms)
- Backend process allocates memory (~5-10 MB per connection)
That's 3-5 RTTs plus process creation. On a cloud database with 10ms RTT, that's 50-80ms per new connection. PostgreSQL is limited to a few hundred connections before memory exhaustion becomes a problem (each backend process takes 5-10 MB).
HikariCP Configuration (JVM)
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://db:5432/myapp");
config.setMaximumPoolSize(20);            // Max connections in pool
config.setMinimumIdle(5);                 // Keep at least 5 ready
config.setIdleTimeout(300000);            // Close idle after 5 min
config.setMaxLifetime(1800000);           // Recycle after 30 min
config.setConnectionTimeout(10000);       // Wait max 10s for a connection
config.setLeakDetectionThreshold(60000);  // Warn if held > 60s
HikariDataSource ds = new HikariDataSource(config);
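Borrowing and returning then happens through getConnection() and close(): with a pooled data source, close() hands the connection back to the pool instead of tearing down the socket. A usage sketch (the SELECT 1 query is just a stand-in):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// try-with-resources guarantees the connection goes back to the pool, even on error paths
try (Connection conn = ds.getConnection();
     PreparedStatement stmt = conn.prepareStatement("SELECT 1");
     ResultSet rs = stmt.executeQuery()) {
    while (rs.next()) {
        System.out.println(rs.getInt(1));
    }
}  // conn.close() here returns the connection to the pool, not to the OS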
PgBouncer: The PostgreSQL Multiplexer
PgBouncer sits between the application and PostgreSQL, multiplexing many client connections onto a small number of server connections. Three pooling modes:
- Session pooling: Client gets a dedicated server connection for the entire session. Safest, least efficient.
- Transaction pooling: Client gets a server connection for one transaction, then it's returned. Most common in production.
- Statement pooling: Client gets a server connection for one statement. Most efficient but breaks multi-statement transactions.
; pgbouncer.ini
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000 ; Accept up to 1000 client connections
default_pool_size = 20 ; But only use 20 PostgreSQL connections
reserve_pool_size = 5 ; Extra 5 for burst traffic
server_idle_timeout = 300 ; Close idle server connections after 5 min
This is essential for serverless architectures. AWS Lambda can spin up hundreds of instances simultaneously, each wanting a database connection. Without PgBouncer (or RDS Proxy), PostgreSQL's connection limit gets exhausted instantly.
Pool Sizing: The Science
The most common question is "how big should my pool be?" The answer involves Little's Law:
L = λ × W
Where:
L = average number of connections in use
λ = request arrival rate (requests per second)
W = average time a connection is held (seconds)
If a service handles 1000 requests/second and each request holds a database connection for 5ms:
L = 1000 × 0.005 = 5 connections
That's only 5 connections! Add headroom for variance (2-3x), and 10-15 connections handle the load. The common mistake is setting the pool to 100+ "just in case" and then exhausting database resources.
PostgreSQL maintainers, the HikariCP author, and most database practitioners agree: a smaller pool with a wait queue outperforms a larger one, because it reduces context switching, lock contention, and memory usage on the database server.
The Formula
For most workloads:
pool_size = (core_count * 2) + effective_spindle_count
For SSD-backed databases with 4 CPU cores: (4 * 2) + 1 = 9 connections. Yes, really. Nine connections can handle thousands of concurrent requests if each query takes single-digit milliseconds.
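Both formulas are easy to sanity-check in a few lines; the numbers below are the examples from this section:

// Pool sizing back-of-the-envelope, using the figures from the text above
public class PoolSizing {
    public static void main(String[] args) {
        // Little's Law: L = lambda * W
        double arrivalRate = 1000.0;        // requests per second
        double holdTime = 0.005;            // each request holds a connection for 5 ms
        double inUse = arrivalRate * holdTime;       // = 5 connections on average
        double withHeadroom = inUse * 3;             // 2-3x headroom for variance, ~15

        // Rule of thumb: (core_count * 2) + effective_spindle_count
        int coreCount = 4;
        int effectiveSpindles = 1;                   // SSD-backed database
        int ruleOfThumb = coreCount * 2 + effectiveSpindles;   // = 9 connections

        System.out.printf("Little's Law: %.0f in use, ~%.0f with headroom%n", inUse, withHeadroom);
        System.out.println("Rule of thumb: " + ruleOfThumb + " connections");
    }
}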
Timeout Alignment: The Silent Killer
The most insidious connection pooling bug is timeout mismatch. Every layer in the stack has its own idle timeout:
| Layer | Default Timeout | What Happens |
|---|---|---|
| Application pool (HikariCP) | 10 minutes | Closes connection after idle period |
| PgBouncer / ProxySQL | 5 minutes | Kills idle server connections |
| Cloud load balancer (AWS ALB) | 60 seconds | Resets idle TCP connections |
| Firewall / NAT | 5 minutes (varies) | Silently drops idle connections |
| Database server | Infinite (PostgreSQL) | Never closes on its own |
If HikariCP thinks a connection is alive (idle timeout 10 min) but the ALB killed it at 60 seconds, the next request on that connection gets a Connection reset by peer error. The fix: set the application pool's idle timeout to be shorter than the most aggressive intermediate timeout.
# Find the most restrictive timeout in the path
# Check ALB idle timeout (idle_timeout.timeout_seconds is a load balancer attribute)
aws elbv2 describe-load-balancer-attributes --load-balancer-arn $ARN | grep idle_timeout
# Check PostgreSQL TCP keep-alive setting
psql -c "SHOW tcp_keepalives_idle;"
# Check Linux TCP keep-alive
sysctl net.ipv4.tcp_keepalive_time
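In HikariCP terms, alignment means making the application pool the most impatient layer in the table above. A sketch against the 60-second ALB idle timeout; the exact values are illustrative, not recommended defaults, and setKeepaliveTime requires a recent HikariCP version:

// Continuing the HikariConfig from the earlier example
config.setIdleTimeout(45_000);     // 45s: evict idle connections before the 60s ALB reset
config.setMaxLifetime(240_000);    // 4 min: recycle before the 5-min PgBouncer/firewall timeouts
config.setKeepaliveTime(30_000);   // ping idle connections every 30s so intermediaries see traffic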
gRPC and HTTP/2: Built-in Pooling
gRPC uses HTTP/2 persistent connections with multiplexed streams. A single gRPC connection between two services can handle thousands of concurrent RPCs. This is effectively a connection pool of size 1, but with multiplexing that makes it equivalent to hundreds of HTTP/1.1 connections.
// Go gRPC client — one connection, many concurrent RPCs
import (
    "time"

    "google.golang.org/grpc"
    "google.golang.org/grpc/keepalive"
)

conn, _ := grpc.Dial("service:50051",
    grpc.WithTransportCredentials(creds), // creds: TLS transport credentials set up elsewhere
    grpc.WithKeepaliveParams(keepalive.ClientParameters{
        Time:                10 * time.Second, // Ping every 10s if idle
        Timeout:             3 * time.Second,  // Wait 3s for pong
        PermitWithoutStream: true,             // Ping even with no active RPCs
    }),
)
// All RPCs share this single connection (pb is the generated protobuf client package)
client := pb.NewMyServiceClient(conn)
The keep-alive pings serve as health checks, ensuring the connection is alive and all intermediate proxies/load balancers keep the connection in their state tables.
Production Checklist
Before deploying connection pooling:
- Set pool size based on Little's Law, not gut feeling. Measure the actual query duration and request rate.
- Align timeouts across all layers: application pool, proxy, load balancer, firewall, database.
- Enable leak detection to catch code paths that borrow connections but never return them.
- Monitor pool metrics: active/idle/total connections, wait time, timeout count, creation rate.
- Test failover: Kill the database and verify the pool detects the dead connections and creates new ones to the failover target.
- Warm up gradually: Don't create all connections at once on startup — it can overwhelm the database.
Connection pooling is boring infrastructure work. But it's the difference between a 10ms API response and a 230ms one. Get it right once, and it fades into the background. Get it wrong, and mysterious connection resets will surface at 3 AM.
Key Points
- A new TCP connection costs 1 RTT (handshake) + 1-2 RTT (TLS) + slow start ramp-up. On a 100ms link, that's 200-300ms before data flows at full speed
- HTTP/1.1 Keep-Alive was the first step: reuse the TCP connection for sequential requests. HTTP/2 took it further with multiplexed concurrent requests on one connection
- Database connection pools (HikariCP, PgBouncer) are critical because database handshakes are even more expensive than HTTP — PostgreSQL's fork-per-connection model makes this essential
- Pool sizing is a balance: too few connections causes queuing, too many exhausts server resources. Little's Law (L = λ × W) is the guide
- Connection health checking prevents borrowing dead connections. A stale connection that fails on first use is worse than creating a new one
Key Components
| Component | Role |
|---|---|
| Connection Pool | A cache of pre-established TCP connections ready for immediate reuse, avoiding handshake overhead |
| Keep-Alive | HTTP mechanism that holds TCP connections open between requests instead of closing after each response |
| Pool Lifecycle Manager | Handles connection creation, validation, borrowing, returning, health checking, and eviction |
| Idle Timeout | Evicts connections that haven't been used within a threshold to free up resources on both client and server |
| Max Pool Size | Caps the number of open connections to prevent resource exhaustion on either side |
When to Use
Always. Every production system should use connection pooling for database connections, HTTP client connections, and service-to-service calls. The only exception is truly one-shot scripts that make a single request.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| HikariCP | Open Source | JVM database connection pooling — fastest, most battle-tested pool for Java/Kotlin | Any |
| PgBouncer | Open Source | PostgreSQL connection pooling proxy — essential for serverless and high-connection-count environments | Medium-Enterprise |
| ProxySQL | Open Source | MySQL/MariaDB connection pooling with query routing, caching, and read/write splitting | Medium-Enterprise |
| Envoy Proxy | Open Source | HTTP/gRPC connection pooling for service mesh with circuit breaking and outlier detection | Large-Enterprise |
Debug Checklist
- Monitor pool metrics: active connections, idle connections, wait time, timeout count. HikariCP and PgBouncer both expose these via metrics endpoints
- Check for connection leaks: if active connections grow monotonically without returning, code paths are not releasing connections (use leak detection features)
- Verify timeout alignment: if the pool's idle timeout is 30 minutes but the load balancer kills connections at 5 minutes, expect connection resets
- Watch for connection storms at startup: if all connections are created simultaneously, the database may reject them. Use lazy initialization or gradual warm-up
- Test failover behavior: when a database fails over, pooled connections to the old primary are dead. The pool must detect this and create new connections to the new primary
Common Mistakes
- Setting max pool size equal to max threads. With 200 threads and 200 DB connections, connections are held during CPU work. 20-30 connections typically serve 200 threads
- Not configuring idle timeout. Connections sitting idle for minutes get killed by firewalls, NATs, or load balancers — then the app gets a surprise 'connection reset'
- Leaking connections by not returning them to the pool in error paths. Always use try-finally (or equivalent) to guarantee connection return
- Ignoring connection validation. A TCP connection can be half-closed without either side knowing. Validate before use with a lightweight query (SELECT 1)
- Using default pool settings in production. Every database, cloud provider, and load balancer has different timeout defaults — they must be aligned
Real World Usage
- Amazon RDS Proxy pools database connections for Lambda functions, solving the serverless-to-database connection explosion problem
- Envoy in Istio service mesh maintains connection pools to upstream services with configurable max connections, circuit breaking, and health checking
- nginx upstream keepalive maintains a pool of persistent connections to backend servers, reducing latency by 50-80% for proxied requests
- gRPC uses persistent HTTP/2 connections with multiplexed streams — effectively a built-in connection pool with no additional configuration
- Redis connection pools in applications like Sidekiq prevent connection storms when many workers start simultaneously