Connection Pool
A pool of pre-established connections (database, HTTP, gRPC) reused across requests. Building a connection per request is too slow (TCP setup, TLS handshake, auth). A pool provides bounded resource usage, faster requests, and back-pressure when the limit is reached.
What it is
A connection pool maintains a set of pre-established connections to some downstream (database, HTTP service, gRPC service, message broker). When the application needs to make a call, it borrows a connection from the pool, uses it, returns it. The pool reuses connections across many requests.
The reason pools exist: connection establishment is expensive. TCP handshake (1 round trip), TLS handshake (1-2 round trips), auth (1+ round trips). For a database, that's 10-50ms before any query runs. For HTTPS to a remote API, the cost is similar. Building a connection per request slows every request and ties up CPU on the server.
A pool gives near-zero connection overhead, bounded resource usage (no surprise spike of 10K connections to the database), and back-pressure when the limit is reached.
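To make the borrow/use/return cycle concrete, here is a minimal sketch of a fixed-size pool built on a blocking queue. Conn and SimplePool are hypothetical names for illustration; real pools layer health checks, idle timeouts, and recycling on top of this core.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

interface Conn extends AutoCloseable {} // stand-in for a real driver connection

final class SimplePool {
    private final BlockingQueue<Conn> idle;

    SimplePool(int size, Supplier<Conn> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) idle.add(factory.get()); // pre-establish
    }

    // Borrow: wait up to the acquire timeout, then fail fast (back-pressure).
    Conn borrow(long timeoutMs) throws InterruptedException {
        Conn c = idle.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (c == null) throw new IllegalStateException("pool exhausted");
        return c;
    }

    // Return: make the connection available to the next caller.
    void giveBack(Conn c) {
        idle.add(c);
    }
}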
What goes in a pool
Pool size limits: max connections, min idle. Cap how many can exist; keep some warm for instant use.
Acquire timeout: how long to wait for a connection when the pool is full. Fail fast rather than block forever.
Idle and lifetime timeouts: close connections that have been idle too long (server may have closed them); recycle connections periodically (DNS changes, load balancer rotation).
Health check on borrow: validate the connection works before handing it out. Cheap (a SELECT 1 query or a driver-level ping); catches dead connections before they fail user requests (sketch after this list).
Leak detection: log a stack trace when a connection has been held longer than expected. Catches forgotten close()/return() at development time.
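A sketch of the health-check-on-borrow step, using JDBC's standard Connection.isValid() (drivers typically implement it as a lightweight ping). checkOnBorrow is a hypothetical helper name, not a library API.

import java.sql.Connection;
import java.sql.SQLException;

// Returns the candidate if it still works, else null so the pool can
// discard it and try another instead of failing the caller's request.
static Connection checkOnBorrow(Connection candidate) {
    try {
        return candidate.isValid(1) ? candidate : null; // 1-second validation timeout
    } catch (SQLException e) {
        return null; // treat a validation failure as a dead connection
    }
}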
Sizing
Pool sizing is the most-asked, most-misunderstood question. The answer is Little's Law:
pool_size = throughput × average_latency
An app that runs 100 database queries per second at 50ms average latency has 5 queries in flight at a time (100 × 0.05). Add 2x headroom for bursts: a pool of 10.
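The same arithmetic as a short sketch, with the values from the example above:

double throughputPerSec = 100.0; // queries per second
double avgLatencySec = 0.050;    // 50ms average query latency
double inFlight = throughputPerSec * avgLatencySec; // = 5 busy on average
int poolSize = (int) Math.ceil(inFlight * 2.0);     // 2x burst headroom -> 10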
The wrong way to size: "we have 200 threads, set pool size to 200." This wildly oversizes the pool and overwhelms the downstream. Most threads at any moment are not in the database; a connection per thread isn't needed.
The other constraint: per-app-instance pool size × number of instances must fit within the downstream's connection limit. Postgres default max_connections is 100; with 10 instances and pool of 20 each, that totals 200 → over the limit. Either tune the database or put a connection multiplexer (PgBouncer for Postgres) in front.
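A back-of-envelope sketch of that budget. The reserved headroom for admin sessions and migrations is an assumed value, not a Postgres default:

int maxConnections = 100; // Postgres default max_connections
int reserved = 10;        // assumed headroom for admin sessions / migrations
int instances = 10;
int perInstancePool = (maxConnections - reserved) / instances; // = 9 per instance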
What goes wrong
Pool exhaustion. Pool is at max, every request blocks waiting for a connection, eventually times out. Causes: pool too small, OR connection leaks (acquired but never released). Diagnose with metrics (in-use count) and leak detection.
Connection leaks. Code that opens a connection without closing on the error path. The fix is try-with-resources / context manager / defer close(). Linters catch many of these.
Stale connections. Server-side closed the connection; pool still thinks it's good; next borrow fails. Health check on borrow plus a max-lifetime fix this.
Long-held connections. One slow query holds a connection for minutes; pool exhausted by that one request. Set query timeouts; long-running operations should use a separate pool or move out of the request path.
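A sketch of bounding query time with standard JDBC: Statement.setQueryTimeout makes the driver cancel the query when the deadline passes, so one slow query cannot hold a pooled connection indefinitely. The query text and the 5-second limit are illustrative.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

static void boundedQuery(Connection c) throws SQLException {
    try (Statement s = c.createStatement()) {
        s.setQueryTimeout(5); // seconds; driver cancels the query at the deadline
        try (ResultSet rs = s.executeQuery("SELECT 1")) {
            while (rs.next()) {
                // ... consume rows ...
            }
        }
    }
}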
When a pool is unnecessary
Standalone CLI scripts that do one batch of work. Test fixtures. Single-instance services with very low traffic. For everything else, pool.
Beyond databases
The same pattern applies to HTTP clients (gRPC channels, OkHttp, axios), message brokers (Kafka producers, NATS clients), even AWS SDK clients. Most modern client libraries pool connections internally; some don't. Check the docs and configure the pool size; defaults are often as low as one connection, which is wrong for any real load.
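For example, OkHttp exposes its internal pool through ConnectionPool. A sketch with illustrative values; note the first argument caps idle connections kept warm, not total connections:

import java.util.concurrent.TimeUnit;
import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;

OkHttpClient client = new OkHttpClient.Builder()
        .connectionPool(new ConnectionPool(
                20,                  // max idle connections kept warm
                5, TimeUnit.MINUTES  // idle keep-alive before close
        ))
        .build();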
Implementations
HikariCP is the standard JDBC pool. Most config is sane by default. Two settings are mandatory: maximumPoolSize (sized to the database's connection limit divided by app instances) and connectionTimeout (how long to wait for a connection before failing).
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.sql.Connection;

HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://db:5432/app");
config.setUsername(user);
config.setPassword(pass);
config.setMaximumPoolSize(20);            // most important setting
config.setMinimumIdle(5);
config.setConnectionTimeout(2_000);       // ms; fail fast if pool full
config.setIdleTimeout(60_000);
config.setMaxLifetime(30 * 60_000);       // recreate connections periodically
config.setLeakDetectionThreshold(10_000); // log if held >10s

HikariDataSource ds = new HikariDataSource(config);

try (Connection c = ds.getConnection()) { // borrow
    // ... use c ...
} // auto-returns to pool

Pool exhausted exception. Two causes: pool too small for traffic, OR connections leaked (acquired but never released). HikariCP's leakDetectionThreshold logs a stack trace when a connection is held longer than the threshold, pointing at the leak.
// Symptom: SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out

// Diagnosis steps:
// 1. Check pool metrics: in-use connections vs max. If always at max, pool too small.
// 2. Enable leak detection (10s threshold). Check logs for stack traces.
// 3. Look for try-with-resources omissions or async paths that don't return connections.

// Common bug:
Connection c = ds.getConnection(); // BAD: no try-with-resources
doStuff(c);                        // exception here -> connection leaked
c.close();                         // never reached

// Fix:
try (Connection c = ds.getConnection()) { // GOOD: auto-close
    doStuff(c);
}

Key points
- Connection establishment is expensive: TCP handshake, TLS handshake, auth. A pool amortises this.
- Pool size ≈ concurrent in-flight requests (throughput × latency), capped by the downstream's connection limit.
- Acquire blocks (or fails) when the pool is empty; this is the back-pressure mechanism.
- Health check: validate the connection on borrow; broken connections must be discarded, not handed out.
- Idle timeout: connections that have been idle too long are closed (the server may have closed them already).
Follow-up questions
- How big should the connection pool be?
- What is the relationship between pool size and database max_connections?
- Should there be separate pools for read replicas and writers?
- Why recycle connections (maxLifetime)?
Gotchas
- Pool size larger than the database can handle: every request fails once the DB's max_connections is hit
- Forgetting to release a connection (no try-with-resources): pool leak, eventual exhaustion
- No idle timeout: server-closed connections stay in the pool and fail on the next borrow
- A long-running query holds a connection: the pool is exhausted by one slow request
- Same pool for reads and writes: a write spike starves reads, or vice versa