API Performance Metrics
Why Averages Lie
If your API has an average latency of 200ms, that sounds fine. But averages are dangerous. If 95% of requests complete in 100ms and 5% take 2,100ms, your average is still 200ms. Those 5% of users are having a terrible experience, and you'd never know from the average.
This is why percentile-based measurement is standard practice. Track P50 (median, the typical experience), P95 (the experience of 1 in 20 users), and P99 (the experience of 1 in 100 users). For high-traffic APIs, P99.9 matters too. At 10 million requests per day, the 0.1% beyond P99.9 is still 10,000 requests hitting that worst-case latency.
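To make the arithmetic concrete, here is a minimal sketch in plain Python (using nearest-rank percentiles) that reproduces the numbers above:

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the value at position ceil(p/100 * N)."""
    ordered = sorted(values)
    return ordered[math.ceil(p / 100 * len(ordered)) - 1]

# The sample from above: 95% of requests at 100ms, 5% at 2,100ms.
latencies = [100] * 9_500 + [2_100] * 500

print(sum(latencies) / len(latencies))  # 200.0 -- the "fine" average
print(percentile(latencies, 50))        # 100   -- the typical experience
print(percentile(latencies, 95))        # 100   -- still fine at P95
print(percentile(latencies, 99))        # 2100  -- the tail the average hides
```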
Defining Your SLIs
A good API SLI combines latency and availability into something measurable. A common pattern (see the sketch after the list):
- Latency SLI: the P95 of request latencies, computed over a rolling 5-minute window, compared against a 300ms threshold
- Error rate SLI: proportion of requests returning non-5xx responses, measured over a rolling 5-minute window
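A minimal sketch of both SLIs, assuming each request in a window is recorded as a hypothetical (latency_ms, status_code) tuple:

```python
import math

def latency_sli(requests, threshold_ms=300):
    """True if the window's P95 latency is under the threshold.
    `requests` is a list of (latency_ms, status_code) tuples covering
    one rolling 5-minute window -- an assumed record format."""
    latencies = sorted(lat for lat, _ in requests)
    p95 = latencies[math.ceil(0.95 * len(latencies)) - 1]
    return p95 < threshold_ms

def error_rate_sli(requests):
    """Proportion of requests in the window that did not return a 5xx."""
    good = sum(1 for _, status in requests if status < 500)
    return good / len(requests)
```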
Set your SLO as a target for each SLI. For example: "99.9% of 5-minute windows will have P95 latency under 300ms." That translates to roughly 43 minutes of budget per month where you're allowed to miss. The error budget model from SRE practice makes this actionable: when budget runs low, freeze feature releases and focus on reliability.
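The budget arithmetic itself is simple. A sketch using the 99.9% target above:

```python
# Error budget for a 99.9% SLO over a 30-day month.
slo = 0.999
budget_minutes = (1 - slo) * 30 * 24 * 60
print(budget_minutes)                 # 43.2 -- the "roughly 43 minutes"

# Burn rate: each missed 5-minute window spends 5 minutes of budget.
bad_windows = 6                       # hypothetical count so far this month
print(f"{bad_windows * 5 / budget_minutes:.0%} of budget burned")  # 69%
```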
Performance Budgets Across the Call Chain
An API that calls three downstream services needs a latency budget for each hop. If your edge SLO is 500ms P95, and you have a gateway (20ms), an application server (processing + DB query), and an external API call, you need to allocate that 500ms across the chain.
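One way to write the allocation down explicitly. The specific splits here are illustrative, not prescriptive:

```python
# Hypothetical split of a 500ms P95 edge budget across the hops above.
EDGE_BUDGET_MS = 500

budget_ms = {
    "gateway":       20,   # fixed overhead, from the example above
    "app_server":   180,   # request processing + DB query
    "external_api": 250,   # slowest and least controllable hop
    "headroom":      50,   # reserved for network and queueing jitter
}
assert sum(budget_ms.values()) == EDGE_BUDGET_MS
```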
Instrument each segment with distributed tracing (OpenTelemetry, Jaeger, Datadog APM). When total latency exceeds budget, trace data tells you which segment is responsible. Without this breakdown, debugging latency regressions becomes guesswork.
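With OpenTelemetry's Python API, one span per budget segment is enough to attribute a regression. Span and service names here are illustrative, and SDK/exporter setup is omitted:

```python
from opentelemetry import trace

tracer = trace.get_tracer("example-api")  # hypothetical instrumentation name

def handle_request():
    # One span per segment of the latency budget. When the edge total
    # exceeds budget, the trace shows which segment overspent.
    with tracer.start_as_current_span("gateway"):
        ...  # auth, routing
    with tracer.start_as_current_span("app_server"):
        with tracer.start_as_current_span("db_query"):
            ...  # database round trip
    with tracer.start_as_current_span("external_api"):
        ...  # downstream call
```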
Real User Monitoring vs Synthetic
Synthetic monitoring (scheduled probes from fixed locations) gives you consistent baselines and catches outages fast. But it misses what real users experience. A user on a mobile connection in Mumbai has a very different experience than a synthetic check running from the same AWS region as your servers.
Real User Monitoring (RUM) captures actual user-side timings. It reveals geographic latency variance, device performance differences, and the real impact of network conditions. The tradeoff is data volume and noise. Use synthetic for alerting and SLO tracking. Use RUM for understanding the full distribution of user experience and identifying where to invest in performance improvements.
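A synthetic probe is just a scheduled, fixed request. A minimal sketch using the third-party `requests` library against a hypothetical health endpoint:

```python
import time
import requests  # third-party HTTP client, assumed installed

def synthetic_probe(url="https://api.example.com/health"):
    """One probe run: fixed location, fixed request, consistent baseline."""
    start = time.monotonic()
    try:
        resp = requests.get(url, timeout=5)
        latency_ms = (time.monotonic() - start) * 1000
        return {"ok": resp.status_code < 500, "latency_ms": latency_ms}
    except requests.RequestException:
        return {"ok": False, "latency_ms": None}
```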
Apdex as a Communication Tool
Apdex (Application Performance Index) converts latency into a 0-to-1 score. You define a target threshold T (say, 300ms). Requests under T are "satisfied," requests between T and 4T are "tolerating," requests over 4T are "frustrated." The formula: (satisfied + tolerating/2) / total.
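The scoring is easy to implement directly. A sketch with T = 300ms, reusing the tail-heavy sample from earlier:

```python
def apdex(latencies_ms, t=300):
    """Apdex: (satisfied + tolerating/2) / total, with threshold T in ms."""
    satisfied = sum(1 for lat in latencies_ms if lat <= t)
    tolerating = sum(1 for lat in latencies_ms if t < lat <= 4 * t)
    return (satisfied + tolerating / 2) / len(latencies_ms)

# The earlier sample: 95% at 100ms (satisfied), 5% at 2,100ms.
# 2,100ms > 4T = 1,200ms, so the tail counts as frustrated.
print(apdex([100] * 9_500 + [2_100] * 500))   # 0.95
```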
An Apdex of 0.95 is excellent. Below 0.85 and users are noticing. Below 0.70 and you have a real problem. The value of Apdex is that it's a single number you can put on an executive dashboard without explaining percentiles. Engineers should still look at the underlying percentile data for debugging, but Apdex bridges the gap to non-technical stakeholders.
Key Points
- P99 latency matters more than averages because averages hide tail latency that affects real users
- Apdex scores translate raw latency into a 0-1 satisfaction index that non-engineers can understand
- Performance budgets should be allocated across the call chain, not just set at the edge
- SLIs for APIs typically combine latency (P95 < threshold) and error rate (< threshold) into a composite
- Real User Monitoring captures what synthetic checks miss: geographic variance, device differences, network conditions
Common Mistakes
- Reporting average latency instead of percentiles, which hides the experience of your worst-affected users
- Setting latency SLOs without measuring the actual user-facing call chain end to end
- Monitoring only from a single region and missing latency problems that affect users in other geographies
- Ignoring throughput changes when analyzing latency, since latency often degrades under load