gRPC & Protocol Buffers
gRPC combines protobuf's compact binary serialization with HTTP/2's multiplexing for fast, strongly-typed service communication.
The Problem
REST over JSON is flexible but loosely typed, verbose, and slow to serialize. How does a strongly-typed, high-performance RPC framework work across languages while supporting streaming?
Mental Model
Like a formal contract between services: both sides agree on the exact message format in advance, leaving no room for ambiguity.
How It Works
gRPC is a Remote Procedure Call framework that makes calling a method on a remote service feel like calling a local function. The service interface is defined in a .proto file, code generation runs, and the output is strongly-typed client stubs and server interfaces in the target language. The runtime handles serialization (protobuf), transport (HTTP/2), and cross-cutting concerns (deadlines, metadata, interceptors).
The key insight: by agreeing on a schema at build time, gRPC eliminates an entire class of bugs that plague REST APIs — mismatched field names, wrong types, missing fields, undocumented APIs. The .proto file IS the documentation, the contract, and the code.
The Proto File — The API Contract
Everything starts with a .proto file:
```protobuf
syntax = "proto3";

package payment.v1;

import "google/protobuf/timestamp.proto";

service PaymentService {
  // Unary: single request, single response
  rpc ProcessPayment(PaymentRequest) returns (PaymentResponse);

  // Server streaming: single request, stream of responses
  rpc WatchTransactions(WatchRequest) returns (stream Transaction);

  // Client streaming: stream of requests, single response
  rpc BatchUpload(stream PaymentRecord) returns (BatchResult);

  // Bidirectional streaming: both sides stream
  rpc LiveReconciliation(stream ReconcileRequest) returns (stream ReconcileResponse);
}

message PaymentRequest {
  string idempotency_key = 1; // = 1 is a field number, not a default value
  int64 amount_cents = 2;
  string currency = 3;
  string merchant_id = 4;
  PaymentMethod method = 5;
}

message PaymentResponse {
  string transaction_id = 1;
  PaymentStatus status = 2;
  google.protobuf.Timestamp processed_at = 3;
}

enum PaymentStatus {
  PAYMENT_STATUS_UNSPECIFIED = 0;
  PAYMENT_STATUS_PENDING = 1;
  PAYMENT_STATUS_COMPLETED = 2;
  PAYMENT_STATUS_FAILED = 3;
}

enum PaymentMethod {
  PAYMENT_METHOD_UNSPECIFIED = 0;
  PAYMENT_METHOD_CARD = 1;
  PAYMENT_METHOD_ACH = 2;
}
```
Run protoc (or buf generate), and the output is typed client and server code in Go, Java, Python, Rust, C++, or any supported language. The field numbers (1, 2, 3) are the wire format identifiers — never reuse or change them.
Four Streaming Modes
gRPC's streaming capabilities over HTTP/2 are what set it apart from REST:
Unary (Request-Response): The familiar pattern. Client sends one request, server sends one response. Use for most CRUD operations.
Server Streaming: Client sends one request, server sends a stream of responses. Perfect for real-time feeds, log tailing, or paginating large result sets without repeated requests.
```go
// Server streaming example — watching price updates
stream, err := client.WatchPrices(ctx, &WatchRequest{Symbols: []string{"AAPL", "GOOG"}})
if err != nil {
	log.Fatalf("watch: %v", err)
}
for {
	price, err := stream.Recv()
	if err == io.EOF {
		break // server closed the stream
	}
	if err != nil {
		log.Fatalf("recv: %v", err)
	}
	fmt.Printf("%s: $%.2f\n", price.Symbol, price.Price)
}
```
Client Streaming: Client sends a stream of messages, server responds once when done. Use for batch uploads, sensor data ingestion, or any scenario where the client has many items to send.
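In Go, a client-streaming call might look like this (a sketch against the generated stubs for the BatchUpload RPC above; `client`, `ctx`, `records`, and the `ProcessedCount` result field are assumptions, not part of the generated API):

```go
// Client streaming sketch for the BatchUpload RPC defined above.
// client, ctx, and records are assumed to exist; ProcessedCount is a
// hypothetical field on BatchResult.
stream, err := client.BatchUpload(ctx)
if err != nil {
	log.Fatalf("open stream: %v", err)
}
for _, rec := range records {
	if err := stream.Send(rec); err != nil {
		log.Fatalf("send: %v", err) // the server may have closed the stream early
	}
}
// CloseAndRecv closes the send side and waits for the single BatchResult.
result, err := stream.CloseAndRecv()
if err != nil {
	log.Fatalf("close: %v", err)
}
log.Printf("server processed %d records", result.ProcessedCount)
```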
Bidirectional Streaming: Both sides send streams concurrently. Each side can read and write independently. Use for chat, collaborative editing, or real-time synchronization.
Deadlines and Cancellation — The Killer Feature Nobody Talks About
REST APIs have no built-in timeout propagation. If Service A calls B with a 5-second timeout, and B calls C, C has no idea that A is running out of patience. gRPC fixes this.
When a deadline is set on a gRPC call, it propagates automatically through the entire call chain:
```
Service A (deadline: 5s)
  → Service B  (remaining: 4.2s)
    → Service C  (remaining: 3.1s)
      → Database (remaining: 2.5s)
```
If Service A's deadline expires, the cancellation propagates backward. Service C stops its database query. Service B stops waiting for C. No wasted work, no zombie requests consuming resources.
```go
// Setting a deadline
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

resp, err := client.ProcessPayment(ctx, req)
if err != nil {
	st, _ := status.FromError(err)
	if st.Code() == codes.DeadlineExceeded {
		// Handle timeout — the entire call chain was cancelled
	}
}
```
Interceptors — Middleware Done Right
gRPC interceptors are the equivalent of HTTP middleware, but typed and composable:
```go
// Unary server interceptor for logging
func loggingInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	start := time.Now()
	resp, err := handler(ctx, req)
	log.Printf("method=%s duration=%s error=%v",
		info.FullMethod, time.Since(start), err)
	return resp, err
}

// Chain multiple interceptors
server := grpc.NewServer(
	grpc.ChainUnaryInterceptor(
		loggingInterceptor,
		authInterceptor,
		metricsInterceptor,
		rateLimitInterceptor,
	),
)
```
Common interceptor patterns: authentication (validate tokens from metadata), logging, metrics (Prometheus histograms for RPC duration), rate limiting, retry logic, and distributed tracing.
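As an illustration of the first pattern, an auth interceptor can reject a call before the handler ever runs by reading the token from incoming metadata (a sketch; `validateToken` is a hypothetical helper, the rest is the standard `grpc`, `metadata`, and `status` API):

```go
// Auth interceptor sketch. validateToken is a hypothetical helper.
func authInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return nil, status.Error(codes.Unauthenticated, "missing metadata")
	}
	tokens := md.Get("authorization")
	if len(tokens) == 0 || !validateToken(tokens[0]) {
		return nil, status.Error(codes.Unauthenticated, "invalid token")
	}
	return handler(ctx, req) // authenticated — proceed to the real handler
}
```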
Protobuf Wire Format — Why It's Fast
Protobuf's binary format is the reason gRPC outperforms JSON-based APIs:
```
JSON:     {"amount_cents": 9999, "currency": "USD"}
          → 41 bytes; string keys must be parsed, types coerced at runtime

Protobuf: [tag: field 2, varint] [9999] [tag: field 3, length 3] ["USD"]
          → 8 bytes; fields identified by number, decoded without ambiguity
```
Key design decisions that make protobuf fast:
- Field numbers instead of string keys — 1-2 bytes vs 10-20 bytes per field
- Varints for integers — small numbers use fewer bytes
- No self-description — the schema is known at compile time, so the wire format omits field names and types
- Cheap parsing — `bytes` and `string` fields can alias the input buffer in some implementations, approaching zero-copy
Schema Evolution — The Rules
Backward and forward compatibility are critical for evolving APIs without downtime:
Safe changes:
- Adding new fields (with new field numbers)
- Removing fields (but never reuse their field numbers)
- Renaming fields (the binary wire format uses numbers, not names; note that renames do break generated code and the proto JSON mapping)
Breaking changes:
- Changing a field's type
- Reusing a field number
- Changing a field from singular to repeated (or vice versa)
```protobuf
// Version 1
message User {
  string name = 1;
  string email = 2;
}

// Version 2 — backward compatible
message User {
  string name = 1;
  string email = 2;
  string phone = 3;          // New field — old clients ignore it
  repeated string roles = 4; // New field
  reserved 5, 6;             // Numbers of removed fields; never reuse them
  reserved "old_field";      // Reserve names to prevent accidental reuse
}
```
Use buf breaking to automatically detect breaking changes in CI. This is non-negotiable for any production gRPC service.
When gRPC Is Not the Answer
gRPC is outstanding for internal service communication, but it has real limitations:
- Browser support — browsers can't make native gRPC calls. gRPC-Web with an Envoy proxy is required, which adds complexity.
- Human readability — curling a gRPC service isn't straightforward (though grpcurl helps). Debugging requires specialized tools.
- Caching — HTTP caching doesn't work naturally with gRPC's POST-based transport. Application-level caching is required.
- Small teams — the protobuf toolchain adds build complexity that small teams may not want.
The pragmatic answer: use gRPC for internal service-to-service communication where performance and type safety justify the toolchain overhead. Use REST or GraphQL for public-facing APIs where developer experience and browser compatibility matter.
Key Points
- Protobuf messages are typically severalfold smaller than equivalent JSON and far faster to parse, making gRPC ideal for high-throughput internal communication
- gRPC supports four streaming modes: unary, server streaming, client streaming, and bidirectional streaming
- Deadlines propagate across services — if Service A gives Service B a 5s deadline, B's call to C carries the remaining time
- Code generation from .proto files ensures client and server always agree on the contract — no runtime surprises
- gRPC reflection allows runtime schema discovery, enabling tools like grpcurl to work without compiled protos
Key Components
| Component | Role |
|---|---|
| Protocol Buffers (Protobuf) | Language-neutral, binary serialization format that defines message schemas in .proto files |
| Service Definition | Declares RPC methods with typed request/response messages, generating client stubs and server interfaces |
| HTTP/2 Transport | Provides multiplexed streams for concurrent RPCs and enables bidirectional streaming |
| Interceptors | Middleware chain for cross-cutting concerns like logging, auth, metrics, and retries |
| Deadlines & Cancellation | Built-in timeout propagation across service boundaries — no more requests running forever |
When to Use
Use gRPC for internal service-to-service communication where performance and type safety matter. It excels in polyglot microservice architectures, real-time streaming, and any scenario requiring strongly-typed contracts. Avoid using it for public-facing browser APIs without gRPC-Web.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| gRPC | Open Source | High-performance, strongly-typed service-to-service communication with streaming | Google-scale internal infrastructure |
| Apache Thrift | Open Source | Cross-language RPC with multiple transport and protocol options | Facebook's internal services |
| Apache Avro RPC | Open Source | Schema-evolution-friendly RPC, especially in data pipeline ecosystems | Hadoop and Kafka ecosystems |
| Cap'n Proto | Open Source | Zero-copy serialization for maximum performance in latency-sensitive paths | Cloudflare Workers, specialized high-perf systems |
Debug Checklist
- Check gRPC status code — it's NOT the same as HTTP status codes (UNAVAILABLE, DEADLINE_EXCEEDED, etc.)
- Use grpcurl for ad-hoc testing — it's like curl for gRPC
- Enable gRPC channelz for connection-level debugging and stats
- Check deadline propagation — is the deadline set too tight for the call chain depth?
- Verify proto compatibility — use buf breaking to detect backward-incompatible changes
Common Mistakes
- Not setting deadlines — without them, a slow downstream service can hold connections open indefinitely
- Using gRPC for browser-facing APIs without gRPC-Web — browsers cannot make native HTTP/2 gRPC calls
- Breaking backward compatibility by changing proto field numbers instead of adding new fields
- Ignoring error details — gRPC status codes are richer than HTTP status codes, use google.rpc.Status for structured errors
- Sending large payloads (>4MB default) without increasing max message size or using streaming instead
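On the last point, raising the 4MB receive limit is a one-line option on each side (a sketch using real grpc-go options; `addr` is assumed, credentials setup is omitted, and 16MB is an arbitrary example value):

```go
// Client side: raise the per-call receive limit (default is 4MB).
conn, err := grpc.Dial(addr,
	grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(16*1024*1024)),
	// credentials and other dial options omitted
)

// Server side: raise the server's receive limit to match.
server := grpc.NewServer(grpc.MaxRecvMsgSize(16 * 1024 * 1024))
```

Prefer streaming over a raised limit when payloads grow unboundedly; the limit exists to protect servers from oversized single messages.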
Real World Usage
- Google uses gRPC for nearly all internal service communication — billions of RPCs per second
- Netflix uses gRPC between microservices, replacing their custom IPC framework
- Square uses gRPC for mobile-to-backend communication with protobuf schemas shared across iOS, Android, and server
- Uber's microservice architecture relies on gRPC for inter-service calls with deadline propagation
- Kubernetes uses gRPC extensively — kubelet-to-API-server, CRI (Container Runtime Interface), and CSI all use gRPC