NAT — Network Address Translation
NAT lets thousands of private IPs hide behind one public IP using port mapping — essential for IPv4 survival but a headache for peer-to-peer.
The Problem
With only 4.3 billion IPv4 addresses and 15+ billion connected devices, we need a way for multiple devices to share a single public IP. NAT solves this by rewriting addresses on the fly — but it also breaks direct device-to-device communication, creating challenges for P2P applications.
Mental Model
Like a hotel front desk forwarding calls. External callers dial the hotel's main number (public IP). The front desk (NAT) looks up the room number (internal IP) and forwards the call. Outgoing calls show the hotel's number on caller ID, not the room's. But if someone outside tries to call a room directly — they cannot, because room numbers are internal only.
How It Works
NAT rewrites IP addresses (and often port numbers) in packet headers as they pass through a router. The most common form — PAT (Port Address Translation) — allows thousands of devices to share a single public IP by assigning each connection a unique source port on the public side.
The NAT Translation Process
Here is what happens when a device at 192.168.1.10 opens a connection to 93.184.216.34:443 (example.com):
- Outgoing packet: Source 192.168.1.10:5000 → Destination 93.184.216.34:443
- NAT rewrites source: Source 203.0.113.1:40001 → Destination 93.184.216.34:443
- NAT records the mapping: 192.168.1.10:5000 ↔ 203.0.113.1:40001 → 93.184.216.34:443
- Response arrives: Source 93.184.216.34:443 → Destination 203.0.113.1:40001
- NAT rewrites destination: Source 93.184.216.34:443 → Destination 192.168.1.10:5000
The external server never sees the private IP. It thinks it is talking to 203.0.113.1.
# View active NAT translations on a Linux router
sudo conntrack -L | head -20
# tcp 6 431998 ESTABLISHED src=192.168.1.10 dst=93.184.216.34 sport=5000 dport=443
# src=93.184.216.34 dst=203.0.113.1 sport=443 dport=40001 [ASSURED]
# Set up basic NAT with iptables (what Docker does internally)
sudo iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j MASQUERADE
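The rewrite-and-record steps above can be sketched as a toy PAT table in Python — a simulation of the mapping logic, not a real packet path; the class name and port numbers are illustrative (the first allocated port is set to 40001 to match the example):

```python
# Toy PAT table: one public IP, a fresh public source port per internal flow.
PUBLIC_IP = "203.0.113.1"

class PatTable:
    def __init__(self, first_port=40001):
        self.next_port = first_port
        self.out = {}   # (src_ip, src_port, dst) -> public source port
        self.back = {}  # public source port -> (src_ip, src_port)

    def translate_out(self, src_ip, src_port, dst):
        """Rewrite the source of an outgoing packet, recording the mapping."""
        key = (src_ip, src_port, dst)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = (src_ip, src_port)
            self.next_port += 1
        return (PUBLIC_IP, self.out[key])

    def translate_in(self, public_port):
        """Rewrite the destination of a reply; None = unsolicited, dropped."""
        return self.back.get(public_port)

nat = PatTable()
print(nat.translate_out("192.168.1.10", 5000, ("93.184.216.34", 443)))
# ('203.0.113.1', 40001) — the only address the server ever sees
print(nat.translate_in(40001))  # ('192.168.1.10', 5000)
print(nat.translate_in(12345))  # None: no mapping, packet is dropped
```

The `None` case is the crux for P2P: an inbound packet that matches no existing mapping has nowhere to go.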
NAT Types and P2P Implications
Not all NATs behave the same. The type determines whether peer-to-peer connections (WebRTC, VoIP, gaming) can work:
| NAT Type | Behavior | P2P Friendly? |
|---|---|---|
| Full Cone | Any external host can send to the mapped port | Yes — easy hole punching |
| Address Restricted | Only the destination IP can send back | Mostly — needs STUN |
| Port Restricted | Only the exact destination IP:port can send back | Harder — needs STUN |
| Symmetric | Different mapping for each destination | Very hard — often needs TURN relay |
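The cone-vs-symmetric distinction in the table can be checked with a toy comparison: have the same internal socket contact two external servers and compare the public mapping each one reports. This is a simplification — real STUN-based detection probes several more cases to separate the cone subtypes:

```python
# Toy cone-vs-symmetric check from two observed mappings.
def classify(mapping_seen_by_a, mapping_seen_by_b):
    """Each argument is the (public_ip, public_port) one server observed."""
    if mapping_seen_by_a == mapping_seen_by_b:
        return "cone-style"   # mapping reused: hole punching is feasible
    return "symmetric"        # per-destination mapping: likely needs TURN

print(classify(("203.0.113.1", 40001), ("203.0.113.1", 40001)))  # cone-style
print(classify(("203.0.113.1", 40001), ("203.0.113.1", 52833)))  # symmetric
```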
The challenge with P2P is that both sides are behind NAT. Neither can accept unsolicited inbound connections. The solution is hole punching:
- Both clients contact a public STUN server to discover their public IP:port mapping
- The STUN server tells each client the other's public IP:port
- Both clients simultaneously send packets to each other's public address
- The NAT sees outgoing traffic and creates a mapping
- When the other side's packet arrives, it matches the mapping and gets forwarded
This works with Full Cone, Address Restricted, and Port Restricted NAT. With Symmetric NAT, the mapping changes per destination, so the port the STUN server sees is different from what the peer would need to use. In that case, the fallback is a TURN relay — a server that both sides connect to and that forwards traffic between them.
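The simultaneous-send step can be sketched on loopback. No real NAT is involved here — two local UDP sockets stand in for the peers, and their bound addresses stand in for the STUN-discovered mappings — but the exchange mirrors steps 3-5 of hole punching:

```python
import socket

# Loopback sketch of the simultaneous-send step of hole punching.
a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
a.bind(("127.0.0.1", 0)); a.settimeout(2)
b.bind(("127.0.0.1", 0)); b.settimeout(2)
addr_a, addr_b = a.getsockname(), b.getsockname()

# Both sides fire at the other's "public" address at the same time;
# on a real NAT these outgoing packets create the return mappings.
a.sendto(b"punch-from-a", addr_b)
b.sendto(b"punch-from-b", addr_a)

msg_at_b, _ = b.recvfrom(1024)
msg_at_a, _ = a.recvfrom(1024)
print(msg_at_b, msg_at_a)  # b'punch-from-a' b'punch-from-b'
a.close(); b.close()
```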
# Test the NAT type using a STUN client
stunclient stun.l.google.com:19302
# Binding test: success
# Local address: 192.168.1.10:54321
# Mapped address: 203.0.113.1:40001
NAT in Cloud Environments
Cloud NAT is conceptually the same but operationally different. AWS NAT Gateway, for example:
- Managed service — no EC2 instance to patch or monitor
- Scales automatically up to 55,000 simultaneous connections per destination
- Costs money — $0.045/hour ($32.40/month) + $0.045 per GB processed
- AZ-scoped — lives in a single Availability Zone, so deploy one per AZ for HA
# Check which public IP the instance NATs through
curl -s ifconfig.me
# 203.0.113.50 (should be the NAT Gateway's Elastic IP)
# Monitor NAT Gateway port allocation on AWS (via CloudWatch)
# Metric: ErrorPortAllocation — non-zero means port exhaustion
Production Considerations
Port Exhaustion
A single public IP provides 65,535 ports per protocol. With PAT, every outgoing connection uses a port. High-throughput services that make thousands of concurrent outbound connections (API gateways, crawlers, log shippers) can exhaust this pool.
Symptoms: Intermittent connection failures, "cannot assign requested address" errors, ErrorPortAllocation CloudWatch alarms on AWS NAT Gateway.
Solutions:
- Add more public IPs to the NAT device (AWS NAT Gateway supports up to 8 Elastic IPs, giving 8 x 65,535 = 524K ports)
- Reduce connection count with connection pooling and HTTP keep-alive
- Use multiple NAT Gateways and split traffic across them via routing
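Back-of-envelope math makes the exhaustion risk concrete. The sketch below assumes the full 65,535-port range is usable (real NAT devices typically reserve some ports) and that each connection holds its port for the full idle timeout:

```python
# When does a PAT pool run out, given a connection rate and hold time?
def seconds_until_exhaustion(public_ips, new_conns_per_sec, hold_time_sec,
                             ports_per_ip=65535):
    pool = public_ips * ports_per_ip
    in_use_at_equilibrium = new_conns_per_sec * hold_time_sec
    if in_use_at_equilibrium < pool:
        return None  # connections expire fast enough; pool never empties
    return pool / new_conns_per_sec  # seconds until the first failure

# One public IP, 500 new outbound connections/s, 350 s idle timeout:
print(seconds_until_exhaustion(1, 500, 350))  # ~131 s to exhaustion
# Eight Elastic IPs at the same rate: the pool holds.
print(seconds_until_exhaustion(8, 500, 350))  # None
```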
NAT Gateway Cost Optimization
AWS NAT Gateway data processing charges can be surprisingly expensive. Common cost traps:
| Traffic Pattern | Monthly Cost at 1 TB | Fix |
|---|---|---|
| S3 access via NAT | ~$45 | Use VPC Gateway Endpoint (free) |
| DynamoDB access via NAT | ~$45 | Use VPC Gateway Endpoint (free) |
| ECR image pulls via NAT | ~$45 | Use VPC Interface Endpoint (~$7) |
| Cross-AZ via NAT | ~$10/TB | Route within AZ or use PrivateLink |
# Check NAT Gateway data processing in AWS Cost Explorer
aws ce get-cost-and-usage \
--time-period Start=2025-01-01,End=2025-01-31 \
--granularity MONTHLY \
--metrics BlendedCost \
--filter '{"Dimensions":{"Key":"USAGE_TYPE","Values":["NatGateway-Bytes"]}}'
The biggest NAT Gateway cost savings comes from VPC endpoints. S3 and DynamoDB Gateway Endpoints are free and handle the most common traffic patterns. For a Kubernetes cluster pulling container images, an ECR Interface Endpoint pays for itself almost immediately.
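A rough cost model using the rates quoted above ($0.045/hr + $0.045/GB; real prices vary by region) shows why the per-GB term dominates at volume and why per-AZ gateways multiply the fixed cost:

```python
# Rough AWS NAT Gateway monthly cost model (illustrative rates).
HOURLY, PER_GB, HOURS_PER_MONTH = 0.045, 0.045, 720

def nat_gw_monthly_cost(gb_processed, gateways=1):
    fixed = gateways * HOURLY * HOURS_PER_MONTH
    return fixed + gb_processed * PER_GB

print(round(nat_gw_monthly_cost(1024), 2))              # 1 TB, one gateway
print(round(nat_gw_monthly_cost(1024, gateways=3), 2))  # one gateway per AZ
```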
NAT and WebRTC: A Practical Example
WebRTC — used by Google Meet, Zoom, Discord, and every browser-based video call — relies on ICE (Interactive Connectivity Establishment) to traverse NAT:
- Gather candidates: The client collects its local IPs (host candidates), queries STUN for its public IP (server-reflexive candidates), and allocates a TURN relay (relay candidates)
- Exchange candidates: Both peers share their candidate lists through a signaling server
- Connectivity checks: Both sides try every combination of candidates simultaneously
- Select best path: ICE picks the pair with the lowest latency that works — preferring direct over relayed
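The selection step can be sketched as "rank by worst candidate type, then latency." Real ICE (RFC 8445) computes numeric pair priorities; this toy version only captures the prefer-direct-over-relayed behavior, and the type names follow ICE's host/srflx/relay terminology:

```python
# Toy ICE pair selection among candidate pairs that passed checks.
TYPE_PREF = {"host": 0, "srflx": 1, "relay": 2}  # lower is better

def select_pair(working_pairs):
    """working_pairs: list of (local_type, remote_type, rtt_ms) tuples."""
    return min(working_pairs,
               key=lambda p: (max(TYPE_PREF[p[0]], TYPE_PREF[p[1]]), p[2]))

pairs = [("relay", "relay", 20),  # fastest path, but through a TURN relay
         ("srflx", "srflx", 45),  # direct P2P via STUN-discovered addresses
         ("srflx", "relay", 35)]
print(select_pair(pairs))  # ('srflx', 'srflx', 45): direct wins over relay
```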
In practice, about 85% of WebRTC connections succeed with STUN alone (direct P2P through NAT). The remaining 15% — mostly corporate networks with symmetric NAT or restrictive firewalls — require a TURN relay, which adds latency and server cost.
Connection success rates by NAT type:
- Full Cone + Full Cone: ~100% direct
- Restricted + Restricted: ~95% direct
- Port Restricted + Port Restricted: ~80% direct
- Symmetric + Any: ~15% direct (rest needs TURN)
This is why every production WebRTC deployment needs TURN servers as a fallback. Running TURN is not optional — without it, 10-15% of users simply cannot connect.
Key Points
- NAT is the reason the internet still works on IPv4. Without it, we would have run out of addresses in the late 1990s.
- PAT (overloaded NAT) maps thousands of internal connections to a single public IP using different source ports.
- NAT breaks the end-to-end principle of IP — devices behind NAT cannot receive unsolicited inbound connections.
- NAT type (Full Cone, Restricted, Symmetric) determines whether P2P protocols like WebRTC can establish direct connections.
- Cloud NAT Gateways (AWS NAT GW, GCP Cloud NAT) cost real money — $0.045/hr + $0.045/GB on AWS. Optimize traffic to reduce costs.
Key Components
| Component | Role |
|---|---|
| NAT Table | Maps internal IP:port pairs to external IP:port pairs, tracking active translations |
| Source NAT (SNAT) | Rewrites the source IP of outgoing packets from private to public, used for internet access |
| Destination NAT (DNAT) | Rewrites the destination IP of incoming packets from public to private, used for port forwarding |
| Port Address Translation (PAT) | Multiple internal IPs share a single public IP by using different source ports — the most common NAT type |
| NAT Gateway (Cloud) | Managed cloud service providing SNAT for private subnets to reach the internet without exposing instances |
When to Use
NAT is used whenever private IP addresses need to communicate with the public internet. In cloud environments, deploy NAT Gateways for private subnets that need outbound access. For P2P applications, implement STUN/TURN/ICE to traverse NAT.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| AWS NAT Gateway | Managed | Production-grade managed NAT with automatic scaling and HA within an AZ | Medium-Enterprise |
| iptables / nftables | Open Source | Self-managed NAT on Linux — full control, no per-GB cost, but HA is on the operator | Small-Enterprise |
| GCP Cloud NAT | Managed | Distributed NAT that scales per-VM without a single gateway instance | Medium-Enterprise |
| fck-nat | Open Source | EC2-based NAT instance at 1/10th the cost of AWS NAT Gateway for dev/staging | Small-Enterprise |
Debug Checklist
- Check NAT table entries: on Linux, run 'conntrack -L' or 'cat /proc/net/nf_conntrack' to see active translations.
- Verify outbound connectivity from private subnet: 'curl -s ifconfig.me' from an instance should return the NAT Gateway's public IP.
- Check for port exhaustion: monitor 'ErrorPortAllocation' CloudWatch metric on AWS NAT Gateway.
- Test NAT type: use a STUN client like 'stunclient stun.l.google.com:19302' to determine the NAT behavior.
- Verify NAT Gateway route: check that the private subnet's route table has 0.0.0.0/0 pointing to the NAT Gateway.
Common Mistakes
- Assuming NAT provides security. NAT hides internal IPs but is not a firewall. It does not inspect or filter traffic.
- Running out of NAT ports. A single NAT device has 65,535 ports per protocol per public IP. High-connection services hit this limit.
- Forgetting NAT Gateway costs. An AWS NAT Gateway processing 1TB/month costs ~$90 — and that adds up across multiple AZs.
- Not understanding NAT type implications for real-time apps. Symmetric NAT makes WebRTC hole punching nearly impossible.
- Using a single NAT Gateway across multiple AZs. If that AZ goes down, all private subnets lose internet access.
Real World Usage
- Every home router performs PAT, translating dozens of devices behind a single public IP from the ISP.
- AWS NAT Gateway is the standard for private subnets needing outbound internet access — used by virtually every production VPC.
- WebRTC (used by Google Meet, Zoom, Discord) uses STUN/TURN to traverse NAT for peer-to-peer audio/video.
- Gaming companies like Valve and Riot spend significant engineering effort on NAT traversal for low-latency multiplayer.
- Docker uses NAT (via iptables) to give containers internet access through the host's network interface.