IP Addressing & Subnetting
CIDR notation carves the IP space into right-sized subnets — plan generously, document religiously, and never overlap ranges.
The Problem
Every device on a network needs a unique address, and there must be a way to group addresses into manageable chunks. Poor IP planning leads to address conflicts, wasted space, impossible VPC peering, and painful migrations. Getting subnetting right upfront saves months of rework later.
Mental Model
Like a zip code system. The country is the /8, the state is the /16, the city is the /24, and the street address is the host portion. Routers only need to know how to reach the zip code — they do not need to know every individual address.
Architecture Diagram
How It Works
Every device that communicates over IP needs a unique address. In IPv4, that is a 32-bit number usually written as four octets: 192.168.1.100. But raw IP addresses are useless without a way to organize them into networks and subnets. That is where subnetting and CIDR come in.
CIDR Notation
CIDR (Classless Inter-Domain Routing) uses a slash followed by a number to indicate how many bits of the address represent the network. The remaining bits represent individual hosts.
10.0.0.0/16
├── Network bits: 10.0 (first 16 bits) — identifies the network
└── Host bits: 0.0 (last 16 bits) — identifies hosts within the network
→ 2^16 = 65,536 possible addresses
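The same split can be checked with Python's standard `ipaddress` module. This is a small sketch; the variable names are only illustrative.

```python
import ipaddress

net = ipaddress.ip_network("10.0.0.0/16")

# The prefix length is the number of network bits; the rest are host bits.
network_bits = net.prefixlen                    # 16
host_bits = net.max_prefixlen - net.prefixlen   # 32 - 16 = 16

print(net.netmask)            # 255.255.0.0
print(net.num_addresses)      # 65536, i.e. 2 ** host_bits
print(net.network_address)    # 10.0.0.0
print(net.broadcast_address)  # 10.0.255.255
```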
The math is straightforward:
| CIDR | Subnet Mask | Total IPs | Usable Hosts (AWS) | Typical Use |
|---|---|---|---|---|
| /8 | 255.0.0.0 | 16,777,216 | n/a (AWS VPCs max out at /16) | Enterprise backbone |
| /16 | 255.255.0.0 | 65,536 | 65,531 | VPC |
| /20 | 255.255.240.0 | 4,096 | 4,091 | Large app subnet |
| /24 | 255.255.255.0 | 256 | 251 | Standard subnet |
| /28 | 255.255.255.240 | 16 | 11 | Small utility subnet |
On AWS, 5 IPs per subnet are reserved: network address, VPC router, DNS server, future use, and broadcast. GCP reserves 4. Azure reserves 5. Always account for this.
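The "Usable Hosts" column is just the block size minus the provider's reserved addresses. A minimal sketch of that arithmetic follows; the `RESERVED_PER_SUBNET` table and `usable_hosts` helper are hypothetical names, and the reserved counts are taken from the paragraph above rather than queried from any cloud API.

```python
import ipaddress

# Reserved addresses per subnet, as stated in the text (inputs, not an API lookup).
RESERVED_PER_SUBNET = {"aws": 5, "gcp": 4, "azure": 5}

def usable_hosts(cidr: str, provider: str = "aws") -> int:
    """Total addresses in the block minus the provider's reserved addresses."""
    total = ipaddress.ip_network(cidr).num_addresses
    return total - RESERVED_PER_SUBNET[provider]

for cidr in ("10.0.0.0/20", "10.0.0.0/24", "10.0.0.0/28"):
    print(cidr, usable_hosts(cidr))
# 10.0.0.0/20 4091
# 10.0.0.0/24 251
# 10.0.0.0/28 11
```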
# Quick CIDR math from the command line
ipcalc 10.0.0.0/20
# Network: 10.0.0.0/20
# Broadcast: 10.0.15.255
# HostMin: 10.0.0.1
# HostMax: 10.0.15.254
# Hosts/Net: 4094
# Check if two CIDRs overlap
python3 -c "
import ipaddress
a = ipaddress.ip_network('10.0.0.0/20')
b = ipaddress.ip_network('10.0.8.0/21')
print(f'Overlap: {a.overlaps(b)}')
"
# Overlap: True
Private Address Ranges (RFC 1918)
Three ranges are reserved for private use — they are not routable on the public internet:
| Range | CIDR | Total IPs | Common Use |
|---|---|---|---|
| 10.0.0.0 – 10.255.255.255 | 10.0.0.0/8 | 16.7M | Cloud VPCs, large enterprises |
| 172.16.0.0 – 172.31.255.255 | 172.16.0.0/12 | 1.05M | Docker default bridge, some clouds |
| 192.168.0.0 – 192.168.255.255 | 192.168.0.0/16 | 65K | Home networks, small offices |
Devices using private IPs reach the internet through NAT (Network Address Translation), which maps the private address to a public one. This is why a laptop has a 192.168.x.x address but websites see the ISP's public IP.
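To test whether an address falls inside one of the three RFC 1918 ranges, a membership check against those networks is enough. The sketch below uses a hypothetical helper, `is_rfc1918`; it checks the three ranges explicitly because the built-in `is_private` attribute also covers other special-purpose ranges such as loopback and link-local.

```python
import ipaddress

# The three RFC 1918 private ranges from the table above.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_rfc1918(address: str) -> bool:
    """True if the address lies in one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918)

print(is_rfc1918("192.168.1.100"))   # True
print(is_rfc1918("172.31.255.255"))  # True  (last address of 172.16.0.0/12)
print(is_rfc1918("172.32.0.1"))      # False (just outside the /12)
print(is_rfc1918("8.8.8.8"))         # False
```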
IPv4 Exhaustion and IPv6
IPv4 has about 4.3 billion addresses (2^32). That sounds like a lot until it is set against the 15+ billion connected devices. The global pool of unallocated IPv4 addresses was exhausted in 2011. The workarounds keeping IPv4 alive:
- NAT: Multiple devices share a single public IP (every home router does this)
- CGNAT: ISPs put another layer of NAT between the customer and the internet
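- IPv4 market: Companies buy and sell address blocks on a secondary market, where a /16 (65,536 addresses) has sold for anywhere from $400K to well over $1M depending on the year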
IPv6 solves this with 128-bit addresses: 2^128, or roughly 340 undecillion (3.4 × 10^38), which is enough to give every device many addresses with room to spare. An IPv6 address looks like 2001:0db8:85a3:0000:0000:8a2e:0370:7334, or shortened: 2001:db8:85a3::8a2e:370:7334.
# Check IPv6 connectivity
curl -6 https://ipv6.google.com
# See both IPv4 and IPv6 addresses on all interfaces
ip -4 addr show
ip -6 addr show
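The shortening rules (drop leading zeros in each group, collapse the longest run of all-zero groups into `::`) and the size of the two address spaces can be confirmed with the standard `ipaddress` module. This is an illustrative sketch; the variable names are arbitrary.

```python
import ipaddress

addr = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

print(addr.compressed)  # 2001:db8:85a3::8a2e:370:7334
print(addr.exploded)    # 2001:0db8:85a3:0000:0000:8a2e:0370:7334

# Size of each address space.
ipv4_space = 2 ** 32
ipv6_space = 2 ** 128
print(f"{ipv4_space:,}")    # 4,294,967,296
print(f"{ipv6_space:.2e}")  # 3.40e+38
```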
Production VPC Architecture
A well-designed production AWS VPC typically splits its address space into three subnet tiers, a pattern common at mid-to-large companies.
The Three-Tier Subnet Model
Public Subnets (/24): Hold resources that need direct internet access — Application Load Balancers, NAT Gateways, and bastion hosts. These subnets have a route to an Internet Gateway.
Private Subnets (/20): The workhorse subnets. Application servers, ECS tasks, EKS pods, and Lambda ENIs live here. These subnets route internet-bound traffic through a NAT Gateway in the public subnet. Use /20 or larger — Kubernetes clusters consume IPs aggressively (one IP per pod on AWS EKS with VPC CNI).
Data Subnets (/24): RDS instances, ElastiCache clusters, and other data stores. These have no internet route at all — not even through NAT. The only inbound traffic comes from the private subnets via security groups.
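# Terraform example: VPC with properly sized subnets

# Look up the availability zones in the current region (used by the subnets below)
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
}

# One private subnet per AZ, carved from the VPC range
resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 4, count.index) # /16 + 4 bits = /20 subnets
  availability_zone = data.aws_availability_zones.available.names[count.index]
}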
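To see which ranges the `cidrsubnet` call above produces, the same arithmetic can be approximated in Python. This is a sketch of the calculation, not Terraform's implementation; the Python function below only borrows the name and argument order for readability.

```python
import ipaddress

def cidrsubnet(prefix: str, newbits: int, netnum: int) -> str:
    """Approximate Terraform's cidrsubnet(): extend the prefix by `newbits`
    and return the `netnum`-th subnet of that size."""
    parent = ipaddress.ip_network(prefix)
    new_prefix = parent.prefixlen + newbits
    if new_prefix > parent.max_prefixlen:
        raise ValueError("not enough bits left in the parent prefix")
    if not 0 <= netnum < 2 ** newbits:
        raise ValueError("netnum does not fit in newbits")
    # Each child subnet spans this many addresses.
    step = 2 ** (parent.max_prefixlen - new_prefix)
    base = parent.network_address + netnum * step
    return str(ipaddress.ip_network(f"{base}/{new_prefix}"))

for index in range(3):
    print(cidrsubnet("10.0.0.0/16", 4, index))
# 10.0.0.0/20
# 10.0.16.0/20
# 10.0.32.0/20
```

A /16 split with 4 extra bits yields 16 possible /20 subnets, so three private subnets leave 13 blocks free for public, data, and future tiers.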
IP Planning for Multi-Account Architectures
With multiple AWS accounts (and there should be separate accounts for prod, staging, dev, security, shared-services), plan CIDR ranges so they never overlap:
| Account | VPC CIDR | Purpose |
|---|---|---|
| Production | 10.0.0.0/16 | Prod workloads |
| Staging | 10.1.0.0/16 | Pre-prod testing |
| Development | 10.2.0.0/16 | Dev environments |
| Shared Services | 10.10.0.0/16 | CI/CD, monitoring, DNS |
| Security | 10.20.0.0/16 | WAF, IDS, log aggregation |
These ranges do not overlap, so they can all be peered or connected via Transit Gateway. VPCs that start out with overlapping ranges eventually hit a wall: VPC peering and Transit Gateway routing will not work without adding NAT in the middle, which is an operational nightmare.
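A plan like the one in the table can be validated automatically before any VPC is created. The sketch below compares every pair of ranges; `PLAN`, `find_overlaps`, and the "Sandbox" entry are hypothetical names introduced for illustration.

```python
import ipaddress
from itertools import combinations

# CIDR plan from the table above.
PLAN = {
    "Production": "10.0.0.0/16",
    "Staging": "10.1.0.0/16",
    "Development": "10.2.0.0/16",
    "Shared Services": "10.10.0.0/16",
    "Security": "10.20.0.0/16",
}

def find_overlaps(plan: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of account names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

print(find_overlaps(PLAN))  # []

# A hypothetical new account that collides with Staging.
bad_plan = dict(PLAN, Sandbox="10.1.128.0/17")
print(find_overlaps(bad_plan))  # [('Staging', 'Sandbox')]
```

Running a check like this in CI against the documented plan is one way to catch a conflicting allocation before it reaches a peering request.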
Real-World Impact
The Kubernetes IP Problem
AWS EKS with the default VPC CNI plugin assigns a real VPC IP to every pod. A cluster with 50 nodes running 30 pods each needs 1,500 IPs just for pods, plus IPs for the nodes themselves, load balancers, and ENIs. A /24 subnet (251 usable IPs on AWS) is woefully insufficient.
This is why EKS-heavy architectures use /18 or /19 subnets for worker nodes, or switch to an overlay CNI like Calico that uses a separate IP space for pods.
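The sizing arithmetic can be sketched as follows. It is a deliberate simplification: it counts one IP per pod and one per node, and ignores load balancers, extra ENIs, and any addresses the CNI keeps pre-allocated. The `smallest_prefix` helper is a hypothetical name.

```python
AWS_RESERVED = 5  # addresses AWS reserves in every subnet

def smallest_prefix(required_ips: int, reserved: int = AWS_RESERVED) -> int:
    """Longest prefix (smallest subnet) with at least `required_ips` usable addresses."""
    for prefix in range(28, 15, -1):  # AWS subnets range from /28 to /16
        if 2 ** (32 - prefix) - reserved >= required_ips:
            return prefix
    raise ValueError("requirement does not fit in a single /16")

nodes, pods_per_node = 50, 30
required = nodes * pods_per_node + nodes  # 1,500 pod IPs + 50 node IPs

print(required)                   # 1550
print(smallest_prefix(required))  # 21
print(smallest_prefix(251))       # 24
```

A /21 is the bare minimum for this example with no headroom, which is why /19 or /18 is the safer choice once growth and rolling deployments are factored in.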
# Check how many IPs are left in the subnets
aws ec2 describe-subnets \
--filters "Name=vpc-id,Values=vpc-abc123" \
--query "Subnets[*].{ID:SubnetId,AZ:AvailabilityZone,CIDR:CidrBlock,Available:AvailableIpAddressCount}" \
--output table
Subnet Sizing Cheat Sheet
My rule of thumb for AWS VPC subnet sizing:
- Public subnets: /24 (251 usable IPs on AWS). ALBs and NAT Gateways do not need many IPs
- Private app subnets: /19 (8,187 usable IPs) if running EKS, /20 (4,091 usable IPs) otherwise
- Data subnets: /24 (251 usable IPs). Databases are a known quantity
- Total VPC: /16 (65,536 IPs) — gives room to add subnets later without re-architecting
Always over-provision. The cost of unused IPs is zero. The cost of running out of IPs in production is a re-architecture project that takes weeks.
Key Points
- CIDR replaced classful addressing (Class A/B/C) in 1993. If someone talks about IP classes in production context, they are 30 years behind.
- Three private ranges: 10.0.0.0/8 (16M addresses), 172.16.0.0/12 (1M addresses), 192.168.0.0/16 (65K addresses).
- IPv4 addresses (4.3 billion) are exhausted. NAT and CIDR are the duct tape keeping IPv4 alive. IPv6 has 340 undecillion addresses.
- In cloud VPCs, subnet sizing is the first architecture decision and the hardest to change later.
- Always plan subnets with room to grow. A /24 gives 254 hosts, which feels huge until Kubernetes with 30 pods per node eats through them.
Key Components
| Component | Role |
|---|---|
| IP Address | A 32-bit (IPv4) or 128-bit (IPv6) number that uniquely identifies a device on a network |
| Subnet Mask | Determines which portion of the IP address is the network part vs the host part |
| CIDR Notation | Compact way to express subnets — /24 means the first 24 bits are the network, leaving 8 bits (256 addresses) for hosts |
| Default Gateway | The router IP that handles traffic destined for outside the local subnet |
| Private Address Ranges | Non-routable IP blocks (10.x, 172.16-31.x, 192.168.x) for internal networks, requiring NAT to reach the internet |
When to Use
Plan IP addressing before creating any cloud infrastructure. Define CIDR blocks for each environment (dev, staging, prod), region, availability zone, and tier. Document it. Share it. Treat it like a first-class architecture decision.
Tool Comparison
| Tool | Type | Best For | Scale |
|---|---|---|---|
| AWS VPC | Managed | Cloud-native subnetting with integrated routing, security groups, and NACLs | Small-Enterprise |
| GCP VPC | Managed | Global VPCs with automatic subnet creation per region | Medium-Enterprise |
| Azure VNet | Managed | Enterprise hybrid cloud networking with ExpressRoute integration | Large-Enterprise |
| ipcalc | Open Source | CLI tool for quick subnet calculation, CIDR math, and range validation | Small-Enterprise |
Debug Checklist
- Verify the IP is in the correct subnet: use 'ipcalc 10.0.1.50/24' to check network boundaries.
- Check for overlapping CIDR ranges: run 'aws ec2 describe-vpcs' and compare CIDR blocks before peering.
- Verify route tables: run 'ip route show' or check AWS route tables for the subnet.
- Check available IPs in a subnet: AWS Console shows remaining IPs, or use 'aws ec2 describe-subnets --query "Subnets[].AvailableIpAddressCount"' (quote the query so the shell does not expand the brackets).
- Test reachability: 'ping -c 3 <target>' and 'traceroute -n <target>' to verify Layer 3 connectivity.
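Where `ipcalc` is not installed, the first check in the list can be done with Python's standard library. This is a minimal sketch; `in_subnet` is a hypothetical helper name.

```python
import ipaddress

# Derive the network an address belongs to from its address/prefix pair.
iface = ipaddress.ip_interface("10.0.1.50/24")
print(iface.network)  # 10.0.1.0/24

def in_subnet(address: str, cidr: str) -> bool:
    """True if `address` falls inside the subnet `cidr`."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)

print(in_subnet("10.0.1.50", "10.0.1.0/24"))  # True
print(in_subnet("10.0.2.50", "10.0.1.0/24"))  # False
```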
Common Mistakes
- Making VPC subnets too small. A /28 gives only 11 usable IPs on AWS (16 minus 5 reserved). Kubernetes clusters burn through IPs fast.
- Using overlapping CIDR ranges across VPCs. This makes VPC peering impossible without ugly NAT workarounds.
- Forgetting that AWS, GCP, and Azure each reserve 4-5 IPs per subnet for infrastructure (gateway, DNS, broadcast, etc.).
- Treating IPv6 as optional. Many mobile carriers already run IPv6-first networks and the major cloud providers support dual-stack VPCs, so services need to handle IPv6 traffic.
- Not documenting the IP address plan. Six months later nobody remembers which /16 was assigned to production vs staging.
Real World Usage
- AWS creates a default VPC in each region with a /16 CIDR (172.31.0.0/16) and a /20 default subnet per availability zone; custom VPCs typically use /20 to /24 subnets spread across availability zones.
- Google's internal network uses a flat /8 address space with custom routing, allocating IPs to containers rather than machines.
- Cloudflare's Anycast assigns the same IP to servers across 300+ cities, using BGP to route users to the nearest one.
- Netflix VPC architecture uses separate subnets for each tier (web, app, data) with NACLs enforcing traffic flow between them.