Designing a Multi-Tenant, Globally Scalable Tagging System
Goal: Build a multi-tenant tagging platform that prevents duplicate tags globally, serves product discovery in under 25ms, and tracks tag usage across millions of clicks with zero data loss. Scale: 10K tenants, 200K concurrent users across three regions.
Tagging systems get complex fast when they need to work across multiple tenants and regions. This post breaks down the design into three consistency tiers, because not everything needs the same guarantees: strong consistency for tag operations, smart caching for discovery, and eventual consistency for analytics.
🎯 Functional Requirements
| Requirement | Details |
|---|---|
| Multi-tenancy | Each tenant independently manages its tags and content |
| Global availability | Low latency access for users worldwide |
| Tag lifecycle | Create, update, delete tags |
| Content tagging | Associate and dissociate tags from content |
| Popular tags dashboard | Show most-used tags by lifetime view counts |
| Scale | Handle 200K concurrent users globally across 10K tenants |
⚡ Non-Functional Requirements
| Requirement | Target | Priority |
|---|---|---|
| Users distributed across multiple regions | US, EU, Asia | P0 |
| Tag creation latency for all users | ~100ms regardless of location | P0 |
| Prevent duplicate tag creation globally | Same tag name = same tag ID | P0 |
| All users see same tag state after creation | Strong consistency | P0 |
| Tag creation and content-tag mapping consistency | ACID across regions | P0 |
| View count metrics consistency | Eventual (~2 min delay acceptable) | P1 |
| Resilience to infrastructure failures | Auto-failover, zero data loss | P1 |
📊 Back-of-the-Envelope Calculations
User Base Assumptions
- Total tenants: 10,000
- Average users per tenant: 200
- Total user base: 2,000,000 users
- Concurrent users at peak: 200,000 (10% of total users)
- Geographic distribution: 40% Americas, 30% Europe, 30% Asia-Pacific
Tag Operations (Control Plane)
Tag Creation:
- Average tags created per tenant per day: 20
- Total daily tag creations: 10,000 tenants × 20 = 200,000
- Peak tag creation rate: 200,000 ÷ 8 business hours ÷ 3600 seconds × 5 peak factor = ~35 tags/second
- Average request size: 1 KB
- Tag create bandwidth: 35 KB/second
Item-Tag Mappings:
- Average mappings per tenant per day: 2,000
- Total daily mappings: 10,000 tenants × 2,000 = 20,000,000
- Peak mapping rate: 20,000,000 ÷ 24 hours ÷ 3600 seconds × 3 peak factor = ~700 mappings/second
- Average request size: 0.8 KB
- Item-tag mapping bandwidth: ~560 KB/second
View Events (Data Plane)
View Events Per User:
- Average views per active user per day: 100
- Total daily users: 10,000 tenants × 200 users = 2,000,000
- Total daily views: 2,000,000 users × 100 views = 200,000,000
- Peak view rate: 200,000,000 ÷ 24 hours ÷ 3600 seconds × 4 peak factor = ~9,300 views/second
- Average event size: 0.5 KB
- View event bandwidth: ~4,650 KB/second
API Request Breakdown
Read Operations (90% of traffic):
- Tag lookups: ~4,000/second (~400 KB/second)
- Item lookups: ~6,000/second (~1,200 KB/second)
- Popular tag queries: ~1,000/second (~300 KB/second)
- Tagged item lookups: ~2,000/second (~600 KB/second)
Write Operations (10% of traffic):
- Tag creations: ~35/second (~35 KB/second)
- Item-tag mappings: ~700/second (~560 KB/second)
- View events: ~9,300/second (~4,650 KB/second)
Total System Capacity
- Peak Inbound Bandwidth: ~7,700 KB/second (≈ 62 Mbps)
- Peak Outbound Bandwidth: ~23,200 KB/second (≈ 186 Mbps)
- Daily Data Transfer: ~2.5 TB
Storage Requirements Growth (Per Year)
Average Record Sizes:
- Tag record: ~350 bytes (including all metadata, timestamps, UUIDs)
- Item-Tag mapping: ~260 bytes (relationship with metadata)
- View event: ~145 bytes (analytics data)
Annual Storage Growth:
- Tags: 200,000 tags/day × 365 days × 350 bytes = ~25.6 GB/year
- Item-Tag Mappings: 20M mappings/day × 365 days × 260 bytes = ~1.9 TB/year
- View Events: 200M views/day × 365 days × 145 bytes = ~10.6 TB/year
- Total Annual Storage Growth: ~12.5 TB/year
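The arithmetic above is easy to sanity-check with a short script. The inputs are exactly the assumptions stated in the lists; nothing here is new data:

```python
# Back-of-the-envelope sanity check for the traffic and storage estimates above.
TENANTS = 10_000
TAGS_PER_TENANT_DAY = 20
MAPPINGS_PER_TENANT_DAY = 2_000
USERS = TENANTS * 200
VIEWS_PER_USER_DAY = 100

daily_tags = TENANTS * TAGS_PER_TENANT_DAY            # 200,000
daily_mappings = TENANTS * MAPPINGS_PER_TENANT_DAY    # 20,000,000
daily_views = USERS * VIEWS_PER_USER_DAY              # 200,000,000

# Peak rates: spread over the stated window, then apply the peak factor.
peak_tags_s = daily_tags / (8 * 3600) * 5             # ~35/second
peak_mappings_s = daily_mappings / (24 * 3600) * 3    # ~700/second
peak_views_s = daily_views / (24 * 3600) * 4          # ~9,300/second

# Annual storage growth at the average record sizes.
GB, TB = 1e9, 1e12
tags_year = daily_tags * 365 * 350 / GB               # ~25.6 GB/year
mappings_year = daily_mappings * 365 * 260 / TB       # ~1.9 TB/year
views_year = daily_views * 365 * 145 / TB             # ~10.6 TB/year

print(peak_tags_s, peak_mappings_s, peak_views_s)
print(tags_year, mappings_year, views_year)
```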
🗄️ Database Design
Store Selection Rationale
| Store | Role | Why This Choice |
|---|---|---|
| Spanner | Primary DB, source of truth | Global strong consistency via TrueTime, multi-master writes, no manual sharding. Only option that gives ACID across 3 regions without operational overhead. |
| Redis | Regional cache + click buffer | Sub-ms reads for 95% cache-hit rate. INCR for atomic click counting. Disposable: Spanner is the fallback. |
| Kafka | Regional event streaming | Decouples tag operations from ES indexing and analytics. Durable buffer if downstream is slow. Regional topics avoid cross-region latency. |
| Elasticsearch | Tag search + aggregation | Handles "find all items with tag X" across millions of records in ~18ms. Complex filters (date range, type). Regional clusters keep latency local. |
Alternative Database Comparison
| Database | Global Consistency | Multi-Region | Multi-Master Writes | Scalability | Operational Overhead | Decision |
|---|---|---|---|---|---|---|
| PostgreSQL | ❌ Limited | ❌ Complex setup | ❌ Single master | ⚠️ Vertical scaling | 🔴 High | Rejected |
| MongoDB | ❌ Eventual only | ⚠️ Manual setup | ⚠️ Limited (sharding) | ✅ Good | ⚠️ Medium | Rejected |
| DynamoDB | ⚠️ Eventually consistent | ✅ Native | ✅ Global tables | ✅ Excellent | ✅ Low | Considered |
| CockroachDB | ✅ Strong | ✅ Native | ✅ Multi-region writes | ✅ Good | ⚠️ Medium | Considered |
| Spanner | ✅ Strong + External | ✅ Native | ✅ Multi-region writes | ✅ Excellent | ✅ Minimal | Selected |
Strong consistency ruled out the eventually consistent options, and the multi-master requirement eliminated single-master databases. Worth noting: CockroachDB is a solid open-source alternative if GCP lock-in worries you.
Schema
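The section heading promises a schema without spelling one out, so here is one plausible Spanner-style sketch. Table and column names are illustrative assumptions, not from the source; the two load-bearing ideas are the deterministic UUID as primary key and `item_tags` interleaved in `items` so tag-content relationships co-locate with their parent row:

```sql
-- Illustrative Spanner DDL sketch; names and sizes are assumptions.
CREATE TABLE tags (
  tenant_id  STRING(36)  NOT NULL,
  tag_id     STRING(36)  NOT NULL,  -- deterministic UUID from hash(tenant_id + normalized_name)
  name       STRING(128) NOT NULL,
  created_at TIMESTAMP   NOT NULL OPTIONS (allow_commit_timestamp = true)
) PRIMARY KEY (tenant_id, tag_id);

-- The unique constraint that makes a duplicate create a harmless race loser.
CREATE UNIQUE INDEX tags_by_name ON tags (tenant_id, name);

CREATE TABLE items (
  tenant_id STRING(36)   NOT NULL,
  item_id   STRING(36)   NOT NULL,
  title     STRING(1024)
) PRIMARY KEY (tenant_id, item_id);

CREATE TABLE item_tags (
  tenant_id STRING(36) NOT NULL,
  item_id   STRING(36) NOT NULL,
  tag_id    STRING(36) NOT NULL,
  tagged_at TIMESTAMP  NOT NULL OPTIONS (allow_commit_timestamp = true)
) PRIMARY KEY (tenant_id, item_id, tag_id),
  INTERLEAVE IN PARENT items ON DELETE CASCADE;
```

Leading every primary key with `tenant_id` also gives you tenant isolation at the storage layer for free.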
📚 RESTful API Design
Product API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/products | List all products (paginated) | 200 |
| GET | /v1/products/{productId} | Get a specific product | 200, 404 |
| POST | /v1/products | Create a new product | 201, 400, 409 |
| PUT | /v1/products/{productId} | Update a product | 200, 400, 404 |
| DELETE | /v1/products/{productId} | Delete a product | 204, 404 |
Item Type API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/products/{productId}/item-types | List item types for a product | 200, 404 |
| GET | /v1/item-types/{itemTypeId} | Get a specific item type | 200, 404 |
| POST | /v1/products/{productId}/item-types | Create a new item type | 201, 400, 409 |
| PUT | /v1/item-types/{itemTypeId} | Update an item type | 200, 400, 404 |
| DELETE | /v1/item-types/{itemTypeId} | Delete an item type | 204, 404 |
Item API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/products/{productId}/items | List items for a product | 200, 404 |
| GET | /v1/items/{itemId} | Get a specific item | 200, 404 |
| POST | /v1/products/{productId}/items | Create a new item | 201, 400, 409 |
| PUT | /v1/items/{itemId} | Update an item | 200, 400, 404 |
| DELETE | /v1/items/{itemId} | Delete an item | 204, 404 |
Tag API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/tags | List all tags (paginated) | 200 |
| GET | /v1/tags/{tagId} | Get a specific tag | 200, 404 |
| POST | /v1/tags | Create a new tag | 201, 400, 409 |
| PUT | /v1/tags/{tagId} | Update a tag | 200, 400, 404 |
| DELETE | /v1/tags/{tagId} | Delete a tag | 204, 404 |
| GET | /v1/tags/popular | Get most popular tags (by views) | 200 |
Item Tagging API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/items/{itemId}/tags | List tags for an item | 200, 404 |
| POST | /v1/items/{itemId}/tags | Add one or more tags to an item | 201, 400, 404 |
| DELETE | /v1/items/{itemId}/tags/{tagId} | Remove a tag from an item | 204, 404 |
| GET | /v1/tags/{tagId}/items | List items with a specific tag | 200, 404 |
| POST | /v1/tags/{tagId}/view | Record a tag view event | 200, 400 |
🏗️ System Architecture

Three regions, each running its own stack:
- Three regional deployments (EU, US, Asia) for low latency
- Centralized Spanner database for strong consistency
- Regional Redis clusters for caching and buffering
- Regional Kafka clusters for local event streaming and processing
- Regional Elasticsearch clusters for fast search and aggregation queries
- Regional processors for batch operations and cache management
🚧 Core Challenges
Three hard problems show up the moment you try to build this globally:
1. Tag Creation and Tagging with Strong Consistency
- Challenge: No duplicate tag creation or tagging across global regions
- Requirement: Strong consistency with ~100ms latency globally
- Complexity: Race conditions when multiple users create the same tag simultaneously
2. Product Discovery by Tags
- Challenge: Clicking a tag should show all products attached to it
- Requirement: Sub-25ms response time with complete results
- Complexity: Data distributed across regions, cache consistency, search index synchronization
3. Tag Usage Analytics
- Challenge: Tag clicks increment usage counts for popular tag rankings
- Requirement: Handle millions of clicks with eventual consistency
- Complexity: Hot key contention, cross-region aggregation
🔧 Solution 1: Tag Creation with Strong Consistency
Problem
Multiple users in different regions creating the same tag simultaneously could result in duplicate tags with different IDs, breaking content discovery.
Solution: Deterministic UUIDs + Database Constraints + Regional Caching
Architecture Pattern: Optimistic concurrency with deterministic conflict resolution
Step-by-step:
- Generate deterministic UUID from hash(tenant_id + normalized_name) (~1ms)
- Check regional Redis cache (~2ms). On cache hit, return immediately.
- Insert into Spanner with unique constraint (~90ms). If constraint violation, fetch existing tag.
- Update regional cache (~5ms)
- Publish to regional Kafka for ES indexing and downstream consumers (~2ms)
Other regions pick up the new tag on their next cache miss. They query Spanner directly, and since Spanner replicates globally, the data is already there.
Benefits:
- ✅ Same tag name always generates the same UUID across regions
- ✅ Database unique constraint prevents duplicates atomically
- ✅ No global coordination needed, conflicts resolve naturally
- ✅ Consistent ~100ms latency regardless of user location
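The pattern is small enough to demonstrate end to end. The sketch below uses `uuid5` for the deterministic hash and sqlite3 as a stand-in for Spanner's unique constraint; the namespace choice and table layout are illustrative:

```python
import sqlite3
import uuid

# Any fixed namespace works; this particular constant is an illustrative choice.
TAG_NAMESPACE = uuid.NAMESPACE_DNS

def tag_uuid(tenant_id: str, name: str) -> str:
    """Deterministic tag ID: same (tenant, normalized name) -> same UUID, in any region."""
    return str(uuid.uuid5(TAG_NAMESPACE, f"{tenant_id}:{name.strip().lower()}"))

db = sqlite3.connect(":memory:")  # stand-in for Spanner
db.execute("CREATE TABLE tags (tag_id TEXT PRIMARY KEY, tenant_id TEXT, name TEXT)")

def create_tag(tenant_id: str, name: str) -> str:
    tid = tag_uuid(tenant_id, name)
    try:
        db.execute("INSERT INTO tags VALUES (?, ?, ?)", (tid, tenant_id, name))
    except sqlite3.IntegrityError:
        pass  # constraint violation: tag already exists with the same ID, just return it
    return tid

# Two "regions" racing to create the same tag converge on a single row and ID.
a = create_tag("tenant123", "Marketing")
b = create_tag("tenant123", "marketing ")  # different raw input, same normalized name
assert a == b
assert db.execute("SELECT COUNT(*) FROM tags").fetchone()[0] == 1
```

The losing writer pays one extra round-trip and nothing else, which is exactly why no cross-region coordination is needed.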
🔧 Solution 2: Product Discovery with Smart Caching
Problem
Users clicking on tags need to see all products tagged with that tag from all regions with sub-25ms response time.
Solution: Multi-Layer Caching + Search Index + Progressive Loading
Architecture Pattern: Tiered caching with search index optimization
Why Elasticsearch?
- Fast aggregation queries across millions of records
- Complex search patterns like "items with tag X created in last 30 days"
- Regional deployment keeps latency low
- Near-real-time indexing: new tag-item mappings indexed within 30 seconds
- Cross-region replication every 2-3 minutes
Elasticsearch Document Structure:
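The source leaves the document shape unstated; one plausible per-item document, with all field names as assumptions, might look like:

```json
{
  "item_id": "item-456",
  "tenant_id": "tenant123",
  "item_type": "product",
  "title": "Spring Campaign Landing Page",
  "tag_ids": ["abc-123", "def-789"],
  "created_at": "2024-03-01T12:00:00Z"
}
```

Modeling `tag_ids` as a keyword array is what lets a single term query match every item carrying a given tag.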
Flow:
London user clicks "marketing" tag:
1. Get tag info from Redis cache (~2ms):
GET "tag:tenant123:marketing" → {tag_id: "abc-123"}
2. Check cached product list (~3ms):
GET "tag_items:tenant123:abc-123" → Cache hit
3. Return products immediately (~15ms total)
Cache Miss Flow:
1. Get tag info from cache (~2ms)
2. Query Elasticsearch index (~18ms):
ES query: {"query": {"term": {"tag_ids": "abc-123"}}}
3. Cache results (~3ms)
4. Return to user (~23ms total)
Elasticsearch Unavailable Flow:
1. Get tag info from cache (~2ms)
2. Fallback to Spanner database (~40ms):
SELECT items.* FROM items
JOIN item_tags ON items.id = item_tags.item_id
WHERE item_tags.tag_id = 'abc-123'
3. Cache results (~3ms)
4. Return to user (~45ms total)
Search Index Strategy:
- Pre-computed aggregations for popular tag-item combinations
- Kafka events trigger ES index updates
- Regional deployment across US, EU, Asia
- Spanner as authoritative fallback when ES is unavailable
Performance Tiers:
- Popular tags (95% of queries): ~15ms (Redis cache hit)
- Active tags (4% of queries): ~23ms (Elasticsearch query)
- Rare tags (1% of queries): ~45ms (Spanner database fallback)
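The three tiers compose into one read path. A minimal sketch, where the client callables are in-memory stand-ins rather than real Redis/ES/Spanner APIs:

```python
from typing import Callable, Optional

def discover_items(tag_id: str,
                   cache_get: Callable[[str], Optional[list]],
                   cache_set: Callable[[str, list], None],
                   es_query: Callable[[str], Optional[list]],
                   spanner_query: Callable[[str], list]) -> list:
    """Tiered read path: Redis (~15ms) -> Elasticsearch (~23ms) -> Spanner (~45ms)."""
    key = f"tag_items:{tag_id}"
    cached = cache_get(key)
    if cached is not None:             # Tier 1: cache hit, ~95% of queries
        return cached
    try:
        items = es_query(tag_id)       # Tier 2: regional search index
    except Exception:
        items = None                   # ES unavailable / circuit breaker tripped
    if items is None:
        items = spanner_query(tag_id)  # Tier 3: authoritative fallback
    cache_set(key, items)              # populate the cache for the next request
    return items

# Usage with in-memory stand-ins, simulating an ES outage:
cache: dict = {}

def es_down(tag_id):
    raise RuntimeError("ES unavailable")

items = discover_items("abc-123", cache.get, cache.__setitem__,
                       es_query=es_down,
                       spanner_query=lambda t: ["item-1", "item-2"])
assert items == ["item-1", "item-2"]
assert cache["tag_items:abc-123"] == items  # the next request is a tier-1 hit
```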
🔧 Solution 3: Tag Usage Counting with Zero Data Loss
Problem
Tag clicks need to increment usage counters for popular tag rankings while handling millions of clicks without data loss or hot key contention.
Solution: Regional Buffers + Kafka Streaming + Atomic Batch Processing
Architecture Pattern: Regional buffering with eventual global consistency
Step-by-step:
- User click hits regional API, Redis INCR returns immediately (~2ms)
- Async event published to local Kafka
- Every 30 seconds, batch processor atomically reads and resets the counter (GETSET), writes a backup, flushes to Spanner, then cleans up the backup
- Every 2 minutes, global aggregator sums all regional events and updates lifetime stats
Zero Data Loss Mechanism:
- GETSET atomically reads and resets the counter
- Backup saved before database write
- Backup deleted only after successful Spanner commit
- On failure, backup value gets added back to the live counter
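The backup dance above can be sketched with a dict standing in for Redis and a list for Spanner rows (key names are illustrative; the dict `pop` plays the role of Redis's atomic GETSET read-and-reset):

```python
# Zero-data-loss flush sketch: dict stands in for Redis, list for Spanner rows.
redis: dict = {"clicks:tag789": 42}
spanner_rows: list = []

def flush(counter_key: str, fail_db: bool = False) -> None:
    backup_key = counter_key + ":backup"
    count = redis.pop(counter_key, 0)  # GETSET equivalent: read and reset atomically
    if count == 0:
        return
    redis[backup_key] = count          # backup saved before the database write
    try:
        if fail_db:
            raise RuntimeError("Spanner write failed")
        spanner_rows.append((counter_key, count))
        del redis[backup_key]          # delete backup only after a successful commit
    except RuntimeError:
        # Recovery: add the backup back to the live counter; no clicks lost.
        redis[counter_key] = redis.get(counter_key, 0) + redis.pop(backup_key)

flush("clicks:tag789", fail_db=True)   # failed flush restores the counter intact
assert redis["clicks:tag789"] == 42
flush("clicks:tag789")                 # the retry succeeds
assert spanner_rows == [("clicks:tag789", 42)]
```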
🔄 Data Flow & Consistency Model
Write Flow (Tag Creation → Content Tagging → Analytics)
1. Tag Creation:
User Request → Spanner (authoritative) → Redis Cache → Kafka Event
2. Content Tagging:
User Request → Spanner (item_tags) → Redis Cache → Elasticsearch Index
3. Tag Click Analytics:
User Request → Redis Buffer → Kafka → Spanner (batch) → Cache Update
Cache Miss Handling
Cache Miss Flow:
1. Check Redis cache → Miss
2. Query Spanner database (authoritative source)
3. Populate cache with TTL (30 minutes for relationships, 1 hour for tags)
4. Return data to user
5. Warm related cache keys in background
Popular Tags Cache Miss:
1. Redis miss → Query Spanner tag_lifetime_stats
2. Aggregate with live regional counters from all regions
3. Cache sorted results with 5-minute TTL
4. Background job pre-warms popular tag caches
Cross-Region Analytics Aggregation
Global View Count Collection (every 2 minutes):
1. US Regional Processor (designated global aggregator):
- Queries Spanner for unprocessed tag_increment_events
- Groups by (tenant_id, tag_id): {tenant123_tag789: us=156, eu=89, asia=203}
2. Atomic Update:
- UPDATE tag_lifetime_stats SET total_lifetime_views += 448
- UPDATE tag_increment_events SET processed_globally=true
3. No Cross-Region Cache Invalidation:
- Other regions discover updates through cache miss → Spanner query
- Each region manages its own cache independently
- Spanner serves as the coordination layer, not Kafka
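The aggregation step amounts to a group-by-sum over unprocessed events. A sketch, with the event shape assumed from the example figures above:

```python
from collections import defaultdict

# Unprocessed tag_increment_events, shaped like the example above (field names assumed).
events = [
    {"tenant_id": "tenant123", "tag_id": "tag789", "region": "us",   "views": 156},
    {"tenant_id": "tenant123", "tag_id": "tag789", "region": "eu",   "views": 89},
    {"tenant_id": "tenant123", "tag_id": "tag789", "region": "asia", "views": 203},
]

def aggregate(events: list) -> dict:
    """Group by (tenant_id, tag_id) and sum the regional counts."""
    totals: dict = defaultdict(int)
    for e in events:
        totals[(e["tenant_id"], e["tag_id"])] += e["views"]
    return dict(totals)

totals = aggregate(events)
assert totals == {("tenant123", "tag789"): 448}
# The aggregator then applies total_lifetime_views += 448 and marks the events processed.
```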
Cache Invalidation Strategy
Immediate Invalidation (Strong Consistency Operations):
1. Tag Creation in Tokyo:
- Spanner write succeeds → Update Tokyo Redis cache
- Publish to Tokyo Kafka for local ES indexing
2. User in London queries same tag:
- London Redis cache miss
- London queries Spanner directly (data already globally replicated)
- London populates its local cache
- No cross-region coordination needed
Background Invalidation (Analytics):
1. Batch processing completes → Global stats updated
2. Kafka broadcast: {"invalidate": ["popular:*"]}
3. Regional processors clear popular tag caches
4. Next request rebuilds from fresh Spanner data
Elasticsearch Synchronization
Regional Index Updates:
1. Content tagged in Tokyo → Tokyo Kafka event
2. Tokyo ES consumer processes event within 30 seconds
Cross-Region ES Discovery:
1. London user searches → London ES query
2. If data not yet replicated, fallback to Spanner
3. Background job syncs ES data from Spanner every 5 minutes
ES Recovery:
1. ES cluster failure → Circuit breaker activates
2. Queries fall back to Spanner globally
3. Background job rebuilds ES index from Spanner
4. Circuit breaker reopens when ES passes health check
Consistency Guarantees
- Strong consistency: Tag creation, content tagging (via Spanner)
- Eventual consistency: Search indexes, analytics, cross-region caches
- Cache coherence: Invalidation events ensure stale data cleanup
🔍 Identify Bottlenecks
1. Deterministic UUID Race Conditions
Within a single tenant, simultaneous creation of the same tag from two regions hits the UNIQUE constraint. The second insert gets a constraint violation, fetches the existing tag, and returns it. No coordination needed, but the losing request pays ~85ms for the extra round-trip.
2. Redis Cache Stampede
If a regional Redis cluster fails, all tag lookups fall through to Spanner simultaneously. Circuit breakers limit concurrent Spanner queries and serve from a local in-memory cache of recently accessed tags while Redis recovers.
3. Elasticsearch Index Lag
New tag-item mappings take up to 30 seconds to appear in the search index. Users who tag content and immediately search may not see their items. The fallback path queries Spanner directly when results look incomplete.
4. Hot Tag Counter Contention
A viral tag could receive thousands of clicks per second in one region. Redis INCR handles this fine, but the 30-second batch flush means global totals lag behind. That's the trade-off we accepted when we chose eventual consistency for analytics.
5. Cross-Region Cache Cold Start
When a tag created in Tokyo gets accessed in London for the first time, the London cache miss adds ~40ms for the Spanner round-trip. Popular tags warm quickly through organic traffic. Long-tail tags always pay this cost on first access per region.
6. Kafka Consumer Lag During Spikes
If Elasticsearch indexing slows down, Kafka consumer lag grows. That's fine. Kafka holds onto the events, consumers catch up once ES recovers. You lose search freshness for a bit, not data.
📈 Performance Characteristics
Latency Breakdown
| Operation | Cache Hit | Cache Miss | Notes |
|---|---|---|---|
| Tag Creation (new) | N/A | ~100ms | Strong consistency globally |
| Tag Creation (duplicate) | ~85ms | ~85ms | Returns existing tag |
| Content Tagging | ~40ms | ~75ms | Strong consistency |
| Product Discovery | ~15ms | ~45ms | 95% cache hit rate |
| Tag Click Response | ~2ms | ~2ms | Immediate user feedback |
| Popular Dashboard | ~5ms | ~150ms | Real-time rankings |
Scalability Numbers
Per Region Capacity:
- Tag creations: ~1,000/second
- Content tagging: ~10,000/second
- Tag clicks: ~100,000/second
- Product queries: ~50,000/second
- Dashboard requests: ~10,000/second
Global Capacity (3 regions):
- Total: 3x regional capacity with linear scaling
- Cross-region consistency: ~2-3 minutes maximum
🛡️ Resilience & Failure Handling
Failure Taxonomy
| Component | Failure Mode | Detection | Recovery | Data Loss Risk |
|---|---|---|---|---|
| Redis (regional) | Node crash, network split | Health check (5s) | Circuit breaker, fall back to Spanner | None (Spanner is source of truth) |
| Kafka (regional) | Broker failure | Consumer lag spike | Partition rebalance, in-memory buffer | None (RF=3, acks=all) |
| Spanner (regional) | AZ outage | Automatic detection | Failover to healthy regions | None (synchronous replication) |
| Elasticsearch | Cluster degradation | Latency threshold | Circuit breaker, fall back to Spanner | None (ES is derived state) |
Retry Strategy
- Tag creation: No retry needed. Deterministic UUIDs mean a repeated attempt produces the same result.
- Click counting: Redis backup key preserves count if the batch flush to Spanner fails. Recovery adds the backup to the live counter.
- ES indexing: Kafka retains events. Failed indexing retries from the consumer offset.
Redis Failure
1. Circuit breaker activates
2. Fall back to direct Spanner queries (+~60ms latency)
3. Queue cache warming tasks
4. Resume normal operation when Redis recovers
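The Redis fallback hinges on a circuit breaker. A minimal count-based sketch, where the failure threshold and cooldown are illustrative numbers, not from the source:

```python
import time
from typing import Callable, Optional

class CircuitBreaker:
    """Open after `threshold` consecutive failures; probe again after `cooldown` seconds."""
    def __init__(self, threshold: int = 3, cooldown: float = 5.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, primary: Callable, fallback: Callable):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                return fallback()      # breaker open: go straight to Spanner
            self.opened_at = None      # cooldown elapsed: probe the primary again
        try:
            result = primary()
            self.failures = 0          # any success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            return fallback()

def redis_down():
    raise ConnectionError("Redis unreachable")

cb = CircuitBreaker(threshold=2, cooldown=60)
assert cb.call(redis_down, lambda: "from-spanner") == "from-spanner"
assert cb.call(redis_down, lambda: "from-spanner") == "from-spanner"
assert cb.opened_at is not None        # breaker is now open; Spanner absorbs reads
```

The same wrapper covers the Elasticsearch fallback described earlier; only the primary and fallback callables change.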
Kafka Failure
1. Regional processing continues with local buffering
2. Queue failed event publishes in memory
3. Replay queued events when Kafka recovers
4. No data loss due to local backup mechanisms
Spanner Regional Outage
1. Automatic failover to healthy regions
2. Writes continue from operational regions
3. Read traffic redirected to available replicas
4. Global consistency maintained
🚀 Deployment Strategy
Multi-Region Kubernetes:
- Three regional clusters (US, EU, Asia) behind global DNS with latency-based routing
- Each region runs: API pods, batch processors, Redis cluster, Kafka cluster, ES cluster
- Spanner spans all regions as a single global instance
Release Process:
- Rolling deployments with readiness probes. No traffic until the pod is healthy.
- Canary releases: 5% traffic for 15 minutes, automated rollback on error rate spike
- Blue-green for database migrations with backward-compatible schema changes
Technology Stack:
| Component | Technology |
|---|---|
| Orchestration | Kubernetes (GKE) |
| API Gateway | Envoy + gRPC |
| Monitoring | Prometheus + Grafana |
| CI/CD | ArgoCD + GitHub Actions |
| Load Balancer | Cloud L4 LB + Envoy sidecar (L7) |
📡 Observability
Key Metrics:
Tags: tag.created.rate, tag.duplicate.rate, creation.latency.p99
Discovery: discovery.cache_hit.rate, es.query.latency.p50/p99
Analytics: click.ingested.rate, batch.flush.rate, aggregation.lag.seconds
Cache: redis.hit_rate, redis.eviction.rate (per region)
Kafka: consumer.lag (per topic), produce.error.rate
Critical Alerts: page on sustained Kafka consumer lag, ES query latency breaching the circuit-breaker threshold, Redis hit-rate collapse in any region, and error-rate spikes during a rollout.
Tracing: Every tag operation gets a trace ID that follows it through the whole pipeline. Sample 1% of normal traffic, 100% of errors and anything slow. Jaeger or Grafana Tempo on the backend.
🔐 Security
Authentication: Tenant API keys plus JWT tokens with a 15-minute TTL. Every endpoint requires a valid tenant context. Cross-tenant access can't happen because tenant_id is baked into every query and constraint at the database level.
Encryption: TLS 1.3 on all external traffic. mTLS between services. Spanner handles encryption at rest out of the box. Redis runs with TLS for inter-node traffic.
Tenant Isolation: Every row, cache key, and Kafka message carries a tenant_id. The API layer enforces tenant_id filtering on all Spanner queries. Per-tenant rate limits keep noisy neighbors from ruining it for everyone else.
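Per-tenant rate limiting can be as simple as one token bucket per tenant_id. A sketch, with the rate and burst numbers chosen for illustration:

```python
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per tenant keeps a noisy tenant from starving the others.
buckets = defaultdict(lambda: TokenBucket(rate=10, burst=5))

def check(tenant_id: str) -> bool:
    return buckets[tenant_id].allow()

assert all(check("tenant-noisy") for _ in range(5))  # burst is allowed through
assert not check("tenant-noisy")                     # sixth immediate call is rejected
assert check("tenant-quiet")                         # other tenants are unaffected
```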
🎯 Key Design Decisions & Trade-offs
| Aspect | Decision | Alternative | Rationale |
|---|---|---|---|
| Tag Consistency | Strong consistency via Spanner | Eventual consistency | Business requirement for duplicate prevention |
| View Counting | Eventual consistency via buffering | Strong consistency | Performance over accuracy for analytics |
| Caching Strategy | Multi-tier with invalidation | Write-through only | Optimizes for read-heavy workload |
| Conflict Resolution | Deterministic UUIDs | Global locking | Eliminates coordination overhead |
| Regional Strategy | Regional processing + global DB | Full replication | Balances performance with consistency |
🏁 Conclusion
Here's what the three solutions actually buy us:
Tag operations land at ~100ms globally with zero duplicates. Deterministic UUIDs plus database constraints do the heavy lifting. No coordination protocol needed.
Product discovery hits ~18ms for 95% of queries through multi-tier caching. When caches miss, Elasticsearch picks it up. When ES is down, Spanner is always there.
Analytics responds in ~2ms to the user while batching counts in the background. The atomic backup mechanism means you don't lose clicks even when things go wrong.
The big takeaway: don't use one consistency model for everything. Tag creation needs strong consistency. Discovery is a caching problem. Analytics can be eventual. Match the guarantee to the actual requirement, and you avoid paying for consistency you don't need.
Explore the Technologies
Want to go deeper on any of these?
Core Technologies
| Technology | Role in This System | Learn More |
|---|---|---|
| Google Cloud Spanner | Primary database: global strong consistency, multi-region writes, interleaved tables for tag-content relationships | Spanner Deep Dive |
| Redis | Multi-tier caching: regional click buffering, tag lookup cache, sub-ms reads for 95% of queries | Redis Deep Dive |
| Kafka | Event streaming: tag operation events, async click analytics pipeline, cross-service decoupling | Kafka Deep Dive |
| Elasticsearch | Full-text tag search: popular tags aggregation, typeahead suggestions, discovery queries | Elasticsearch Deep Dive |
Infrastructure Patterns
| Pattern | Role in This System | Learn More |
|---|---|---|
| Caching Strategies | Write-through cache with invalidation, multi-tier regional caches, cache warming | Caching Strategies |
| Database Sharding | Spanner's split-based auto-sharding for global data distribution | Database Sharding |
| Replication & Consistency | Strong consistency for tag writes, eventual consistency for analytics counters | Replication & Consistency |
| Message Queues & Event Streaming | Kafka topic design for tag events and click analytics pipelines | Event Streaming |
Variations of this architecture run in production at multi-tenant SaaS platforms, content management systems, and social platforms that tag content at global scale.
#systemdesign #distributed-systems #saas #multitenancy #microservices #redis #architecture #kafka #scalability