Designing a Multi-Tenant, Globally Scalable Tagging System
Goal: Build a multi-tenant tagging platform that prevents duplicate tags globally, serves product discovery in under 25ms, and tracks tag usage across millions of clicks with zero data loss. Scale: 10K tenants, 200K concurrent users across three regions.
Tagging systems get complex fast when they need to work across multiple tenants and regions. This post breaks down the design into three consistency tiers, because not everything needs the same guarantees: strong consistency for tag operations, smart caching for discovery, and eventual consistency for analytics.
🎯 Functional Requirements
| Requirement | Details |
|---|---|
| Multi-tenancy | Each tenant independently manages its tags and content |
| Global availability | Low latency access for users worldwide |
| Tag lifecycle | Create, update, delete tags |
| Content tagging | Associate and dissociate tags from content |
| Popular tags dashboard | Show most-used tags by lifetime view counts |
| Scale | Handle 200K concurrent users globally across 10K tenants |
⚡ Non-Functional Requirements
| Requirement | Target | Priority |
|---|---|---|
| Users distributed across multiple regions | US, EU, Asia | P0 |
| Tag creation latency for all users | ~100ms regardless of location | P0 |
| Prevent duplicate tag creation globally | Same tag name = same tag ID | P0 |
| All users see same tag state after creation | Strong consistency | P0 |
| Tag creation and content-tag mapping consistency | ACID across regions | P0 |
| View count metrics consistency | Eventual (~2 min delay acceptable) | P1 |
| Resilience to infrastructure failures | Auto-failover, zero data loss | P1 |
📊 Back-of-the-Envelope Calculations
User Base Assumptions
- Total tenants: 10,000
- Average users per tenant: 200
- Total user base: 2,000,000 users
- Concurrent users at peak: 200,000 (10% of total users)
- Geographic distribution: 40% Americas, 30% Europe, 30% Asia-Pacific
Tag Operations (Control Plane)
Tag Creation:
- Average tags created per tenant per day: 20
- Total daily tag creations: 10,000 tenants × 20 = 200,000
- Peak tag creation rate: 200,000 ÷ 8 business hours ÷ 3600 seconds × 5 peak factor = ~35 tags/second
- Average request size: 1 KB
- Tag create bandwidth: 35 KB/second
Item-Tag Mappings:
- Average mappings per tenant per day: 2,000
- Total daily mappings: 10,000 tenants × 2,000 = 20,000,000
- Peak mapping rate: 20,000,000 ÷ 24 hours ÷ 3600 seconds × 3 peak factor = ~700 mappings/second
- Average request size: 0.8 KB
- Item-tag mapping bandwidth: ~560 KB/second
View Events (Data Plane)
View Events Per User:
- Average views per active user per day: 100
- Total daily users: 10,000 tenants × 200 users = 2,000,000
- Total daily views: 2,000,000 users × 100 views = 200,000,000
- Peak view rate: 200,000,000 ÷ 24 hours ÷ 3600 seconds × 4 peak factor = ~9,300 views/second
- Average event size: 0.5 KB
- View event bandwidth: ~4,650 KB/second
API Request Breakdown
Read Operations (90% of traffic):
- Tag lookups: ~4,000/second (~400 KB/second)
- Item lookups: ~6,000/second (~1,200 KB/second)
- Popular tag queries: ~1,000/second (~300 KB/second)
- Tagged item lookups: ~2,000/second (~600 KB/second)
Write Operations (10% of traffic):
- Tag creations: ~35/second (~35 KB/second)
- Item-tag mappings: ~700/second (~560 KB/second)
- View events: ~9,300/second (~4,650 KB/second)
Total System Capacity
- Peak Inbound Bandwidth: ~7,700 KB/second (≈ 62 Mbps)
- Peak Outbound Bandwidth: ~23,200 KB/second (≈ 186 Mbps)
- Daily Data Transfer: ~2.5 TB
Storage Requirements Growth (Per Year)
Average Record Sizes:
- Tag record: ~350 bytes (including all metadata, timestamps, UUIDs)
- Item-Tag mapping: ~260 bytes (relationship with metadata)
- View event: ~145 bytes (analytics data)
Annual Storage Growth:
- Tags: 200,000 tags/day × 365 days × 350 bytes = ~25.6 GB/year
- Item-Tag Mappings: 20M mappings/day × 365 days × 260 bytes = ~1.9 TB/year
- View Events: 200M views/day × 365 days × 145 bytes = ~10.6 TB/year
- Total Annual Storage Growth: ~12.5 TB/year
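The arithmetic above is easy to sanity-check with a short script. The inputs are exactly the assumptions stated in the lists; nothing here is new data:

```python
# Back-of-the-envelope sanity check for the traffic and storage estimates above.
TENANTS = 10_000
TAGS_PER_TENANT_DAY = 20
MAPPINGS_PER_TENANT_DAY = 2_000
USERS = TENANTS * 200
VIEWS_PER_USER_DAY = 100

daily_tags = TENANTS * TAGS_PER_TENANT_DAY            # 200,000
daily_mappings = TENANTS * MAPPINGS_PER_TENANT_DAY    # 20,000,000
daily_views = USERS * VIEWS_PER_USER_DAY              # 200,000,000

# Peak rates: spread over the stated window, then apply the peak factor.
peak_tags_s = daily_tags / (8 * 3600) * 5             # ~35/second
peak_mappings_s = daily_mappings / (24 * 3600) * 3    # ~700/second
peak_views_s = daily_views / (24 * 3600) * 4          # ~9,300/second

# Annual storage growth at the average record sizes.
GB, TB = 1e9, 1e12
tags_year = daily_tags * 365 * 350 / GB               # ~25.6 GB/year
mappings_year = daily_mappings * 365 * 260 / TB       # ~1.9 TB/year
views_year = daily_views * 365 * 145 / TB             # ~10.6 TB/year

print(peak_tags_s, peak_mappings_s, peak_views_s)
print(tags_year, mappings_year, views_year)
```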
🗄️ Database Design
Store Selection Rationale
| Store | Role | Why This Choice |
|---|---|---|
| Spanner | Primary DB, source of truth | Global strong consistency via TrueTime, multi-master writes, no manual sharding. Only option that gives ACID across 3 regions without operational overhead. |
| Redis | Regional cache + click buffer | Sub-ms reads for 95% cache-hit rate. INCR for atomic click counting. Disposable: Spanner is the fallback. |
| Kafka | Regional event streaming | Decouples tag operations from ES indexing and analytics. Durable buffer if downstream is slow. Regional topics avoid cross-region latency. |
| Elasticsearch | Tag search + aggregation | Handles "find all items with tag X" across millions of records in ~18ms. Complex filters (date range, type). Regional clusters keep latency local. |
Alternative Database Comparison
| Database | Global Consistency | Multi-Region | Multi-Master Writes | Scalability | Operational Overhead | Decision |
|---|---|---|---|---|---|---|
| PostgreSQL | ❌ Limited | ❌ Complex setup | ❌ Single master | ⚠️ Vertical scaling | 🔴 High | Rejected |
| MongoDB | ❌ Eventual only | ⚠️ Manual setup | ⚠️ Limited (sharding) | ✅ Good | ⚠️ Medium | Rejected |
| DynamoDB | ⚠️ Eventually consistent | ✅ Native | ✅ Global tables | ✅ Excellent | ✅ Low | Considered |
| CockroachDB | ✅ Strong | ✅ Native | ✅ Multi-region writes | ✅ Good | ⚠️ Medium | Considered |
| Spanner | ✅ Strong + External | ✅ Native | ✅ Multi-region writes | ✅ Excellent | ✅ Minimal | Selected |
Strong consistency ruled out the eventually consistent options, and the multi-master requirement eliminated single-master databases. Worth noting: CockroachDB is a solid open-source alternative if GCP lock-in worries you.
Schema
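The section heading promises a schema without spelling one out, so here is one plausible Spanner-style sketch. Table and column names are illustrative assumptions, not from the source; the two load-bearing ideas are the deterministic UUID as primary key and `item_tags` interleaved in `items` so tag-content relationships co-locate with their parent row:

```sql
-- Illustrative Spanner DDL sketch; names and sizes are assumptions.
CREATE TABLE tags (
  tenant_id  STRING(36)  NOT NULL,
  tag_id     STRING(36)  NOT NULL,  -- deterministic UUID from hash(tenant_id + normalized_name)
  name       STRING(128) NOT NULL,
  created_at TIMESTAMP   NOT NULL OPTIONS (allow_commit_timestamp = true)
) PRIMARY KEY (tenant_id, tag_id);

-- The unique constraint that makes a duplicate create a harmless race loser.
CREATE UNIQUE INDEX tags_by_name ON tags (tenant_id, name);

CREATE TABLE items (
  tenant_id STRING(36)   NOT NULL,
  item_id   STRING(36)   NOT NULL,
  title     STRING(1024)
) PRIMARY KEY (tenant_id, item_id);

CREATE TABLE item_tags (
  tenant_id STRING(36) NOT NULL,
  item_id   STRING(36) NOT NULL,
  tag_id    STRING(36) NOT NULL,
  tagged_at TIMESTAMP  NOT NULL OPTIONS (allow_commit_timestamp = true)
) PRIMARY KEY (tenant_id, item_id, tag_id),
  INTERLEAVE IN PARENT items ON DELETE CASCADE;
```

Leading every primary key with `tenant_id` also gives you tenant isolation at the storage layer for free.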
📚 RESTful API Design
Product API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/products | List all products (paginated) | 200 |
| GET | /v1/products/{productId} | Get a specific product | 200, 404 |
| POST | /v1/products | Create a new product | 201, 400, 409 |
| PUT | /v1/products/{productId} | Update a product | 200, 400, 404 |
| DELETE | /v1/products/{productId} | Delete a product | 204, 404 |
Item Type API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/products/{productId}/item-types | List item types for a product | 200, 404 |
| GET | /v1/item-types/{itemTypeId} | Get a specific item type | 200, 404 |
| POST | /v1/products/{productId}/item-types | Create a new item type | 201, 400, 409 |
| PUT | /v1/item-types/{itemTypeId} | Update an item type | 200, 400, 404 |
| DELETE | /v1/item-types/{itemTypeId} | Delete an item type | 204, 404 |
Item API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/products/{productId}/items | List items for a product | 200, 404 |
| GET | /v1/items/{itemId} | Get a specific item | 200, 404 |
| POST | /v1/products/{productId}/items | Create a new item | 201, 400, 409 |
| PUT | /v1/items/{itemId} | Update an item | 200, 400, 404 |
| DELETE | /v1/items/{itemId} | Delete an item | 204, 404 |
Tag API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/tags | List all tags (paginated) | 200 |
| GET | /v1/tags/{tagId} | Get a specific tag | 200, 404 |
| POST | /v1/tags | Create a new tag | 201, 400, 409 |
| PUT | /v1/tags/{tagId} | Update a tag | 200, 400, 404 |
| DELETE | /v1/tags/{tagId} | Delete a tag | 204, 404 |
| GET | /v1/tags/popular | Get most popular tags (by views) | 200 |
Item Tagging API
| HTTP Method | Endpoint | Description | Status Codes |
|---|---|---|---|
| GET | /v1/items/{itemId}/tags | List tags for an item | 200, 404 |
| POST | /v1/items/{itemId}/tags | Add one or more tags to an item | 201, 400, 404 |
| DELETE | /v1/items/{itemId}/tags/{tagId} | Remove a tag from an item | 204, 404 |
| GET | /v1/tags/{tagId}/items | List items with a specific tag | 200, 404 |
| POST | /v1/tags/{tagId}/view | Record a tag view event | 200, 400 |
🏗️ System Architecture

Three regions, each running its own stack:
- Three regional deployments (EU, US, Asia) for low latency
- Centralized Spanner database for strong consistency
- Regional Redis clusters for caching and buffering
- Regional Kafka clusters for local event streaming and processing
- Regional Elasticsearch clusters for fast search and aggregation queries
- Regional processors for batch operations and cache management
🚧 Core Challenges
Three hard problems show up the moment you try to build this globally:
1. Tag Creation and Tagging with Strong Consistency
- Challenge: No duplicate tag creation or tagging across global regions
- Requirement: Strong consistency with ~100ms latency globally
- Complexity: Race conditions when multiple users create the same tag simultaneously
2. Product Discovery by Tags
- Challenge: Clicking a tag should show all products attached to it
- Requirement: Sub-25ms response time with complete results
- Complexity: Data distributed across regions, cache consistency, search index synchronization
3. Tag Usage Analytics
- Challenge: Tag clicks increment usage counts for popular tag rankings
- Requirement: Handle millions of clicks with eventual consistency
- Complexity: Hot key contention, cross-region aggregation
🔧 Solution 1: Tag Creation with Strong Consistency
Problem
Multiple users in different regions creating the same tag simultaneously could result in duplicate tags with different IDs, breaking content discovery.
Solution: Deterministic UUIDs + Database Constraints + Regional Caching
Architecture Pattern: Optimistic concurrency with deterministic conflict resolution
Step-by-step:
- Generate deterministic UUID from hash(tenant_id + normalized_name) (~1ms)
- Check regional Redis cache (~2ms). On cache hit, return immediately.
- Insert into Spanner with unique constraint (~90ms). If constraint violation, fetch existing tag.
- Update regional cache (~5ms)
- Publish to regional Kafka for ES indexing and downstream consumers (~2ms)
Other regions pick up the new tag on their next cache miss. They query Spanner directly, and since Spanner replicates globally, the data is already there.
Benefits:
- ✅ Same tag name always generates the same UUID across regions
- ✅ Database unique constraint prevents duplicates atomically
- ✅ No global coordination needed, conflicts resolve naturally
- ✅ Consistent ~100ms latency regardless of user location
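The pattern is small enough to demonstrate end to end. The sketch below uses `uuid5` for the deterministic hash and sqlite3 as a stand-in for Spanner's unique constraint; the namespace choice and table layout are illustrative:

```python
import sqlite3
import uuid

# Any fixed namespace works; this particular constant is an illustrative choice.
TAG_NAMESPACE = uuid.NAMESPACE_DNS

def tag_uuid(tenant_id: str, name: str) -> str:
    """Deterministic tag ID: same (tenant, normalized name) -> same UUID, in any region."""
    return str(uuid.uuid5(TAG_NAMESPACE, f"{tenant_id}:{name.strip().lower()}"))

db = sqlite3.connect(":memory:")  # stand-in for Spanner
db.execute("CREATE TABLE tags (tag_id TEXT PRIMARY KEY, tenant_id TEXT, name TEXT)")

def create_tag(tenant_id: str, name: str) -> str:
    tid = tag_uuid(tenant_id, name)
    try:
        db.execute("INSERT INTO tags VALUES (?, ?, ?)", (tid, tenant_id, name))
    except sqlite3.IntegrityError:
        pass  # constraint violation: tag already exists with the same ID, just return it
    return tid

# Two "regions" racing to create the same tag converge on a single row and ID.
a = create_tag("tenant123", "Marketing")
b = create_tag("tenant123", "marketing ")  # different raw input, same normalized name
assert a == b
assert db.execute("SELECT COUNT(*) FROM tags").fetchone()[0] == 1
```

The losing writer pays one extra round-trip and nothing else, which is exactly why no cross-region coordination is needed.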
🔧 Solution 2: Product Discovery with Smart Caching
Problem
Users clicking on tags need to see all products tagged with that tag from all regions with sub-25ms response time.
Solution: Multi-Layer Caching + Search Index + Progressive Loading
Architecture Pattern: Tiered caching with search index optimization
Why Elasticsearch?
- Fast aggregation queries across millions of records
- Complex search patterns like "items with tag X created in last 30 days"
- Regional deployment keeps latency low
- Near-real-time indexing: new tag-item mappings indexed within 30 seconds
- Cross-region replication every 2-3 minutes
Elasticsearch Document Structure:
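The source leaves the document shape unstated; one plausible per-item document, with all field names as assumptions, might look like:

```json
{
  "item_id": "item-456",
  "tenant_id": "tenant123",
  "item_type": "product",
  "title": "Spring Campaign Landing Page",
  "tag_ids": ["abc-123", "def-789"],
  "created_at": "2024-03-01T12:00:00Z"
}
```

Modeling `tag_ids` as a keyword array is what lets a single term query match every item carrying a given tag.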
Flow:
London user clicks "marketing" tag:
1. Get tag info from Redis cache (~2ms):
GET "tag:tenant123:marketing" → {tag_id: "abc-123"}
2. Check cached product list (~3ms):
GET "tag_items:tenant123:abc-123" → Cache hit
3. Return products immediately (~15ms total)
Cache Miss Flow:
1. Get tag info from cache (~2ms)
2. Query Elasticsearch index (~18ms):
ES query: {"query": {"term": {"tag_ids": "abc-123"}}}
3. Cache results (~3ms)
4. Return to user (~23ms total)
Elasticsearch Unavailable Flow:
1. Get tag info from cache (~2ms)
2. Fallback to Spanner database (~40ms):
SELECT items.* FROM items
JOIN item_tags ON items.id = item_tags.item_id
WHERE item_tags.tag_id = 'abc-123'
3. Cache results (~3ms)
4. Return to user (~45ms total)
Search Index Strategy:
- Pre-computed aggregations for popular tag-item combinations
- Kafka events trigger ES index updates
- Regional deployment across US, EU, Asia
- Spanner as authoritative fallback when ES is unavailable
Performance Tiers:
- Popular tags (95% of queries): ~15ms (Redis cache hit)
- Active tags (4% of queries): ~23ms (Elasticsearch query)
- Rare tags (1% of queries): ~45ms (Spanner database fallback)
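The three tiers compose into one read path. A minimal sketch, where the client callables are in-memory stand-ins rather than real Redis/ES/Spanner APIs:

```python
from typing import Callable, Optional

def discover_items(tag_id: str,
                   cache_get: Callable[[str], Optional[list]],
                   cache_set: Callable[[str, list], None],
                   es_query: Callable[[str], Optional[list]],
                   spanner_query: Callable[[str], list]) -> list:
    """Tiered read path: Redis (~15ms) -> Elasticsearch (~23ms) -> Spanner (~45ms)."""
    key = f"tag_items:{tag_id}"
    cached = cache_get(key)
    if cached is not None:             # Tier 1: cache hit, ~95% of queries
        return cached
    try:
        items = es_query(tag_id)       # Tier 2: regional search index
    except Exception:
        items = None                   # ES unavailable / circuit breaker tripped
    if items is None:
        items = spanner_query(tag_id)  # Tier 3: authoritative fallback
    cache_set(key, items)              # populate the cache for the next request
    return items

# Usage with in-memory stand-ins, simulating an ES outage:
cache: dict = {}

def es_down(tag_id):
    raise RuntimeError("ES unavailable")

items = discover_items("abc-123", cache.get, cache.__setitem__,
                       es_query=es_down,
                       spanner_query=lambda t: ["item-1", "item-2"])
assert items == ["item-1", "item-2"]
assert cache["tag_items:abc-123"] == items  # the next request is a tier-1 hit
```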
🔧 Solution 3: Tag Usage Counting with Zero Data Loss
Problem
Tag clicks need to increment usage counters for popular tag rankings while handling millions of clicks without data loss or hot key contention.
Solution: Regional Buffers + Kafka Streaming + Atomic Batch Processing
Architecture Pattern: Regional buffering with eventual global consistency
Step-by-step:
- User click hits regional API, Redis INCR returns immediately (~2ms)
- Async event published to local Kafka
- Every 30 seconds, batch processor atomically reads and resets the counter (GETSET), writes a backup, flushes to Spanner, then cleans up the backup
- Every 2 minutes, global aggregator sums all regional events and updates lifetime stats
Zero Data Loss Mechanism:
- GETSET atomically reads and resets the counter
- Backup saved before database write
- Backup deleted only after successful Spanner commit
- On failure, backup value gets added back to the live counter
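The backup dance above can be sketched with a dict standing in for Redis and a list for Spanner rows (key names are illustrative; the dict `pop` plays the role of Redis's atomic GETSET read-and-reset):

```python
# Zero-data-loss flush sketch: dict stands in for Redis, list for Spanner rows.
redis: dict = {"clicks:tag789": 42}
spanner_rows: list = []

def flush(counter_key: str, fail_db: bool = False) -> None:
    backup_key = counter_key + ":backup"
    count = redis.pop(counter_key, 0)  # GETSET equivalent: read and reset atomically
    if count == 0:
        return
    redis[backup_key] = count          # backup saved before the database write
    try:
        if fail_db:
            raise RuntimeError("Spanner write failed")
        spanner_rows.append((counter_key, count))
        del redis[backup_key]          # delete backup only after a successful commit
    except RuntimeError:
        # Recovery: add the backup back to the live counter; no clicks lost.
        redis[counter_key] = redis.get(counter_key, 0) + redis.pop(backup_key)

flush("clicks:tag789", fail_db=True)   # failed flush restores the counter intact
assert redis["clicks:tag789"] == 42
flush("clicks:tag789")                 # the retry succeeds
assert spanner_rows == [("clicks:tag789", 42)]
```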
🔄 Data Flow & Consistency Model
Write Flow (Tag Creation → Content Tagging → Analytics)
1. Tag Creation:
User Request → Spanner (authoritative) → Redis Cache → Kafka Event
2. Content Tagging:
User Request → Spanner (item_tags) → Redis Cache → Elasticsearch Index
3. Tag Click Analytics:
User Request → Redis Buffer → Kafka → Spanner (batch) → Cache Update
Cache Miss Handling
Cache Miss Flow:
1. Check Redis cache → Miss
2. Query Spanner database (authoritative source)
3. Populate cache with TTL (30 minutes for relationships, 1 hour for tags)
4. Return data to user
5. Warm related cache keys in background
Popular Tags Cache Miss:
1. Redis miss → Query Spanner tag_lifetime_stats
2. Aggregate with live regional counters from all regions
3. Cache sorted results with 5-minute TTL
4. Background job pre-warms popular tag caches
Cross-Region Analytics Aggregation
Global View Count Collection (every 2 minutes):
1. US Regional Processor (designated global aggregator):
- Queries Spanner for unprocessed tag_increment_events
- Groups by (tenant_id, tag_id): {tenant123_tag789: us=156, eu=89, asia=203}
2. Atomic Update:
- UPDATE tag_lifetime_stats SET total_lifetime_views += 448
- UPDATE tag_increment_events SET processed_globally=true
3. No Cross-Region Cache Invalidation:
- Other regions discover updates through cache miss → Spanner query
- Each region manages its own cache independently
- Spanner serves as the coordination layer, not Kafka
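The aggregation step amounts to a group-by-sum over unprocessed events. A sketch, with the event shape assumed from the example figures above:

```python
from collections import defaultdict

# Unprocessed tag_increment_events, shaped like the example above (field names assumed).
events = [
    {"tenant_id": "tenant123", "tag_id": "tag789", "region": "us",   "views": 156},
    {"tenant_id": "tenant123", "tag_id": "tag789", "region": "eu",   "views": 89},
    {"tenant_id": "tenant123", "tag_id": "tag789", "region": "asia", "views": 203},
]

def aggregate(events: list) -> dict:
    """Group by (tenant_id, tag_id) and sum the regional counts."""
    totals: dict = defaultdict(int)
    for e in events:
        totals[(e["tenant_id"], e["tag_id"])] += e["views"]
    return dict(totals)

totals = aggregate(events)
assert totals == {("tenant123", "tag789"): 448}
# The aggregator then applies total_lifetime_views += 448 and marks the events processed.
```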
Cache Invalidation Strategy
Immediate Invalidation (Strong Consistency Operations):
1. Tag Creation in Tokyo:
- Spanner write succeeds → Update Tokyo Redis cache
- Publish to Tokyo Kafka for local ES indexing
2. User in London queries same tag:
- London Redis cache miss
- London queries Spanner directly (data already globally replicated)
- London populates its local cache
- No cross-region coordination needed
Background Invalidation (Analytics):
1. Batch processing completes → Global stats updated
2. Kafka broadcast: {"invalidate": ["popular:*"]}
3. Regional processors clear popular tag caches
4. Next request rebuilds from fresh Spanner data
Elasticsearch Synchronization
Regional Index Updates:
1. Content tagged in Tokyo → Tokyo Kafka event
2. Tokyo ES consumer processes event within 30 seconds
Cross-Region ES Discovery:
1. London user searches → London ES query
2. If data not yet replicated, fallback to Spanner
3. Background job syncs ES data from Spanner every 5 minutes
ES Recovery:
1. ES cluster failure → Circuit breaker activates
2. Queries fall back to Spanner globally
3. Background job rebuilds ES index from Spanner
4. Circuit breaker reopens when ES passes health check
Consistency Guarantees
- Strong consistency: Tag creation, content tagging (via Spanner)
- Eventual consistency: Search indexes, analytics, cross-region caches
- Cache coherence: Invalidation events ensure stale data cleanup
🔍 Identify Bottlenecks
1. Deterministic UUID Race Conditions
Within a single tenant, simultaneous creation of the same tag from two regions hits the UNIQUE constraint. The second insert gets a constraint violation, fetches the existing tag, and returns it. No coordination needed, but the losing request pays ~85ms for the extra round-trip.
2. Redis Cache Stampede
If a regional Redis cluster fails, all tag lookups fall through to Spanner simultaneously. Circuit breakers limit concurrent Spanner queries and serve from a local in-memory cache of recently accessed tags while Redis recovers.
3. Elasticsearch Index Lag
New tag-item mappings take up to 30 seconds to appear in the search index. Users who tag content and immediately search may not see their items. The fallback path queries Spanner directly when results look incomplete.
4. Hot Tag Counter Contention
A viral tag could receive thousands of clicks per second in one region. Redis INCR handles this fine, but the 30-second batch flush means global totals lag behind. That's the trade-off we accepted when we chose eventual consistency for analytics.
5. Cross-Region Cache Cold Start
When a tag created in Tokyo gets accessed in London for the first time, the London cache miss adds ~40ms for the Spanner round-trip. Popular tags warm quickly through organic traffic. Long-tail tags always pay this cost on first access per region.
6. Kafka Consumer Lag During Spikes
If Elasticsearch indexing slows down, Kafka consumer lag grows. That's fine. Kafka holds onto the events, consumers catch up once ES recovers. You lose search freshness for a bit, not data.
📈 Performance Characteristics
Latency Breakdown
| Operation | Cache Hit | Cache Miss | Notes |
|---|---|---|---|
| Tag Creation (new) | N/A | ~100ms | Strong consistency globally |
| Tag Creation (duplicate) | ~85ms | ~85ms | Returns existing tag |
| Content Tagging | ~40ms | ~75ms | Strong consistency |
| Product Discovery | ~15ms | ~45ms | 95% cache hit rate |
| Tag Click Response | ~2ms | ~2ms | Immediate user feedback |
| Popular Dashboard | ~5ms | ~150ms | Real-time rankings |
Scalability Numbers
Per Region Capacity:
- Tag creations: ~1,000/second
- Content tagging: ~10,000/second
- Tag clicks: ~100,000/second
- Product queries: ~50,000/second
- Dashboard requests: ~10,000/second
Global Capacity (3 regions):
- Total: 3x regional capacity with linear scaling
- Cross-region consistency: ~2-3 minutes maximum
🛡️ Resilience & Failure Handling
Failure Taxonomy
| Component | Failure Mode | Detection | Recovery | Data Loss Risk |
|---|---|---|---|---|
| Redis (regional) | Node crash, network split | Health check (5s) | Circuit breaker, fall back to Spanner | None (Spanner is source of truth) |
| Kafka (regional) | Broker failure | Consumer lag spike | Partition rebalance, in-memory buffer | None (RF=3, acks=all) |
| Spanner (regional) | AZ outage | Automatic detection | Failover to healthy regions | None (synchronous replication) |
| Elasticsearch | Cluster degradation | Latency threshold | Circuit breaker, fall back to Spanner | None (ES is derived state) |
Retry Strategy
- Tag creation: No retry needed. Deterministic UUIDs mean a repeated attempt produces the same result.
- Click counting: Redis backup key preserves count if the batch flush to Spanner fails. Recovery adds the backup to the live counter.
- ES indexing: Kafka retains events. Failed indexing retries from the consumer offset.
Redis Failure
1. Circuit breaker activates
2. Fall back to direct Spanner queries (+~60ms latency)
3. Queue cache warming tasks
4. Resume normal operation when Redis recovers
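The Redis fallback hinges on a circuit breaker. A minimal count-based sketch, where the failure threshold and cooldown are illustrative numbers, not from the source:

```python
import time
from typing import Callable, Optional

class CircuitBreaker:
    """Open after `threshold` consecutive failures; probe again after `cooldown` seconds."""
    def __init__(self, threshold: int = 3, cooldown: float = 5.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, primary: Callable, fallback: Callable):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                return fallback()      # breaker open: go straight to Spanner
            self.opened_at = None      # cooldown elapsed: probe the primary again
        try:
            result = primary()
            self.failures = 0          # any success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            return fallback()

def redis_down():
    raise ConnectionError("Redis unreachable")

cb = CircuitBreaker(threshold=2, cooldown=60)
assert cb.call(redis_down, lambda: "from-spanner") == "from-spanner"
assert cb.call(redis_down, lambda: "from-spanner") == "from-spanner"
assert cb.opened_at is not None        # breaker is now open; Spanner absorbs reads
```

The same wrapper covers the Elasticsearch fallback described earlier; only the primary and fallback callables change.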
Kafka Failure
1. Regional processing continues with local buffering
2. Queue failed event publishes in memory
3. Replay queued events when Kafka recovers
4. No data loss due to local backup mechanisms
Spanner Regional Outage
1. Automatic failover to healthy regions
2. Writes continue from operational regions
3. Read traffic redirected to available replicas
4. Global consistency maintained
🚀 Deployment Strategy
Multi-Region Kubernetes:
- Three regional clusters (US, EU, Asia) behind global DNS with latency-based routing
- Each region runs: API pods, batch processors, Redis cluster, Kafka cluster, ES cluster
- Spanner spans all regions as a single global instance
Release Process:
- Rolling deployments with readiness probes. No traffic until the pod is healthy.
- Canary releases: 5% traffic for 15 minutes, automated rollback on error rate spike
- Blue-green for database migrations with backward-compatible schema changes
Technology Stack:
| Component | Technology |
|---|---|
| Orchestration | Kubernetes (GKE) |
| API Gateway | Envoy + gRPC |
| Monitoring | Prometheus + Grafana |
| CI/CD | ArgoCD + GitHub Actions |
| Load Balancer | Cloud L4 LB + Envoy sidecar (L7) |
📡 Observability
Key Metrics:
Tags: tag.created.rate, tag.duplicate.rate, creation.latency.p99
Discovery: discovery.cache_hit.rate, es.query.latency.p50/p99
Analytics: click.ingested.rate, batch.flush.rate, aggregation.lag.seconds
Cache: redis.hit_rate, redis.eviction.rate (per region)
Kafka: consumer.lag (per topic), produce.error.rate
Critical Alerts: page on sustained Kafka consumer lag, ES query latency breaching the circuit-breaker threshold, Redis hit-rate collapse in any region, and error-rate spikes during a rollout.
Tracing: Every tag operation gets a trace ID that follows it through the whole pipeline. Sample 1% of normal traffic, 100% of errors and anything slow. Jaeger or Grafana Tempo on the backend.
🔐 Security
Authentication: Tenant API keys plus JWT tokens with a 15-minute TTL. Every endpoint requires a valid tenant context. Cross-tenant access can't happen because tenant_id is baked into every query and constraint at the database level.
Encryption: TLS 1.3 on all external traffic. mTLS between services. Spanner handles encryption at rest out of the box. Redis runs with TLS for inter-node traffic.
Tenant Isolation: Every row, cache key, and Kafka message carries a tenant_id. The API layer enforces tenant_id filtering on all Spanner queries. Per-tenant rate limits keep noisy neighbors from ruining it for everyone else.
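Per-tenant rate limiting can be as simple as one token bucket per tenant_id. A sketch, with the rate and burst numbers chosen for illustration:

```python
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per tenant keeps a noisy tenant from starving the others.
buckets = defaultdict(lambda: TokenBucket(rate=10, burst=5))

def check(tenant_id: str) -> bool:
    return buckets[tenant_id].allow()

assert all(check("tenant-noisy") for _ in range(5))  # burst is allowed through
assert not check("tenant-noisy")                     # sixth immediate call is rejected
assert check("tenant-quiet")                         # other tenants are unaffected
```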
🎯 Key Design Decisions & Trade-offs
| Aspect | Decision | Alternative | Rationale |
|---|---|---|---|
| Tag Consistency | Strong consistency via Spanner | Eventual consistency | Business requirement for duplicate prevention |
| View Counting | Eventual consistency via buffering | Strong consistency | Performance over accuracy for analytics |
| Caching Strategy | Multi-tier with invalidation | Write-through only | Optimizes for read-heavy workload |
| Conflict Resolution | Deterministic UUIDs | Global locking | Eliminates coordination overhead |
| Regional Strategy | Regional processing + global DB | Full replication | Balances performance with consistency |
🏁 Conclusion
Here's what the three solutions actually buy us:
Tag operations land at ~100ms globally with zero duplicates. Deterministic UUIDs plus database constraints do the heavy lifting. No coordination protocol needed.
Product discovery hits ~18ms for 95% of queries through multi-tier caching. When caches miss, Elasticsearch picks it up. When ES is down, Spanner is always there.
Analytics responds in ~2ms to the user while batching counts in the background. The atomic backup mechanism means you don't lose clicks even when things go wrong.
The big takeaway: don't use one consistency model for everything. Tag creation needs strong consistency. Discovery is a caching problem. Analytics can be eventual. Match the guarantee to the actual requirement, and you avoid paying for consistency you don't need.
Explore the Technologies
Want to go deeper on any of these?
Core Technologies
| Technology | Role in This System | Learn More |
|---|---|---|
| Google Cloud Spanner | Primary database: global strong consistency, multi-region writes, interleaved tables for tag-content relationships | Spanner Deep Dive |
| Redis | Multi-tier caching: regional click buffering, tag lookup cache, sub-ms reads for 95% of queries | Redis Deep Dive |
| Kafka | Event streaming: tag operation events, async click analytics pipeline, cross-service decoupling | Kafka Deep Dive |
| Elasticsearch | Full-text tag search: popular tags aggregation, typeahead suggestions, discovery queries | Elasticsearch Deep Dive |
Infrastructure Patterns
| Pattern | Role in This System | Learn More |
|---|---|---|
| Caching Strategies | Write-through cache with invalidation, multi-tier regional caches, cache warming | Caching Strategies |
| Database Sharding | Spanner's split-based auto-sharding for global data distribution | Database Sharding |
| Replication & Consistency | Strong consistency for tag writes, eventual consistency for analytics counters | Replication & Consistency |
| Message Queues & Event Streaming | Kafka topic design for tag events and click analytics pipelines | Event Streaming |
Variations of this architecture run in production at multi-tenant SaaS platforms, content management systems, and social platforms that tag content at global scale.
#systemdesign #distributed-systems #saas #multitenancy #microservices #redis #architecture #kafka #scalability