Cassandra — Distributed wide-column store for massive write throughput
Category: Databases
Apache Cassandra is a distributed, wide-column NoSQL database designed to handle large amounts of data across many commodity servers with no single point of failure. Originally developed at Facebook for inbox search, it combines the distributed design of Amazon's Dynamo with the data model of Google's Bigtable.
Use Cases for Cassandra
- Time-series data at scale
- IoT sensor data ingestion
- Messaging and chat history
- User activity tracking
- Product catalogs
- Write-heavy workloads
Pros of Cassandra
- Linear horizontal scalability
- No single point of failure (peer-to-peer)
- Tunable consistency levels per query
- Optimized for high write throughput
- Multi-datacenter replication built-in
Cons of Cassandra
- Limited query flexibility (no joins, only basic aggregations)
- Data modeling driven by query patterns
- Eventual consistency by default
- Operational complexity (compaction, tombstones, repairs)
- Read performance depends heavily on data model
When to Use Cassandra
- Need to handle millions of writes per second
- Data is naturally partitioned (time-series, per-user)
- Require multi-region active-active deployment
- Availability matters more than strong consistency
When Not to Use Cassandra
- Need complex queries with joins
- Dataset is small (< 10 GB)
- Require strong consistency for every read
- Ad-hoc querying is a primary use case
Alternatives to Cassandra
DynamoDB, MongoDB, PostgreSQL
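Cassandra's tunable consistency comes down to simple quorum arithmetic. A minimal sketch in plain Python (no driver involved; `N`, `W`, `R` are the replication factor and write/read consistency levels):

```python
# Cassandra's tunable consistency in one inequality: with replication
# factor N, a read is guaranteed to see the latest write whenever R + W > N.
def quorum(n):
    """Replica acks needed for a QUORUM read or write."""
    return n // 2 + 1

def read_sees_latest_write(n, w, r):
    return r + w > n

N = 3  # typical replication factor
assert read_sees_latest_write(N, quorum(N), quorum(N))  # QUORUM writes + QUORUM reads
assert not read_sees_latest_write(N, 1, 1)              # ONE/ONE may return stale data
```

This is why consistency level ONE maximizes availability and latency at the cost of possibly stale reads, while QUORUM on both sides buys read-your-writes behavior.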
ClickHouse — Column-oriented OLAP database for blazing-fast analytics
Category: Search & Analytics
ClickHouse is an open-source, column-oriented database management system designed for online analytical processing (OLAP). Developed at Yandex, it can process analytical queries over billions of rows in real time, making it one of the fastest analytics databases available.
Use Cases for ClickHouse
- Real-time analytics dashboards
- Ad tech and clickstream analysis
- Time-series data at scale
- Business intelligence queries
- Log analytics (alternative to Elasticsearch)
- A/B test result analysis
Pros of ClickHouse
- Extremely fast analytical queries (columnar storage)
- Excellent compression ratios
- Handles billions of rows with sub-second queries
- SQL-compatible query interface
- Supports materialized views and real-time aggregation
Cons of ClickHouse
- Not designed for OLTP or point lookups
- No full ACID transactions
- Updates and deletes are expensive (async mutations)
- Requires careful schema design for optimal performance
- Smaller ecosystem than PostgreSQL or Elasticsearch
When to Use ClickHouse
- Need sub-second queries over billions of rows
- Analytical/OLAP workloads with aggregations
- Real-time dashboards and reporting
- Want cost-effective alternative to managed analytics services
When Not to Use ClickHouse
- OLTP workloads with frequent updates/deletes
- Need full transaction support
- Small datasets that fit in PostgreSQL
- Full-text search (use Elasticsearch)
Alternatives to ClickHouse
Elasticsearch, PostgreSQL, Spark
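The columnar-storage advantage can be shown with a toy contrast (this is an illustration of the layout, not ClickHouse's actual MergeTree engine):

```python
# Toy contrast of row vs column layout: a SUM over one column only has to
# touch that column's array, not every field of every row. Compression
# wins for the same reason: each column holds values of one type.
rows = [{"ts": i, "user_id": i % 10, "revenue": i * 0.5} for i in range(1_000)]

# Row store: the aggregate walks whole rows.
total_rows = sum(r["revenue"] for r in rows)

# Column store: the same data pivoted into one contiguous array per column.
columns = {name: [r[name] for r in rows] for name in rows[0]}
total_cols = sum(columns["revenue"])

assert total_rows == total_cols
```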
CockroachDB — Distributed SQL database surviving any failure
Category: Databases
CockroachDB is a distributed SQL database designed to be resilient, scalable, and globally consistent. Inspired by Google's Spanner, it provides full ACID transactions across a distributed cluster while using a PostgreSQL-compatible SQL interface, making migration from PostgreSQL relatively straightforward.
Use Cases for CockroachDB
- Multi-region transactional applications
- Financial systems requiring strong consistency
- Global SaaS platforms
- Replacing sharded PostgreSQL
- Disaster recovery without manual failover
Pros of CockroachDB
- Distributed ACID transactions
- Automatic horizontal scaling and rebalancing
- PostgreSQL-compatible wire protocol
- Multi-region with locality-aware reads
- Survives node, rack, and datacenter failures
Cons of CockroachDB
- Higher write latency due to consensus protocol
- Not fully PostgreSQL-compatible (some extensions missing)
- Operational learning curve for distributed SQL
- More expensive than single-node PostgreSQL for small datasets
When to Use CockroachDB
- Need horizontal scaling with SQL and ACID
- Multi-region deployment with strong consistency
- Want automatic failover without manual intervention
- Outgrowing a single PostgreSQL instance
When Not to Use CockroachDB
- Single-region with low data volume
- Ultra-low latency requirements (single-node PG is faster)
- Need full PostgreSQL extension ecosystem
- Budget-constrained small projects
Alternatives to CockroachDB
PostgreSQL, Cassandra, DynamoDB
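The consensus-related write latency has a simple shape worth internalizing. Under a Raft-style protocol, a write commits once a majority of replicas acknowledge it, so commit latency is set by the slowest member of the fastest majority. A sketch (illustrative numbers only):

```python
# Raft-style commit: a write is durable once a majority of replicas ack it,
# so commit latency equals the latency of the slowest replica in the
# fastest majority. One slow node is tolerated; wide geography is not free.
def commit_latency_ms(replica_rtts_ms):
    majority = len(replica_rtts_ms) // 2 + 1
    return sorted(replica_rtts_ms)[majority - 1]

assert commit_latency_ms([5, 7, 120]) == 7    # distant third replica doesn't block
assert commit_latency_ms([5, 60, 120]) == 60  # cross-region quorum raises latency
```

This is also why locality-aware placement matters: keeping a quorum of replicas near the writer keeps the majority-th latency low.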
DynamoDB — Fully managed NoSQL database with single-digit millisecond latency
Category: Databases
Amazon DynamoDB is a fully managed NoSQL key-value and document database that delivers single-digit millisecond performance at any scale. It was designed based on the principles from Amazon's Dynamo paper and has been a core building block of AWS services since 2012.
Use Cases for DynamoDB
- Serverless application backends
- Gaming leaderboards and player state
- Shopping carts and user preferences
- IoT data ingestion
- Session management
- Ad tech and real-time bidding
Pros of DynamoDB
- Fully managed — zero operational overhead
- Single-digit millisecond latency at any scale
- Automatic scaling with on-demand capacity
- Optional DAX caching layer for microsecond reads
- Global Tables for multi-region replication
Cons of DynamoDB
- Vendor lock-in to AWS
- Expensive at large scale compared to self-managed
- Default quota of 20 global secondary indexes per table
- Item size limited to 400 KB
- Complex pricing model (RCU/WCU)
When to Use DynamoDB
- Building on AWS and want zero ops overhead
- Need predictable single-digit millisecond performance
- Access patterns are known and well-defined
- Serverless architectures with Lambda
When Not to Use DynamoDB
- Need complex relational queries or joins
- Want multi-cloud or vendor-neutral solution
- Data model is highly relational
- Need full-text search capabilities
Alternatives to DynamoDB
Cassandra, MongoDB, Redis
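The RCU/WCU pricing model is mechanical once you know the units: one WCU covers one write per second up to 1 KB, and one RCU covers one strongly consistent read per second up to 4 KB (eventually consistent reads cost half). A small calculator:

```python
import math

# DynamoDB capacity math (per AWS documentation):
#   1 WCU = one write/sec for an item up to 1 KB
#   1 RCU = one strongly consistent read/sec up to 4 KB
#           (eventually consistent reads cost half)
def wcu(item_kb):
    return math.ceil(item_kb / 1)

def rcu(item_kb, strongly_consistent=True):
    units = math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)

assert wcu(3.5) == 4                               # 3.5 KB write: 4 WCU
assert rcu(7) == 2                                 # 7 KB strong read: 2 RCU
assert rcu(7, strongly_consistent=False) == 1      # eventual: half the cost
```

Large items are billed in these rounded-up units, which is one reason the 400 KB item limit and item-size discipline matter for cost.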
Elasticsearch — Distributed search and analytics engine built on Apache Lucene
Category: Search & Analytics
Elasticsearch is a distributed, RESTful search and analytics engine built on top of Apache Lucene. It stores JSON documents and makes them searchable in near real-time, excelling at full-text search, log analytics, and complex aggregations across large datasets.
Use Cases for Elasticsearch
- Full-text search across large datasets
- Log and event data analysis (ELK stack)
- E-commerce product search
- Application performance monitoring
- Security analytics (SIEM)
- Autocomplete and suggestions
Pros of Elasticsearch
- Near real-time full-text search
- Horizontally scalable with automatic sharding
- Rich query DSL with aggregations
- Schema-free JSON documents
- Powerful text analysis and tokenization
Cons of Elasticsearch
- Not a primary data store (no ACID transactions)
- High memory consumption for indexing
- Split-brain risk without careful cluster config
- Complex capacity planning and tuning
- License changes (SSPL) may affect deployment
When to Use Elasticsearch
- Need full-text search with relevance scoring
- Log aggregation and analytics (ELK/EFK stack)
- Real-time search across millions of documents
- Complex aggregations and faceted search
When Not to Use Elasticsearch
- Primary data store for transactional workloads
- Simple key-value lookups
- Strong consistency requirements
- Limited infrastructure budget (resource-hungry)
Alternatives to Elasticsearch
ClickHouse, PostgreSQL, MongoDB
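The data structure behind Elasticsearch's near real-time search is Lucene's inverted index: a map from token to the documents containing it. A toy version (real Lucene also stores positions, frequencies, and norms for scoring):

```python
from collections import defaultdict

# Toy inverted index: token -> set of document ids containing it.
docs = {1: "quick brown fox", 2: "lazy brown dog", 3: "quick red dog"}
index = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.split():
        index[token].add(doc_id)

def search(*terms):
    """AND query: intersect the posting lists of all terms."""
    postings = [index[t] for t in terms]
    return set.intersection(*postings) if postings else set()

assert search("quick") == {1, 3}
assert search("brown", "dog") == {2}
```

Tokenization and analysis (stemming, lowercasing, n-grams) happen before indexing, which is what makes relevance-aware full-text search possible.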
etcd — Distributed key-value store for shared configuration and service discovery
Category: Coordination
etcd is a distributed, reliable key-value store designed for the most critical data of a distributed system. Written in Go, it uses the Raft consensus algorithm to ensure strong consistency across all nodes in a cluster, making it the backbone of Kubernetes and many other cloud-native systems.
Use Cases for etcd
- Kubernetes cluster state storage
- Service discovery
- Distributed configuration
- Leader election
- Feature flag management
- Distributed locking
Pros of etcd
- Strong consistency via Raft consensus
- Simple HTTP/gRPC API
- Watch API for real-time change notifications
- Lease-based TTL for ephemeral keys
- Foundation of Kubernetes control plane
Cons of etcd
- Not designed for large data volumes (recommended < 8GB)
- Write latency depends on cluster size and network
- All data must fit in memory
- Limited query capabilities (prefix-based only)
- Compaction needed to prevent unbounded growth
When to Use etcd
- Kubernetes or cloud-native infrastructure
- Need strongly consistent configuration store
- Service discovery with health checking
- Distributed coordination in Go-based systems
When Not to Use etcd
- General-purpose data storage
- Large datasets (> 8 GB)
- High write throughput requirements
- Complex query patterns
Alternatives to etcd
ZooKeeper, Kafka
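etcd's lease mechanism is what makes ephemeral service registrations work: a key attached to a lease vanishes when the lease expires unless the owner keeps renewing it. A simplified single-node sketch (the real etcd replicates this state via Raft):

```python
import time

# Sketch of lease-based TTL: a key bound to a lease disappears once the
# lease expires. Services re-register by renewing the lease (heartbeat).
class LeaseStore:
    def __init__(self):
        self._data = {}  # key -> (value, expiry or None)

    def put(self, key, value, ttl=None):
        expiry = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expiry)

    def get(self, key):
        value, expiry = self._data.get(key, (None, None))
        if expiry is not None and time.monotonic() > expiry:
            del self._data[key]  # lease expired: the key is gone
            return None
        return value

store = LeaseStore()
store.put("/services/api/10.0.0.5", "alive", ttl=0.05)
assert store.get("/services/api/10.0.0.5") == "alive"
time.sleep(0.06)
assert store.get("/services/api/10.0.0.5") is None  # instance presumed dead
```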
Flink — Stateful stream processing framework for real-time computation
Category: Stream Processing
Apache Flink is a distributed stream processing framework designed for stateful computations over unbounded data streams. Unlike micro-batch systems, Flink processes events one at a time with true streaming semantics, achieving low latency while maintaining exactly-once state consistency.
Use Cases for Flink
- Real-time fraud detection
- Event-driven applications
- Real-time ETL pipelines
- Complex event processing (CEP)
- Real-time machine learning inference
- Continuous monitoring and alerting
Pros of Flink
- True event-time processing with watermarks
- Exactly-once state consistency
- Low-latency stream processing
- Sophisticated windowing (tumbling, sliding, session)
- Unified batch and stream processing
Cons of Flink
- Steep learning curve
- Complex cluster management
- Checkpointing can impact latency under load
- Smaller community than Spark
- Resource-intensive for stateful operations
When to Use Flink
- Need true real-time processing (low latency)
- Complex event patterns with event-time semantics
- Exactly-once processing guarantees required
- Stateful stream processing (aggregations, joins)
When Not to Use Flink
- Simple batch processing jobs
- Small data volumes that don't justify complexity
- Team lacks streaming expertise
- Ad-hoc analytical queries (use Spark or ClickHouse)
Alternatives to Flink
Spark, Kafka Streams, Kafka
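Event-time windows and watermarks are Flink's core idea: events are bucketed by their own timestamps, and a window only fires once the watermark (a bound on how late events may still arrive) passes its end. A heavily simplified sketch with a global watermark and fixed allowed lateness:

```python
from collections import defaultdict

# Tumbling event-time windows with a watermark: a window fires only once
# the watermark passes its end, so bounded-late events are still counted.
WINDOW = 10  # window size in seconds

def window_start(ts):
    return ts - ts % WINDOW

windows = defaultdict(int)  # window start -> running sum
fired = {}                  # emitted window results
watermark = 0

def on_event(ts, value, lateness=5):
    global watermark
    windows[window_start(ts)] += value
    watermark = max(watermark, ts - lateness)  # watermark trails event time
    for start in list(windows):
        if start + WINDOW <= watermark:        # window complete: emit it
            fired[start] = windows.pop(start)

# Event with ts=8 arrives out of order, after ts=12, but within lateness.
for ts, v in [(1, 1), (3, 1), (12, 1), (8, 1), (21, 1)]:
    on_event(ts, v)

assert fired == {0: 3}  # window [0, 10) fired, late event included
```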
Kafka Streams — Lightweight stream processing library built on Apache Kafka
Category: Stream Processing
Kafka Streams is a client library for building real-time, event-driven applications and microservices that process data stored in Apache Kafka. Unlike Flink or Spark, it doesn't require a separate processing cluster — it runs as a regular Java application, making deployment and scaling straightforward.
Use Cases for Kafka Streams
- Real-time data transformation
- Event-driven microservices
- Stream-table joins
- Real-time aggregations and windowing
- Data enrichment pipelines
- Lightweight ETL within Kafka
Pros of Kafka Streams
- No separate cluster needed — runs as a library
- Exactly-once processing semantics
- Elastic scaling via Kafka consumer groups
- Interactive queries on local state stores
- Simple deployment (just a JVM application)
Cons of Kafka Streams
- Tied to Kafka (input and output must be Kafka topics)
- JVM only (Java/Kotlin/Scala)
- Limited to Kafka's partitioning model
- State stores can grow large on disk
- Less feature-rich than Flink for complex processing
When to Use Kafka Streams
- Already using Kafka and need simple stream processing
- Want to avoid managing a separate processing cluster
- Building event-driven microservices
- Need exactly-once within the Kafka ecosystem
When Not to Use Kafka Streams
- Processing data from non-Kafka sources
- Need advanced CEP or event-time processing
- Python or non-JVM tech stack
- Complex multi-stream joins and windowing
Alternatives to Kafka Streams
Kafka, Flink, Spark
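The central abstraction here is the stream/table duality: a KStream is every record, while a KTable is the latest value per key (the stream compacted into its changelog). Plain Python conveys the idea; real Kafka Streams backs the table with a local RocksDB state store:

```python
# KStream vs KTable duality: a stream is every record in order; a table
# is the latest value per key, i.e. the stream reduced by upsert.
events = [("user1", "login"), ("user2", "login"), ("user1", "logout")]

kstream = list(events)  # the full event history
ktable = {}
for key, value in events:
    ktable[key] = value  # upsert: latest value per key wins

assert len(kstream) == 3
assert ktable == {"user1": "logout", "user2": "login"}
```

Aggregations, joins, and interactive queries in Kafka Streams are all built on this table view of a keyed stream.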
Kafka — Distributed event streaming platform for real-time data pipelines
Category: Messaging
Apache Kafka is a distributed event streaming platform capable of handling trillions of events per day. Originally developed at LinkedIn, Kafka is designed as a distributed commit log where messages are persisted to disk and replicated across brokers for fault tolerance.
Use Cases for Kafka
- Event-driven microservices
- Real-time data pipelines
- Log aggregation
- Change data capture (CDC)
- Stream processing
- Activity tracking and metrics
Pros of Kafka
- Extremely high throughput (millions of messages/sec)
- Durable message storage with configurable retention
- Horizontal scaling via partitions
- Strong ordering guarantee within partitions
- Rich ecosystem (Connect, Streams, Schema Registry)
Cons of Kafka
- Operational complexity (ZooKeeper/KRaft, brokers, topics)
- Not ideal for low-latency request-reply patterns
- Consumer group rebalancing can cause pauses
- No built-in message routing or filtering
- Steep learning curve
When to Use Kafka
- Need durable, ordered event streaming
- Building event-driven or CQRS architectures
- High-throughput log or data pipeline ingestion
- Decoupling producers and consumers at scale
When Not to Use Kafka
- Simple task queues with few messages
- Need complex message routing (use RabbitMQ)
- Request-reply messaging patterns
- Small team without operational expertise
Alternatives to Kafka
RabbitMQ, Kafka Streams, Flink
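The per-partition ordering guarantee follows from how keyed records are assigned to partitions: same key, same partition, hence a total order per key. A sketch using CRC32 (the real default partitioner uses murmur2, but the principle is identical):

```python
import zlib

# Kafka-style key partitioning: records with the same key hash to the
# same partition, which is what yields per-key ordering.
NUM_PARTITIONS = 3

def partition_for(key):
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

partitions = [[] for _ in range(NUM_PARTITIONS)]
for key, value in [("order-1", "created"), ("order-2", "created"),
                   ("order-1", "paid"), ("order-1", "shipped")]:
    partitions[partition_for(key)].append((key, value))

# All of order-1's events landed in one partition, in produce order.
p = partition_for("order-1")
order1 = [v for k, v in partitions[p] if k == "order-1"]
assert order1 == ["created", "paid", "shipped"]
```

This is also why changing the partition count of a live topic breaks key-to-partition stability and should be planned carefully.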
Kong — Cloud-native API gateway built on NGINX
Category: API Infrastructure
Kong is a cloud-native, open-source API gateway and microservice management layer built on top of NGINX. It provides a centralized entry point for all API traffic, handling cross-cutting concerns like authentication, rate limiting, logging, and traffic management through a plugin architecture.
Use Cases for Kong
- Centralized API gateway
- Authentication and authorization (OAuth, JWT, API keys)
- Rate limiting and throttling
- Request/response transformation
- API analytics and monitoring
- Service mesh with Kong Mesh
Pros of Kong
- Built on NGINX — inherits high performance
- Rich plugin ecosystem (100+ plugins)
- Supports declarative and database-backed configuration
- Kubernetes-native with Ingress Controller
- Open-source core with enterprise features
Cons of Kong
- Adds latency compared to direct NGINX
- Plugin ecosystem quality varies
- Enterprise features require paid license
- Configuration complexity at scale
- Database dependency (Postgres/Cassandra) for some modes
When to Use Kong
- Need a centralized API gateway with auth and rate limiting
- Managing multiple APIs/microservices behind one entry point
- Kubernetes environments needing an ingress controller
- Want plugin-based extensibility without custom code
When Not to Use Kong
- Simple reverse proxy needs (use NGINX directly)
- Ultra-low latency where every microsecond matters
- Tight budget constraints (enterprise features are paid)
- Simple static site serving
Alternatives to Kong
NGINX, Kafka
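Rate limiting, one of Kong's most-used plugins, reduces to counting requests per consumer per time window. A fixed-window sketch of the idea (the actual plugin supports several counter backends, e.g. node-local or Redis-shared, and sliding-window variants):

```python
import time
from collections import defaultdict

# Fixed-window rate limiter: count requests per (consumer, window) and
# reject once the count exceeds the limit for that window.
LIMIT = 3    # allowed requests
WINDOW = 60  # window length in seconds

counters = defaultdict(int)

def allow(consumer, now=None):
    now = time.time() if now is None else now
    window = int(now // WINDOW)
    counters[(consumer, window)] += 1
    return counters[(consumer, window)] <= LIMIT

t = 0
assert all(allow("alice", t) for _ in range(3))
assert not allow("alice", t)       # 4th request in the window: rejected
assert allow("alice", t + WINDOW)  # next window: counter resets
```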
Memcached — High-performance distributed memory caching system
Category: Caching
Memcached is a free, open-source, high-performance distributed memory caching system designed to speed up dynamic web applications by reducing database load. It stores data as simple key-value pairs in RAM, using a hash table distributed across multiple machines.
Use Cases for Memcached
- Database query result caching
- HTML fragment caching
- Session storage
- API response caching
- Object caching for web apps
Pros of Memcached
- Simple and predictable — just key-value strings
- Multi-threaded architecture for high throughput
- Client-side consistent hashing for easy horizontal scaling
- Minimal memory overhead per item
- Mature and battle-tested at massive scale
Cons of Memcached
- No persistence — data lost on restart
- Only supports string values (no rich data structures)
- No built-in replication
- No pub/sub or advanced features
- Limited to key-value operations
When to Use Memcached
- Simple caching of serialized objects or query results
- Need multi-threaded performance for high concurrency
- Want a lightweight, low-overhead cache layer
- Already have a durable primary store
When Not to Use Memcached
- Need data persistence or durability
- Require rich data structures (use Redis)
- Need pub/sub or stream processing
- Want built-in replication and failover
Alternatives to Memcached
Redis, NGINX
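Memcached servers know nothing about each other; the client library spreads keys across them, typically with consistent hashing so that adding a node remaps only a fraction of keys. A self-contained sketch of a hash ring with virtual nodes:

```python
import bisect
import hashlib

# Consistent-hash ring with virtual nodes: keys map to the first node
# clockwise from their hash, so adding a server moves only ~1/N of keys.
def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((h(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._points = [p for p, _ in self._ring]

    def node_for(self, key):
        i = bisect.bisect(self._points, h(key)) % len(self._ring)
        return self._ring[i][1]

ring = Ring(["cache1", "cache2", "cache3"])
assert ring.node_for("user:42") == ring.node_for("user:42")  # stable mapping

bigger = Ring(["cache1", "cache2", "cache3", "cache4"])
moved = sum(ring.node_for(f"k{i}") != bigger.node_for(f"k{i}")
            for i in range(1000))
assert moved < 500  # only a fraction of keys move when a node is added
```

With naive modulo hashing, adding a node would remap nearly all keys and cause a cache stampede; the ring keeps churn proportional to 1/N.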
MongoDB — Document database for flexible schemas and rapid development
Category: Databases
MongoDB is a popular document-oriented NoSQL database that stores data in flexible, JSON-like BSON documents. Unlike relational databases, MongoDB doesn't require a predefined schema, allowing fields to vary from document to document and making it ideal for applications with evolving data models.
Use Cases for MongoDB
- Content management systems
- Product catalogs with varied attributes
- Mobile app backends
- Real-time analytics
- User profiles and personalization
- Prototyping and rapid iteration
Pros of MongoDB
- Flexible schema — no migrations needed
- Rich query language with aggregation pipeline
- Horizontal scaling via built-in sharding
- Native JSON document model
- Multi-document ACID transactions
Cons of MongoDB
- Joins are limited ($lookup supports only left outer joins and can be costly)
- Working set should fit in RAM for good performance
- Denormalization leads to data duplication
- Write amplification with large documents
- Sharding requires careful key selection
When to Use MongoDB
- Schema evolves frequently (startups, MVPs)
- Data is naturally document-shaped (JSON)
- Need flexible querying over semi-structured data
- Rapid prototyping with changing requirements
When Not to Use MongoDB
- Highly relational data with many joins
- Need strong multi-row transactions across collections
- Write-heavy append-only workloads (prefer Cassandra)
- Strict schema enforcement is required
Alternatives to MongoDB
PostgreSQL, DynamoDB, Elasticsearch
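The embed-vs-reference decision is the heart of document modeling: embedding trades duplicated data for single-fetch reads, while referencing keeps data normalized at the cost of extra lookups. A toy contrast (dicts stand in for BSON documents):

```python
# Referenced (relational-style): reading an order with its user takes a
# second lookup, the client-side equivalent of a join.
users = {1: {"name": "Ada"}}
orders_ref = [{"user_id": 1, "item": "book"}]
joined = [{**o, "user": users[o["user_id"]]} for o in orders_ref]

# Embedded (document-style): one self-contained document, one read,
# at the cost of duplicating user data across orders.
orders_embedded = [{"item": "book", "user": {"name": "Ada"}}]

assert joined[0]["user"] == orders_embedded[0]["user"]
assert joined[0]["item"] == orders_embedded[0]["item"]
```

As a rule of thumb, embed data that is read together and changes together; reference data that is shared widely or updated independently.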
NGINX — High-performance web server, reverse proxy, and load balancer
Category: API Infrastructure
NGINX is an open-source, high-performance HTTP server and reverse proxy that powers a significant portion of the internet. Its event-driven, non-blocking architecture allows it to handle tens of thousands of simultaneous connections with minimal memory usage, making it the go-to choice for web serving and load balancing.
Use Cases for NGINX
- Reverse proxy and load balancing
- SSL/TLS termination
- Static file serving
- API gateway
- Rate limiting and access control
- HTTP caching
Pros of NGINX
- Extremely high concurrency (event-driven, non-blocking)
- Low memory footprint
- Battle-tested at massive scale
- Rich module ecosystem
- Supports HTTP, TCP, and UDP load balancing
Cons of NGINX
- Configuration can be complex for advanced use cases
- Dynamic reconfiguration requires reload
- Limited built-in API management features
- Free version lacks some enterprise features
- Lua scripting for advanced logic adds complexity
When to Use NGINX
- Need a reverse proxy in front of application servers
- SSL termination and HTTP/2 support
- Serving static files alongside dynamic content
- Simple load balancing without a service mesh
When Not to Use NGINX
- Need a full API gateway with auth, rate limiting, analytics
- Service mesh with dynamic service discovery
- Complex traffic routing requiring programmatic control
- GraphQL-specific gateway features
Alternatives to NGINX
Kong, Kafka
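The concurrency claim follows from per-connection memory cost: a thread-per-connection server pays a full thread stack per client, while an event-driven server pays only a small state object. Back-of-envelope arithmetic with illustrative numbers (actual figures vary by OS and configuration):

```python
# Why event-driven servers handle tens of thousands of connections:
# per-connection cost is a small state struct, not a thread stack.
# The constants below are illustrative, not measured NGINX figures.
THREAD_STACK_KB = 1024  # typical default thread stack: ~1 MB
EVENT_STATE_KB = 1      # per-connection state in an event loop: ~1 KB

def memory_mb(connections, per_conn_kb):
    return connections * per_conn_kb / 1024

assert memory_mb(10_000, THREAD_STACK_KB) == 10_000  # ~10 GB of stacks
assert memory_mb(10_000, EVENT_STATE_KB) < 10        # ~10 MB of state
```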
PostgreSQL — Advanced open-source relational database with extensibility
Category: Databases
PostgreSQL is a mature open-source relational database, widely billed as the most advanced of its kind and known for its reliability, robustness, and rich feature set. It supports complex SQL queries, foreign keys, triggers, views, stored procedures, and a wide range of data types including JSON, arrays, and custom types.
Use Cases for PostgreSQL
- OLTP transactional workloads
- Complex queries with joins and aggregations
- Geospatial data (PostGIS)
- Full-text search
- JSON/JSONB document storage
- Time-series data (with TimescaleDB)
Pros of PostgreSQL
- Full ACID compliance with MVCC
- Extremely extensible (custom types, functions, extensions)
- Excellent SQL standard compliance
- Rich indexing (B-tree, GIN, GiST, BRIN)
- Strong community and ecosystem
Cons of PostgreSQL
- Vertical scaling has limits
- Write-heavy workloads can struggle without tuning
- Replication is asynchronous by default
- Sharding requires extensions like Citus
- VACUUM overhead for long-running transactions
When to Use PostgreSQL
- Need strong consistency and ACID transactions
- Complex relational data with joins
- Mixed workloads (relational + JSON + full-text search)
- Geospatial queries
When Not to Use PostgreSQL
- Need automatic horizontal sharding at massive scale
- Simple key-value access patterns only
- Ultra-low latency requirements (< 1ms)
- Append-only time-series at millions of events/sec
Alternatives to PostgreSQL
CockroachDB, MongoDB, Cassandra
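MVCC, mentioned above, is also the source of the VACUUM overhead: every update creates a new row version tagged with the transaction that created it (xmin) and, later, the one that deleted it (xmax), and dead versions linger until vacuumed. A toy visibility rule (simplified: real PostgreSQL also checks transaction commit status and snapshot contents):

```python
# Toy MVCC visibility: a row version is visible to a snapshot if it was
# created by an earlier transaction and not yet deleted by one.
def visible(row, snapshot_xid):
    created = row["xmin"] <= snapshot_xid
    deleted = row["xmax"] is not None and row["xmax"] <= snapshot_xid
    return created and not deleted

versions = [
    {"value": "v1", "xmin": 100, "xmax": 200},   # superseded by tx 200
    {"value": "v2", "xmin": 200, "xmax": None},  # current live version
]

# An old snapshot still sees v1; a newer one sees v2. Readers never block
# writers because each snapshot reads its own consistent set of versions.
assert [r["value"] for r in versions if visible(r, 150)] == ["v1"]
assert [r["value"] for r in versions if visible(r, 250)] == ["v2"]
```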
RabbitMQ — Flexible message broker with advanced routing capabilities
Category: Messaging
RabbitMQ is a widely deployed open-source message broker that implements the Advanced Message Queuing Protocol (AMQP). It acts as an intermediary between applications, providing sophisticated routing, reliable delivery, and flexible patterns for distributed systems.
Use Cases for RabbitMQ
- Task queues and background job processing
- Request-reply messaging patterns
- Complex message routing (topic, headers, fanout)
- Microservice communication
- Delayed and scheduled message delivery
- Priority queues
Pros of RabbitMQ
- Rich routing with exchanges (direct, topic, fanout, headers)
- Supports multiple protocols (AMQP, MQTT, STOMP)
- Message acknowledgment and dead-letter queues
- Priority queues and TTL support
- Easy to set up and operate for small-medium scale
Cons of RabbitMQ
- Lower throughput than Kafka for streaming workloads
- Messages are deleted after consumption (not replayable)
- Clustering can be complex at large scale
- Memory pressure under high message backlog
- Not designed for event sourcing or log-based systems
When to Use RabbitMQ
- Need flexible message routing patterns
- Task queues with acknowledgment and retries
- Request-reply or RPC patterns over messaging
- Small-to-medium scale with simpler operations
When Not to Use RabbitMQ
- Need event replay or long-term message storage
- Ultra-high throughput streaming (millions/sec)
- Event sourcing or CQRS architectures
- Need strong message ordering across partitions
Alternatives to RabbitMQ
Kafka, Redis
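Topic-exchange routing is worth a concrete look: bindings use `*` to match exactly one dot-separated word and `#` to match zero or more. A sketch via regex translation (simplified: this version requires `#` to consume the preceding dot, so `a.#` would not match a bare `a` as real AMQP does):

```python
import re

# AMQP topic-exchange matching, approximated with a regex translation:
#   '*' -> one word ([^.]+),  '#' -> any run of words (.*)
def matches(binding, routing_key):
    pattern = (binding.replace(".", r"\.")
                      .replace("*", r"[^.]+")
                      .replace("#", r".*"))
    return re.fullmatch(pattern, routing_key) is not None

assert matches("*.error", "kern.error")           # '*' matches one word
assert not matches("*.error", "kern.disk.error")  # and exactly one word
assert matches("kern.#", "kern.disk.error")       # '#' spans many words
```

Direct, fanout, and headers exchanges are simpler special cases of this routing step from exchange to bound queues.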
Redis — In-memory data structure store for caching and real-time workloads
Category: Caching
Redis is an open-source, in-memory key-value store known for its exceptional speed and versatile data structures. Unlike simple key-value caches, Redis supports strings, hashes, lists, sets, sorted sets, bitmaps, HyperLogLogs, and streams — making it far more than just a cache.
Use Cases for Redis
- Session storage
- Rate limiting
- Leaderboards & rankings
- Real-time analytics
- Pub/Sub messaging
- Distributed locks
- Caching hot data
Pros of Redis
- Sub-millisecond latency
- Rich data structures (lists, sets, sorted sets, hashes, streams)
- Built-in replication and Lua scripting
- Persistence options (RDB snapshots, AOF)
- Cluster mode for horizontal scaling
Cons of Redis
- RAM-bound — dataset must fit in memory
- Single-threaded command execution
- Cluster mode adds operational complexity
- No built-in query language for complex queries
When to Use Redis
- Need sub-millisecond reads/writes
- Caching hot data in front of a database
- Real-time counters, leaderboards, or rate limiters
- Session management across multiple app servers
When Not to Use Redis
- Dataset significantly exceeds available RAM
- Need full ACID transactions with joins
- Primary long-term storage for critical data
- Complex relational queries
Alternatives to Redis
Memcached, DynamoDB, PostgreSQL
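The leaderboard use case maps directly onto Redis sorted sets: ZADD upserts a (member, score) pair and ZREVRANGE returns members ranked by score. A toy emulation of those two commands (Redis maintains the ordering incrementally with a skip list; here we simply sort on read):

```python
# Sorted-set leaderboard sketch: ZADD is an upsert of (member, score);
# ZREVRANGE returns members in descending score order.
scores = {}

def zadd(member, score):
    scores[member] = score

def zrevrange(start, stop):  # inclusive stop, matching the Redis command
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[start:stop + 1]

zadd("alice", 300)
zadd("bob", 150)
zadd("carol", 225)
zadd("bob", 400)  # updating a score re-ranks the member

assert zrevrange(0, 1) == ["bob", "alice"]  # top two players
```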
Spark — Unified analytics engine for large-scale data processing
Category: Stream Processing
Apache Spark is a unified analytics engine for large-scale data processing, providing high-level APIs in Java, Scala, Python, and R. It was developed to overcome the limitations of Hadoop MapReduce by introducing in-memory computing, which makes iterative algorithms and interactive queries orders of magnitude faster.
Use Cases for Spark
- Large-scale batch ETL
- Machine learning pipelines (MLlib)
- Interactive SQL analytics (Spark SQL)
- Graph processing (GraphX)
- Micro-batch stream processing (Structured Streaming)
- Data lake processing
Pros of Spark
- Unified API for batch, streaming, SQL, and ML
- In-memory processing for iterative workloads
- Rich ecosystem and massive community
- Supports multiple languages (Scala, Python, Java, R, SQL)
- Catalyst optimizer for SQL query optimization
Cons of Spark
- Micro-batch streaming adds latency (seconds, not milliseconds)
- High memory consumption
- Complex tuning for optimal performance
- Not ideal for small datasets
- Shuffle operations can be expensive
When to Use Spark
- Large-scale batch data processing
- Need unified batch + streaming + ML platform
- Interactive SQL queries on large datasets
- Team is comfortable with JVM or Python ecosystem
When Not to Use Spark
- Need true sub-second stream processing (use Flink)
- Simple ETL that fits in a single machine
- Real-time event processing requiring low latency
- OLTP workloads
Alternatives to Spark
Flink, Kafka Streams, ClickHouse
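The programming model is easiest to see in the canonical word count, written here in the flatMap / map / reduceByKey shape that Spark distributes across executors (plain Python lists stand in for an RDD or DataFrame):

```python
# Word count in Spark's shape: flatMap lines to words, map to (word, 1)
# pairs, then reduce by key with addition. Spark runs each stage in
# parallel across partitions; here everything is local.
lines = ["spark makes batch fast", "flink makes streams fast"]

words = [w for line in lines for w in line.split()]  # flatMap
pairs = [(w, 1) for w in words]                      # map to (key, value)

counts = {}
for word, n in pairs:                                # reduceByKey(+)
    counts[word] = counts.get(word, 0) + n

assert counts["fast"] == 2
assert counts["makes"] == 2
assert counts["spark"] == 1
```

The reduceByKey step is where the expensive shuffle happens in a real cluster: pairs with the same key must be moved to the same executor before they can be combined.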
ZooKeeper — Centralized coordination service for distributed systems
Category: Coordination
Apache ZooKeeper is a centralized service for maintaining configuration information, naming, distributed synchronization, and group services. It provides a simple set of primitives that distributed applications can build upon to implement higher-level coordination patterns like leader election, distributed locks, and barriers.
Use Cases for ZooKeeper
- Leader election
- Distributed configuration management
- Service discovery
- Distributed locking
- Cluster membership tracking
- Barrier synchronization
Pros of ZooKeeper
- Strong consistency via ZAB consensus protocol
- Ordered, sequential operations
- Ephemeral nodes for failure detection
- Watch mechanism for real-time notifications
- Battle-tested in production (Kafka, HBase, Solr)
Cons of ZooKeeper
- Not designed for large data storage
- Write throughput is limited (leader bottleneck)
- Operational complexity with quorum management
- Java-based — can have GC pause issues
- Being replaced by newer alternatives (etcd, KRaft)
When to Use ZooKeeper
- Need leader election for distributed services
- Existing Hadoop/Kafka ecosystem dependency
- Distributed configuration that must be consistent
- Service coordination requiring strong ordering
When Not to Use ZooKeeper
- General-purpose data storage
- High write throughput requirements
- Greenfield projects (consider etcd instead)
- Simple service discovery (consider Consul)
Alternatives to ZooKeeper
etcd, Kafka
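ZooKeeper's leader-election recipe builds on ephemeral sequential znodes: each candidate creates one, the lowest sequence number is the leader, and each candidate watches only its predecessor to avoid a thundering herd when the leader dies. A single-process sketch of the recipe (a dict stands in for the znode tree; watches are omitted):

```python
import itertools

# Leader election via ephemeral sequential znodes: the candidate with the
# lowest sequence number is the leader; when its session dies, the znode
# vanishes and leadership passes to the next-lowest candidate.
seq = itertools.count(0)
znodes = {}  # znode path -> owning session

def create_candidate(session):
    path = f"/election/n_{next(seq):010d}"  # zero-padded sequence number
    znodes[path] = session
    return path

def leader():
    return znodes[min(znodes)]  # lowest sequence number wins

a = create_candidate("server-a")
b = create_candidate("server-b")
assert leader() == "server-a"

del znodes[a]  # session expires: ephemeral znode is removed
assert leader() == "server-b"  # leadership fails over automatically
```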