Amazon SQS — AWS's managed message queue. No servers, no clusters, no headaches.
Category: Messaging
## Why It Exists
Use Cases for Amazon SQS
- Decoupling microservices through async message passing
- Distributing work across a fleet of consumers
- Buffering requests when traffic spikes hit
- Fan-out with SNS-to-SQS for event-driven architectures
- Dead-letter queues for catching and debugging failed messages
- Serverless event processing with Lambda triggers
Pros of Amazon SQS
- Fully managed. No infrastructure to provision, patch, or scale.
- Virtually unlimited throughput on standard queues, no pre-provisioning needed
- At-least-once delivery with automatic retries and dead-letter queues
- FIFO queues give you exactly-once processing and strict ordering
- Deep AWS integration (Lambda, SNS, EventBridge, Step Functions)
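The at-least-once and dead-letter behavior above can be sketched with a toy in-memory queue. This is not the AWS API (no boto3, no visibility timeout clock — redelivery is modeled as immediate); it only illustrates why consumers must delete explicitly and how a redrive policy catches poison messages.

```python
# Toy model of SQS semantics: at-least-once delivery means a received-but-
# not-deleted message comes back, and a redrive policy moves a message to
# the dead-letter queue once it has been received too many times.
class ToyQueue:
    def __init__(self, max_receive_count=3):
        self.messages = []
        self.dead_letter = []
        self.max_receive_count = max_receive_count

    def send(self, body):
        self.messages.append({"body": body, "receives": 0})

    def receive(self):
        while self.messages:
            msg = self.messages[0]
            msg["receives"] += 1
            if msg["receives"] > self.max_receive_count:
                # Redrive: the poison message goes to the DLQ for debugging.
                self.dead_letter.append(self.messages.pop(0)["body"])
                continue
            return msg
        return None

    def delete(self, msg):
        # Consumers must delete explicitly after successful processing.
        self.messages.remove(msg)

q = ToyQueue(max_receive_count=3)
q.send("order-123")

first = q.receive()
# The consumer "crashes" before calling delete(), so the message becomes
# visible again (in real SQS, after the visibility timeout expires).
second = q.receive()
third = q.receive()
exhausted = q.receive()   # fourth receive exceeds max_receive_count
```

A consumer that processes successfully would call `q.delete(first)`; skipping that is exactly the failure mode the DLQ exists to catch.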
Cons of Amazon SQS
- Maximum message size is 256 KB. Larger payloads need the S3 pointer pattern.
- Standard queues deliver at-least-once, not exactly-once
- FIFO queues cap at 300 messages/sec per API action (3,000 with batching); high-throughput mode raises the ceiling further, but standard queues remain the scale-out default
- No message replay. Once you delete a message, it is gone.
- Vendor lock-in to the AWS ecosystem
When to Use Amazon SQS
- You are on AWS and need a simple, reliable message queue
- You want zero operational overhead for messaging infra
- You are decoupling services that do not need streaming semantics
- Serverless architectures with Lambda-based consumers
When Not to Use Amazon SQS
- You need event streaming with replay and consumer groups (look at Kafka or Pulsar)
- Multi-cloud or on-premise deployments where portability matters
- High-throughput ordered streaming, because FIFO throughput limits will bite you
- Complex routing patterns (use RabbitMQ or an SNS+SQS combo instead)
Alternatives to Amazon SQS
RabbitMQ, Kafka, Apache Pulsar
Grafana Beyla — eBPF-based auto-instrumentation for zero-code observability
Category: Observability
## Why It Exists
Use Cases for Grafana Beyla
- HTTP/gRPC auto-instrumentation without touching application code
- Baseline RED metrics (Rate, Errors, Duration) for every service from day one
- Service mesh alternative for network-level telemetry collection
- Brownfield observability rollout across polyglot environments
- Bootstrapping trace context propagation before SDK adoption
Pros of Grafana Beyla
- Zero code changes required. Attach to a running process and get RED metrics and basic trace spans immediately
- Kernel-level visibility via eBPF uprobes and kprobes. Sees HTTP/gRPC/SQL calls that application-level instrumentation might miss
- Low overhead: ~200 MB RAM per node. Runs as a DaemonSet alongside OTel Collectors without competing for resources
- Complements OTel SDK instrumentation. Beyla provides the baseline, SDK adds depth. The two-tier model means every service has observability even before teams write instrumentation code
- Native OTLP export. Feeds directly into the OTel Collector pipeline so metrics and traces flow through the same processing, routing, and storage path as SDK-generated telemetry
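The RED signals Beyla derives in-kernel are simple to compute once you have request records. A minimal sketch of Rate, Errors, and Duration from invented sample data (Beyla does this from observed HTTP/gRPC traffic, not from an in-process list):

```python
# RED metrics from raw request records: (status_code, duration_ms).
requests = [
    (200, 12), (200, 8), (500, 40), (200, 15), (404, 5),
]
window_seconds = 10

rate = len(requests) / window_seconds                     # requests/sec
errors = sum(1 for s, _ in requests if s >= 500) / len(requests)
durations = sorted(d for _, d in requests)
p50 = durations[len(durations) // 2]                      # median duration, ms
```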
Cons of Grafana Beyla
- Linux-only. Requires kernel 5.8+ with BTF (BPF Type Format) support. No Windows, no macOS, no older kernels
- Limited to network-level signals. Cannot capture custom business metrics, application-specific span attributes, or baggage propagation. For those you still need the OTel SDK
- eBPF verifier constraints limit program complexity. Some edge cases in protocol parsing (non-standard HTTP framing, custom binary protocols) may not be detected
- No custom metric dimensions. The labels Beyla produces are fixed (service name, HTTP method, status code, URL path). You cannot add business context like customer_id or feature_flag
- Kernel upgrades can break eBPF programs. BTF relocations handle most cases, but major kernel version jumps require testing
When to Use Grafana Beyla
- Bootstrapping observability for existing services that have zero instrumentation today
- Providing baseline RED metrics and trace spans before teams adopt the OTel SDK
- Polyglot environments where maintaining SDK instrumentation across 5+ languages is impractical
- Validating that services are correctly instrumented by comparing eBPF-captured metrics against SDK-reported metrics
When Not to Use Grafana Beyla
- Custom business metrics like orders_placed or payment_amount. Use the OTel SDK with custom metric instruments
- Windows or macOS hosts. eBPF is a Linux kernel feature with no equivalent on other operating systems
- Kernels older than 5.8. Without BTF support, Beyla cannot attach its eBPF programs
- Deep trace instrumentation with custom span attributes, baggage propagation, or manual context injection. Use OTel SDK
Alternatives to Grafana Beyla
OpenTelemetry, OTel Collector, Prometheus, Grafana, Grafana Tempo, VictoriaMetrics
Cassandra — Distributed wide-column store built for punishing write loads
Category: Databases
## How It Works Internally
Use Cases for Cassandra
- Time-series data at scale
- IoT sensor data ingestion
- Messaging and chat history
- User activity tracking
- Product catalogs
- Write-heavy workloads
Pros of Cassandra
- Linear horizontal scalability
- No single point of failure (peer-to-peer)
- Tunable consistency levels per query
- Optimized for high write throughput
- Multi-datacenter replication built-in
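The linear scalability above comes from placing rows on a token ring. A miniature sketch: hash the partition key to a token, then walk clockwise for the replicas. Real Cassandra uses Murmur3 tokens and vnodes; the three fixed tokens and node names here are invented.

```python
import bisect
import hashlib

# Three nodes, each owning a slice of the token space.
RING = [(0, "node-a"), (2**32 // 3, "node-b"), (2 * 2**32 // 3, "node-c")]
_TOKENS = [t for t, _ in RING]

def token(partition_key):
    return int(hashlib.md5(partition_key.encode()).hexdigest(), 16) % 2**32

def replicas(partition_key, rf=2):
    # Owner = first node whose token is >= the key's token (wrapping),
    # then the next rf-1 nodes clockwise around the ring.
    i = bisect.bisect_left(_TOKENS, token(partition_key)) % len(RING)
    return [RING[(i + k) % len(RING)][1] for k in range(rf)]

owners = replicas("user:42", rf=2)
```

Because placement is pure hashing, any coordinator can compute where a row lives without a central directory — which is also why adding a node only moves a bounded slice of the data.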
Cons of Cassandra
- Limited query flexibility (no joins, only limited single-partition aggregations)
- Data modeling driven by query patterns
- Eventual consistency by default
- Operational complexity (compaction, tombstones, repairs)
- Read performance depends heavily on data model
When to Use Cassandra
- Need to handle millions of writes per second
- Data is naturally partitioned (time-series, per-user)
- Require multi-region active-active deployment
- Availability matters more than strong consistency
When Not to Use Cassandra
- Need complex queries with joins
- Dataset is small (< 10 GB)
- Require strong consistency for every read
- Ad-hoc querying is a primary use case
Alternatives to Cassandra
DynamoDB, MongoDB, PostgreSQL
Ceph — The storage system that runs CERN's physics data and DigitalOcean's block storage
Category: Storage
## Why It Exists
Use Cases for Ceph
- Exabyte-scale object storage via RADOS Gateway (S3/Swift compatible)
- Block storage for VMs and containers (RBD)
- Shared filesystem for HPC and research clusters (CephFS)
- Private cloud storage backend (OpenStack Cinder/Manila)
- Data lake storage for analytics at petabyte+ scale
- Unified storage platform replacing separate block/file/object systems
Pros of Ceph
- Proven at exabyte scale. CERN stores hundreds of petabytes of physics data on it.
- CRUSH algorithm places data across failure domains (racks, AZs) without a central lookup
- Three storage interfaces from one cluster: object (RGW), block (RBD), file (CephFS)
- Configurable erasure coding profiles (RS, LRC, SHEC) per storage pool
- Self-healing. Detects failed OSDs and automatically rebalances and reconstructs data.
- No single point of failure. Monitors, OSDs, and metadata servers are all distributed.
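The CRUSH idea — placement is computed, not looked up in a central table — can be imitated with rendezvous (highest-random-weight) hashing. This is a stand-in, not the real CRUSH algorithm (which walks a weighted hierarchy map); the cluster layout and names are invented.

```python
import hashlib

# Toy cluster: racks are the failure domain, OSDs live inside racks.
CLUSTER = {
    "rack-a": ["osd-1", "osd-2"],
    "rack-b": ["osd-3", "osd-4"],
    "rack-c": ["osd-5", "osd-6"],
}

def _score(obj, item):
    # Deterministic per (object, item) pair; comparisons pick the "winner".
    return hashlib.sha256(f"{obj}/{item}".encode()).hexdigest()

def place(obj, replicas=3):
    # Pick `replicas` distinct racks by hash score, then one OSD per rack,
    # so no two copies of the object share a rack.
    racks = sorted(CLUSTER, key=lambda r: _score(obj, r), reverse=True)[:replicas]
    return [max(CLUSTER[r], key=lambda o: _score(obj, o)) for r in racks]

osds = place("object-42")
```

Any client with the cluster map computes the same placement, which is how Ceph avoids a central metadata lookup on the data path.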
Cons of Ceph
- Operational complexity is high. Requires dedicated ops team who understand PG states, CRUSH maps, OSD tuning, and recovery dynamics.
- Performance tuning is hard. Dozens of settings interact (PG count, bluestore cache, recovery limits, scrub schedules). Wrong defaults cause silent degradation.
- Recovery storms can saturate the network. When an OSD dies, all PGs on that OSD start rebuilding simultaneously unless you rate-limit.
- BlueStore (the storage backend) has known edge cases with fragmentation on long-running clusters.
- Upgrading across major versions requires careful planning. Rolling upgrades are supported but risky if you skip versions.
- Not designed for small deployments. Minimum viable cluster is 3 nodes, but you really want 5+ for production.
When to Use Ceph
- Need petabyte to exabyte scale on your own hardware
- Require rack-aware and AZ-aware data placement
- Want object, block, and file storage from one platform
- Have an ops team with distributed systems experience
- Building private or hybrid cloud infrastructure
When Not to Use Ceph
- Small team without Ceph operational experience
- Storage under 100 TB (MinIO is simpler)
- Need a managed service with zero ops
- Performance-sensitive workloads where tuning time is limited
- Greenfield project where S3 is an option and data sovereignty isn't a concern
Alternatives to Ceph
MinIO, SeaweedFS, RocksDB
ClickHouse — The columnar OLAP database that actually delivers on sub-second analytics
Category: Search & Analytics
## How It Works Internally
Use Cases for ClickHouse
- Real-time analytics dashboards
- Ad tech and clickstream analysis
- Time-series data at scale
- Business intelligence queries
- Log analytics (a serious alternative to Elasticsearch)
- A/B test result analysis
Pros of ClickHouse
- Analytical queries are absurdly fast thanks to columnar storage
- Excellent compression ratios, often 10-20x
- Handles billions of rows with sub-second response times
- SQL-compatible query interface, so the learning curve is gentle
- Materialized views and real-time aggregation work out of the box
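Those 10-20x compression ratios come from columnar layout: a sorted, low-cardinality column is mostly runs of repeated values. Run-length encoding below is a toy stand-in for ClickHouse's real codecs (LZ4, ZSTD, Delta), but it shows why sorting by the right key matters so much.

```python
# RLE on a sorted low-cardinality column, as an ORDER BY key would produce.
def rle_encode(column):
    runs = []
    for v in column:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

# One million events but only three distinct country codes, stored sorted.
column = ["DE"] * 400_000 + ["FR"] * 350_000 + ["US"] * 250_000
runs = rle_encode(column)
ratio = len(column) / len(runs)   # values stored vs runs stored
```

The same column shuffled into row order would compress far worse — which is the intuition behind "think carefully about schema design or performance tanks."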
Cons of ClickHouse
- Not built for OLTP or point lookups. Do not try.
- No full ACID transactions
- Updates and deletes are expensive (async mutations under the hood)
- You have to think carefully about schema design or performance tanks
- Smaller ecosystem than PostgreSQL or Elasticsearch
When to Use ClickHouse
- You need sub-second queries over billions of rows
- Analytical/OLAP workloads heavy on aggregations
- Real-time dashboards and reporting systems
- You want a cost-effective alternative to managed analytics services
When Not to Use ClickHouse
- OLTP workloads with frequent updates/deletes
- You need full transaction support
- Small datasets that fit comfortably in PostgreSQL
- Full-text search (use Elasticsearch instead)
Alternatives to ClickHouse
Elasticsearch, PostgreSQL, Spark
CockroachDB — Distributed SQL that actually survives failures
Category: Databases
CockroachDB is the tool to reach for when the requirements include the relational model, real ACID transactions, and the ability to scale horizontally across regions. It draws heavily from Google's Spanner paper, but it can actually run without owning a fleet of atomic clocks. The core problem it solves is real: scaling a relational database horizontally without giving up consistency. Anyone who has operated a sharded Postgres cluster with application-level routing knows exactly why this matters.
Use Cases for CockroachDB
- Multi-region apps that need real transactions
- Financial systems where consistency is non-negotiable
- Global SaaS platforms serving users across continents
- Replacing painfully sharded PostgreSQL setups
- Disaster recovery without someone getting paged at 3am
Pros of CockroachDB
- Distributed ACID transactions that actually work
- Automatic horizontal scaling and rebalancing
- PostgreSQL-compatible wire protocol
- Multi-region with locality-aware reads
- Survives node, rack, and datacenter failures
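The "consensus overhead" trade-off in the cons below has a simple shape: a write commits only when a majority of replicas acknowledge it, so commit latency is set by the majority-th fastest replica, not the fastest one. A toy model with invented numbers (not CockroachDB's actual Raft implementation):

```python
# A write commits once a majority (quorum) of replicas acknowledge it.
def commit(replica_acks):
    needed = len(replica_acks) // 2 + 1
    return sum(replica_acks) >= needed

def commit_latency(replica_latencies_ms):
    # The write is as fast as the quorum-th fastest ack — for 3 replicas,
    # the 2nd fastest — which is the consensus tax on every write.
    acked = sorted(replica_latencies_ms)
    return acked[len(acked) // 2]

survives_one_failure = commit([True, True, False])    # 2 of 3: commits
survives_two_failures = commit([True, False, False])  # 1 of 3: blocks
latency = commit_latency([2, 9, 40])                  # ms per replica ack
```

This is also why the cluster survives a node failure without losing writes: a quorum of 2 out of 3 still exists.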
Cons of CockroachDB
- Write latency is higher because of consensus overhead
- Not fully PostgreSQL-compatible (missing extensions will surprise you)
- Steep learning curve if you have never operated distributed SQL
- Overkill and expensive for small datasets that fit on one Postgres box
When to Use CockroachDB
- You need horizontal scaling with SQL and ACID, not just one of them
- Multi-region deployment with strong consistency
- You want automatic failover without manual runbooks
- Your single PostgreSQL instance is starting to sweat
When Not to Use CockroachDB
- Single-region, small data volume. Just use Postgres.
- Ultra-low latency requirements where single-node PG is measurably faster
- You depend on PostgreSQL extensions like PostGIS or pg_cron
- Budget-constrained projects where the infra cost is hard to justify
Alternatives to CockroachDB
PostgreSQL, Cassandra, DynamoDB
DynamoDB — AWS managed NoSQL database with single-digit millisecond reads at any scale
Category: Databases
For teams building on AWS with well-defined access patterns, DynamoDB is probably the right database. Full stop. It is the one NoSQL service where operations are truly handed off to AWS with no need to think about nodes, disk space, and failover. The catch is that convenience comes at a price, both in dollars and in flexibility. The data model must be designed around queries upfront, and changing access patterns later is painful.
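"Design around queries upfront" is concrete in the single-table pattern: items live under a partition key (PK) and sort key (SK), and a Query reads one partition, optionally narrowed by an SK prefix. A toy sketch — the key scheme and item shapes are invented, and this is a plain dict, not the DynamoDB API:

```python
# PK -> list of (SK, item), kept sorted by SK like a DynamoDB partition.
table = {}

def put_item(pk, sk, item):
    rows = table.setdefault(pk, [])
    rows.append((sk, item))
    rows.sort(key=lambda r: r[0])

def query(pk, sk_prefix=""):
    # Cheap in DynamoDB: one partition, contiguous sort-key range.
    return [item for sk, item in table.get(pk, []) if sk.startswith(sk_prefix)]

put_item("USER#42", "PROFILE", {"name": "Ada"})
put_item("USER#42", "ORDER#2024-01-15", {"total": 30})
put_item("USER#42", "ORDER#2024-02-03", {"total": 55})

orders = query("USER#42", sk_prefix="ORDER#")
```

"All orders for user 42" is one efficient query. "All orders over $50 across all users" is not expressible this way — that access pattern needs a GSI designed in advance, which is the flexibility cost mentioned above.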
Use Cases for DynamoDB
- Serverless application backends
- Gaming leaderboards and player state
- Shopping carts and user preferences
- IoT data ingestion
- Session management
- Ad tech and real-time bidding
Pros of DynamoDB
- Fully managed, zero operational overhead
- Single-digit millisecond latency at any scale
- Automatic scaling with on-demand capacity
- Built-in DAX caching layer
- Global Tables for multi-region replication
Cons of DynamoDB
- Vendor lock-in to AWS
- Expensive at large scale compared to self-managed alternatives
- 25 GSI limit per table
- Item size limited to 400 KB
- Complex pricing model (RCU/WCU)
When to Use DynamoDB
- Building on AWS and want zero ops overhead
- Need predictable single-digit millisecond performance
- Access patterns are known and well-defined
- Serverless architectures with Lambda
When Not to Use DynamoDB
- Need complex relational queries or joins
- Want multi-cloud or vendor-neutral solution
- Data model is highly relational
- Need full-text search capabilities
Alternatives to DynamoDB
Cassandra, MongoDB, Redis
Elasticsearch — The search engine most teams reach for first, built on Lucene
Category: Search & Analytics
## How It Works Internally
Use Cases for Elasticsearch
- Full-text search across large datasets
- Log and event data analysis (ELK stack)
- E-commerce product search
- Application performance monitoring
- Security analytics (SIEM)
- Autocomplete and suggestions
Pros of Elasticsearch
- Near real-time full-text search
- Horizontally scalable with automatic sharding
- Rich query DSL with aggregations
- Schema-free JSON documents
- Powerful text analysis and tokenization
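Full-text search rests on an inverted index: a map from term to the documents containing it. A minimal sketch — real Lucene adds stemming, relevance scoring, term positions, and compressed postings lists:

```python
docs = {
    1: "the quick brown fox",
    2: "quick brown cats",
    3: "lazy dogs sleep",
}

# Build the inverted index: term -> set of doc ids.
index = {}
for doc_id, text in docs.items():
    for term in text.lower().split():
        index.setdefault(term, set()).add(doc_id)

def search(query):
    # AND semantics: intersect the postings list of each query term.
    postings = [index.get(t, set()) for t in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

hits = search("quick brown")
```

Lookup cost is driven by postings-list size, not corpus size — which is why this structure scales to "real-time search across millions of documents."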
Cons of Elasticsearch
- Not a primary data store (no ACID transactions)
- High memory consumption for indexing
- Split-brain risk without careful cluster config
- Complex capacity planning and tuning
- License changes (SSPL) may affect deployment
When to Use Elasticsearch
- Need full-text search with relevance scoring
- Log aggregation and analytics (ELK/EFK stack)
- Real-time search across millions of documents
- Complex aggregations and faceted search
When Not to Use Elasticsearch
- Primary data store for transactional workloads
- Simple key-value lookups
- Strong consistency requirements
- Limited infrastructure budget (resource-hungry)
Alternatives to Elasticsearch
ClickHouse, PostgreSQL, MongoDB
Elixir / BEAM — The runtime that handles millions of concurrent connections where Java and Go need hundreds of servers
Category: Messaging
## Why BEAM Exists
Use Cases for Elixir / BEAM
- Real-time chat and messaging platforms (Discord, WhatsApp-scale WebSocket handling)
- Notification delivery layer holding millions of concurrent push connections
- IoT device communication (millions of persistent connections from sensors and devices)
- Telecom infrastructure (call routing, signaling, session management)
- Live collaboration features (presence indicators, cursor tracking, typing indicators)
- API gateways and connection proxies for high-concurrency workloads
Pros of Elixir / BEAM
- Millions of lightweight processes per machine. Each process is ~2 KB vs ~1 MB for an OS thread. One server handles what takes hundreds of servers in Java/Go
- Built-in distributed process registry across nodes. No Redis or external coordination needed to track which user is on which server
- Fault tolerance via supervision trees. A crashed process is restarted automatically without affecting other processes. WhatsApp achieved 99.999% uptime with this
- Hot code upgrades. Deploy new code without dropping connections. Telecom systems ran for years without restart
- Battle-tested runtime. Erlang/BEAM has powered telecom switches since 1986. Elixir (2012) adds modern syntax and tooling on the same VM
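The "let it crash" restart policy can be sketched in any language; what cannot be sketched outside the BEAM is the cheap process isolation that makes it practical at scale. This Python toy shows only the supervisor's restart logic (the function names and restart limit are invented, and OTP supervisors offer several strategies beyond this one-for-one shape):

```python
# A supervisor runs a worker and restarts it on crashes, escalating to
# its own parent once the restart budget is exhausted.
def supervise(worker, max_restarts=3):
    restarts = 0
    while True:
        try:
            return worker()
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # escalate: let the parent supervisor decide
            # otherwise fall through and restart the worker fresh

attempts = {"n": 0}

def flaky_worker():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient crash")
    return "ok"

result = supervise(flaky_worker)
```

The point of the pattern: transient failures are handled by restarting from known-good state instead of defensive try/catch at every call site.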
Cons of Elixir / BEAM
- Smaller ecosystem than Java/Go/Node. Fewer libraries, fewer framework choices, fewer Stack Overflow answers
- Limited hiring pool. Finding experienced Elixir developers is harder than Java or Go developers
- Not suited for CPU-heavy work. Number crunching, ML inference, image processing are all better in Go/Rust/C++. BEAM is optimized for I/O concurrency, not compute
- Learning curve for OTP patterns. Supervisors, GenServers, and the 'let it crash' philosophy require a mental model shift from try/catch thinking
- Distributed Erlang has a fully connected mesh topology. Past ~50-100 nodes, the mesh becomes expensive. Large deployments need clustering libraries like libcluster or partisan
When to Use Elixir / BEAM
- The system needs to hold millions of concurrent WebSocket or TCP connections with minimal infrastructure
- Real-time messaging or notification delivery where connection routing is the bottleneck
- Fault tolerance matters and process-level isolation is preferred over container restarts
- The project is a connection/delivery layer while business logic stays in Java/Go/Python
When Not to Use Elixir / BEAM
- CPU-bound workloads like ML inference, video encoding, or heavy computation
- The team has zero Erlang/Elixir experience and the project timeline is tight
- Simple request-response APIs where Go or Java already meet the latency and throughput needs
- A massive library ecosystem is needed for third-party integrations (payment SDKs, cloud provider clients)
Alternatives to Elixir / BEAM
Redis, Kafka
Envoy Proxy — The L7 proxy that actually solved the 'every service does networking differently' problem
Category: API Infrastructure
## Why It Exists
Use Cases for Envoy Proxy
- Service mesh data plane (sidecar proxy sitting next to each service)
- API gateway and edge proxy for external traffic
- Load balancing with algorithms like least-request, ring-hash, and Maglev
- Automatic L7 metrics, distributed tracing, and access logging without touching app code
- Traffic management: retries, circuit breaking, rate limiting, fault injection
- TLS termination and mutual TLS (mTLS) for zero-trust networking
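Least-request balancing, in its common two-choice form, is simple to picture: sample two hosts at random and send the request to the one with fewer outstanding requests. A sketch with invented host names and static counts (a real balancer updates the counts as requests start and finish):

```python
import random

# Power-of-two-choices: compare two random hosts, pick the less loaded.
def pick_host(active_requests, rng):
    a, b = rng.sample(list(active_requests), 2)
    return a if active_requests[a] <= active_requests[b] else b

rng = random.Random(7)
active = {"pod-a": 12, "pod-b": 3, "pod-c": 90}

picks = [pick_host(active, rng) for _ in range(1000)]
# The overloaded pod-c loses every comparison it appears in.
overloaded_share = picks.count("pod-c") / len(picks)
```

Two random samples are enough to steer traffic sharply away from hot hosts without the coordination cost of ranking the whole fleet on every request.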
Pros of Envoy Proxy
- Understands L7 protocols (HTTP/2, gRPC, WebSocket, MongoDB, Redis), so it can make smart routing decisions
- Dynamic configuration via xDS APIs. No restarts, no reloads. Config just shows up.
- Ships with Prometheus metrics, distributed tracing, and structured access logs out of the box
- Battle-tested at serious scale (Lyft, Google, Stripe, Airbnb)
- CNCF graduated project with a stable API and an active community
Cons of Envoy Proxy
- Each sidecar eats 50-100 MB of memory. That adds up fast.
- The xDS API has a steep learning curve. Expect a few weeks before your team is comfortable.
- Sidecar proxying adds 0.5-2ms of tail latency per hop
- Debugging proxy issues means you need to understand L7 protocol internals
- Filter chain ordering is easy to mess up, and misconfigurations cause subtle routing bugs
When to Use Envoy Proxy
- Microservices that need consistent load balancing and observability across languages
- Service mesh deployments (Istio, Consul Connect, or your own custom setup)
- You need canary deployments, traffic shifting, or fault injection
- Zero-trust networking with mutual TLS between all services
When Not to Use Envoy Proxy
- A monolith with no inter-service communication. Envoy has nothing to do here.
- Environments where 50-100 MB of memory overhead per pod is a dealbreaker
- Teams without the bandwidth to learn and operate L7 proxy infrastructure
- Pure L4 load balancing needs. Just use IPVS or a simpler L4 proxy.
Alternatives to Envoy Proxy
gRPC, Nginx, Kong
etcd — The consensus store that Kubernetes literally cannot run without
Category: Coordination
## Why It Exists
Use Cases for etcd
- Kubernetes cluster state storage
- Service discovery
- Distributed configuration
- Leader election
- Feature flag management
- Distributed locking
Pros of etcd
- Strong consistency via Raft consensus
- Simple HTTP/gRPC API
- Watch API for real-time change notifications
- Lease-based TTL for ephemeral keys
- Foundation of Kubernetes control plane
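Lease-based TTL is the mechanism behind service discovery and health checking: a key attached to a lease vanishes when the lease expires unless the client keeps sending keep-alives. A toy sketch with a fake clock — the class and method names are invented, not etcd's API:

```python
class LeaseStore:
    def __init__(self):
        self.now = 0
        self.data = {}   # key -> (value, expires_at or None)

    def put(self, key, value, lease_ttl=None):
        expires = self.now + lease_ttl if lease_ttl else None
        self.data[key] = (value, expires)

    def keep_alive(self, key, lease_ttl):
        # Heartbeat: push the expiry forward without rewriting the value.
        value, _ = self.data[key]
        self.data[key] = (value, self.now + lease_ttl)

    def tick(self, seconds):
        self.now += seconds
        self.data = {k: v for k, v in self.data.items()
                     if v[1] is None or v[1] > self.now}

    def get(self, key):
        entry = self.data.get(key)
        return entry[0] if entry else None

store = LeaseStore()
store.put("/services/api/instance-1", "10.0.0.5:8080", lease_ttl=10)
store.tick(8)
store.keep_alive("/services/api/instance-1", lease_ttl=10)  # heartbeat
store.tick(8)
alive = store.get("/services/api/instance-1")   # renewed, still registered
store.tick(8)
gone = store.get("/services/api/instance-1")    # no heartbeat: expired
```

A service that crashes simply stops heartbeating and drops out of discovery automatically — no explicit deregistration path needed.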
Cons of etcd
- Not designed for large data volumes (recommended < 8 GB)
- Write latency depends on cluster size and network
- All data must fit in memory
- Limited query capabilities (prefix-based only)
- Compaction needed to prevent unbounded growth
When to Use etcd
- Kubernetes or cloud-native infrastructure
- Need strongly consistent configuration store
- Service discovery with health checking
- Distributed coordination in Go-based systems
When Not to Use etcd
- General-purpose data storage
- Large datasets (> 8 GB)
- High write throughput requirements
- Complex query patterns
Alternatives to etcd
ZooKeeper, Kafka
Flink — The streaming engine that treats batch as a special case, not the other way around
Category: Stream Processing
## Why It Exists
Use Cases for Flink
- Real-time fraud detection
- Event-driven applications
- Real-time ETL pipelines
- Complex event processing (CEP)
- Real-time machine learning inference
- Continuous monitoring and alerting
Pros of Flink
- True event-time processing with watermarks
- Exactly-once state consistency
- Low-latency stream processing
- Sophisticated windowing (tumbling, sliding, session)
- Unified batch and stream processing
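Event-time windows and watermarks — the first two pros — fit in a few lines once reduced to their essentials. Events carry their own timestamps and may arrive out of order; a window fires only when the watermark (max seen event time minus an allowed lateness) passes its end. The window size, lateness, and events below are invented:

```python
WINDOW = 10       # tumbling window size, seconds
LATENESS = 5      # watermark lags the max seen event time by this much

def run(events):
    windows = {}  # window start -> buffered values
    fired = {}    # window start -> emitted aggregate
    max_ts = 0
    for ts, value in events:              # (event_time, payload)
        start = ts // WINDOW * WINDOW     # assign by event time
        windows.setdefault(start, []).append(value)
        max_ts = max(max_ts, ts)
        watermark = max_ts - LATENESS
        # Fire every window whose end the watermark has passed.
        for s in [s for s in windows if s + WINDOW <= watermark]:
            fired[s] = sum(windows.pop(s))
    return fired

# The event at t=7 arrives after t=12 but still lands in window [0, 10),
# because assignment uses event time, not arrival order.
result = run([(1, 10), (12, 1), (7, 5), (16, 2), (23, 4), (31, 0)])
```

The lateness knob is the core trade-off: a larger value tolerates more disorder but delays every result; real Flink adds allowed-lateness side outputs and per-partition watermarks on top of this idea.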
Cons of Flink
- Steep learning curve
- Complex cluster management
- Checkpointing can impact latency under load
- Smaller community than Spark
- Resource-intensive for stateful operations
When to Use Flink
- Need true real-time processing (low latency)
- Complex event patterns with event-time semantics
- Exactly-once processing guarantees required
- Stateful stream processing (aggregations, joins)
When Not to Use Flink
- Simple batch processing jobs
- Small data volumes that don't justify the complexity
- Team lacks streaming expertise
- Ad-hoc analytical queries (use Spark or ClickHouse)
Alternatives to Flink
Spark, Kafka Streams, Kafka
FoundationDB — The ordered, transactional key-value store that Apple trusts with iCloud
Category: Databases
## Why It Exists
Use Cases for FoundationDB
- Metadata store for object storage systems
- Strongly consistent ordered key-value layer
- Multi-model database foundation (document, graph, relational layers on top)
- Distributed ACID transactions across shards
- Cloud infrastructure control plane storage
- Record layer for structured data at scale
Pros of FoundationDB
- Serializable ACID transactions across the entire keyspace
- Ordered keys with efficient range scans
- Extremely high write throughput (millions of writes/sec at scale)
- Simulation testing framework catches bugs before production
- Multi-tenant isolation at the key prefix level
- Automatic sharding and rebalancing
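"Ordered keys with efficient range scans" is the property layers build on: keys sharing a prefix are contiguous in sorted order, so a prefix read is two binary searches plus a sequential scan. A sketch using sorted Python lists — the key scheme is invented, and real FoundationDB keys are byte strings inside transactions:

```python
import bisect

keys, values = [], []   # kept sorted by key

def set_kv(key, value):
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        values[i] = value
    else:
        keys.insert(i, key)
        values.insert(i, value)

def get_range(prefix):
    # All keys under a prefix form one contiguous slice of the keyspace.
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix + "\xff")
    return list(zip(keys[lo:hi], values[lo:hi]))

set_kv("user/42/name", "Ada")
set_kv("user/42/email", "ada@example.com")
set_kv("user/99/name", "Grace")

user_42 = get_range("user/42/")
```

This is how document, relational, and queue layers get modeled on a plain ordered key-value store: encode structure into key prefixes and read it back with range scans.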
Cons of FoundationDB
- 5-second transaction time limit, so long-running transactions must be broken up
- Value size limit of 100 KB, so large blobs must be chunked
- Key size limit of 10 KB
- Operational complexity at scale (coordinators, storage servers, proxies)
- Smaller community than PostgreSQL or Cassandra
- Limited built-in query language (requires layers or client logic)
When to Use FoundationDB
- Need ordered key-value store with ACID transactions
- Building infrastructure that requires strong consistency at scale (metadata stores, control planes)
- Want to build custom data models on a reliable foundation
- Need cross-shard transactions without 2PC complexity
When Not to Use FoundationDB
- Simple CRUD applications (PostgreSQL is easier)
- Analytics workloads (use ClickHouse or a columnar store)
- Need values larger than 100 KB without chunking
- Small team without distributed systems experience
- Need a full SQL query engine out of the box
Alternatives to FoundationDB
etcd, CockroachDB, RocksDB, TiKV
Google S2 Geometry — The spherical geometry library behind Google Maps, Spanner, and every serious geospatial system
Category: Geospatial
## Why This Matters
Use Cases for Google S2 Geometry
- Spatial indexing for location-based queries (find all restaurants within 2km)
- Geo-fencing and point-in-region containment checks
- Ride-sharing supply/demand matching by geographic cells
- Region covering for spatial database queries
- Proximity search at planetary scale
- Map tile generation and spatial partitioning
Pros of Google S2 Geometry
- Mathematically rigorous. Works on a sphere, not a flat plane, so distance calculations are accurate everywhere on Earth
- Hierarchical cell system gives you 31 levels of precision (0 through 30), from continent-sized to sub-centimeter
- Cell IDs are 64-bit integers, which means you can index them in any database with a B-tree
- Battle-tested at Google scale across Maps, Spanner, and dozens of internal services
- Hilbert curve ordering means spatially close points get numerically close cell IDs, perfect for range scans
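The "index spatial data with a B-tree" trick can be illustrated without S2 itself. Real S2 projects onto a cube and orders cells along a Hilbert curve; the simpler Morton (Z-order) code below shows the same underlying move — interleave the bits of a 2-D coordinate into one integer so that nearby points usually get nearby IDs, which any ordered index can range-scan:

```python
# Morton (Z-order) code: alternate the bits of x and y into one integer.
# This is NOT S2's algorithm, just the locality idea in its simplest form.
def interleave(x, y, bits=16):
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x bits at even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y bits at odd positions
    return code

a = interleave(10, 20)        # two horizontally adjacent grid cells...
b = interleave(11, 20)        # ...get consecutive codes here
far = interleave(1000, 2000)  # a distant point gets a distant code
```

S2 uses a Hilbert curve precisely because its locality is stronger than Z-order's (no large jumps between some adjacent cells), but the database-side payoff is identical: spatial proximity queries become integer range scans.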
Cons of Google S2 Geometry
- Steep learning curve. You need to understand spherical geometry and the cell hierarchy to use it well
- The C++ library is the reference. Ports to Go, Java, and Python vary in completeness
- No built-in persistence or query engine. It is a library, not a database
- Debugging spatial issues is hard without visualization tools
- Covering algorithms require tuning (max cells, min/max level) for your specific use case
When to Use Google S2 Geometry
- You are building a location-based service that needs to scale beyond PostGIS
- Spatial queries need to be converted to simple range scans on integer keys
- Your system runs on Spanner or another database without native spatial indexing
- You need consistent precision across the entire globe, not just near the equator
When Not to Use Google S2 Geometry
- Simple distance calculations where the haversine formula is good enough
- You already have PostGIS and your query patterns are well-served by its spatial functions
- Your team does not have the time to learn spherical geometry concepts
- You need a turnkey geospatial database, not a library to build on top of
Alternatives to Google S2 Geometry
PostgreSQL, Spanner, DynamoDB
Grafana — Open-source dashboarding and alerting that sits on top of whatever backends you already run
Category: Observability
## Why It Exists
Use Cases for Grafana
- Infrastructure and application monitoring dashboards
- Single-pane-of-glass observability across metrics, logs, and traces
- Business KPI dashboards with real-time data
- Incident response with correlated views from multiple sources
- SLO tracking and reporting for engineering teams
- IoT and sensor data visualization
Pros of Grafana
- 60+ data source plugins (Prometheus, Loki, Elasticsearch, PostgreSQL, CloudWatch, and more)
- Rich visualization library with 15+ panel types including graphs, heatmaps, and geo maps
- Dashboard-as-code with JSON models and Terraform provider
- Unified alerting across all data sources with a single rule engine
- Free and open-source core with enterprise features available in Grafana Cloud
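Dashboard-as-code follows from the fact that a Grafana dashboard is just a JSON document, so it can be generated, diffed, and version-controlled instead of hand-edited. A trimmed sketch carrying only a few of the real model's fields; the service names and metric label scheme are invented:

```python
import json

def make_dashboard(service):
    # Minimal dashboard model: title plus one timeseries panel whose
    # target is a PromQL query parameterized by service.
    return {
        "title": f"{service} overview",
        "panels": [
            {
                "type": "timeseries",
                "title": "Request rate",
                "targets": [
                    {"expr": f'sum(rate(http_requests_total{{service="{service}"}}[5m]))'}
                ],
            }
        ],
    }

# One generated dashboard per service, ready to commit or apply via API.
dashboards = [json.dumps(make_dashboard(s)) for s in ("checkout", "payments")]
```

Generating the JSON is also the practical answer to dashboard sprawl: templated dashboards stay consistent, and the diff in review shows exactly what changed.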
Cons of Grafana
- Dashboard sprawl is real. Organizations end up with hundreds of unmaintained dashboards nobody owns
- Complex dashboard JSON models are painful to manage without Terraform or Jsonnet
- Performance tanks when you pack too many panels or high-cardinality queries into one dashboard
- Steep learning curve for advanced PromQL/LogQL queries inside panels
- Plugin quality is inconsistent. Some community plugins are abandoned or buggy
When to Use Grafana
- You need a single visualization layer across multiple data sources
- You are building monitoring dashboards for Prometheus, Loki, or other time series data
- You want dashboard-as-code for version-controlled, reproducible observability
- Different teams use different backends and you need cross-team visibility
When Not to Use Grafana
- Data collection and storage. Grafana only visualizes, it does not store metrics. Use Prometheus for that
- Business intelligence with complex data transformations (use Looker, Tableau, or Metabase)
- Simple status pages (use Betteruptime, Statuspage, or a similar SaaS)
- Real-time streaming dashboards with sub-second updates (build a custom WebSocket solution instead)
Alternatives to Grafana
Prometheus, Elasticsearch, Kafka
gRPC — Google's RPC framework that actually delivers on the 'high performance' promise, built on HTTP/2 and Protocol Buffers
Category: API Infrastructure
## Why It Exists
Use Cases for gRPC
- Low-latency service-to-service calls in microservices
- Real-time streaming between services, including bidirectional
- Mobile client-to-server communication where bandwidth is tight
- Polyglot environments where your services are written in different languages
- Internal APIs where JSON serialization has become a measurable bottleneck
- IoT device communication over constrained networks
Pros of gRPC
- 10-100x faster serialization than JSON thanks to Protocol Buffers binary format
- HTTP/2 multiplexing kills head-of-line blocking and opens the door to streaming
- Strongly typed contracts with code generation for 12+ languages
- Four communication patterns: unary, server streaming, client streaming, bidirectional
- Built-in deadline propagation, cancellation, and metadata passing
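A taste of why Protocol Buffers payloads are compact: integers go on the wire as base-128 varints, one byte per 7 bits of value, with the high bit marking continuation. This encoder follows the varint scheme described in the protobuf encoding docs:

```python
def encode_varint(n):
    """Encode a non-negative integer as a protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

small = encode_varint(1)     # one byte on the wire
medium = encode_varint(300)  # two bytes, vs four ASCII characters in JSON
```

Combine varints with numeric field tags instead of repeated string keys, and the 10-100x serialization gap over JSON stops being surprising.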
Cons of gRPC
- Binary format is not human-readable. You need tooling just to inspect payloads.
- Browser support requires a gRPC-Web proxy since browsers don't expose raw HTTP/2
- Load balancing gets tricky. HTTP/2 connection reuse means L4 balancers won't cut it.
- Proto schema evolution demands real discipline around backward compatibility
- Smaller ecosystem of tooling and middleware compared to REST/JSON
When to Use gRPC
- Internal service-to-service calls where latency is on the critical path
- You need streaming: real-time updates, log tailing, event feeds
- Polyglot microservices that benefit from generated, typed client/server code
- High-throughput APIs where JSON serialization shows up in your flame graphs
When Not to Use gRPC
- Public APIs consumed by web browsers (just use REST or GraphQL)
- Simple CRUD services where shipping fast matters more than shaving milliseconds
- Teams that have never touched Protocol Buffers or schema management
- Environments where HTTP/2 is blocked or unsupported by network intermediaries
Alternatives to gRPC
Envoy Proxy, Kafka, Kong
H3 — Uber's hexagonal spatial index that makes geospatial aggregation and visualization actually intuitive
Category: Geospatial
## Why Uber Built This
Use Cases for H3
- Ride-sharing and delivery supply/demand heat maps
- Geospatial aggregation and analytics (average price per neighborhood)
- Coverage analysis for service areas and delivery zones
- Movement flow analysis between geographic regions
- Market segmentation by location
- Network coverage planning for telecom
Pros of H3
- Hexagons tile a plane with uniform adjacency. Every hex has exactly 6 neighbors, same distance from center to center
- Great for visualization and aggregation. Hex grids look clean on maps and avoid the visual bias of square grids
- 16 resolution levels from continent-scale down to ~1 m² per hex
- Hierarchical structure with predictable parent-child relationships
- Bindings for Python, JavaScript, Java, Go, Rust, and more. Well-maintained.
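Uniform adjacency — the first pro — is easy to verify on a flat axial-coordinate hex grid. Real H3 cells live on an icosahedron with 64-bit indexes; this sketch only demonstrates the property squares lack (4 edge neighbors plus 4 diagonal ones at a different distance):

```python
# The six axial-coordinate offsets to a hexagon's neighbors.
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def neighbors(q, r):
    return [(q + dq, r + dr) for dq, dr in HEX_DIRECTIONS]

def center_distance(a, b):
    # Hex grid distance between cell centers, in axial coordinates.
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) / 2

ring = neighbors(0, 0)
distances = [center_distance((0, 0), n) for n in ring]
```

Every neighbor sits at distance exactly 1, which is why flow analysis and k-ring aggregations behave uniformly on hexes but need distance corrections on square grids.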
Cons of H3
- Not a perfect tiling on a sphere. Uses 12 pentagons (unavoidable from topology) which can cause edge cases
- Parent-child mapping is not exact. A parent hex does not perfectly contain its 7 children due to aperture-7 subdivision
- Hex IDs are 64-bit, but their locality in the integer space is weaker than S2's Hilbert-curve ordering for range scan queries
- Less suited for point-in-polygon or containment queries compared to S2
- Community is strong but smaller than PostGIS or S2 ecosystems
When to Use H3
- You need to aggregate data by geographic area for analytics or visualization
- Heat maps, demand forecasting, or coverage analysis are core features
- Your team wants a simpler mental model than S2's quadtree cells
- Neighbor traversal and adjacency queries matter (routing, flow analysis)
When Not to Use H3
- Point-in-polygon containment checks (S2 or PostGIS are better)
- You need range-scan-friendly IDs for database spatial indexing (S2 is stronger here)
- Exact spatial containment where parent must perfectly contain children
- Your workload is purely distance-based queries (find nearest K points)
Alternatives to H3
Google S2 Geometry, PostgreSQL, ClickHouse
Hocuspocus — The Yjs WebSocket server that handles sync, auth, and persistence so teams do not have to
Category: Collaboration
Yjs handles the CRDT math, but it says nothing about servers. If two users need to collaborate on a document, their Yjs instances have to exchange updates somehow. One option is to build a raw WebSocket server that relays binary messages between clients. But then you need authentication (who can access this document?), persistence (where do edits go when the server restarts?), reconnection logic (what happens when a client drops off for 30 seconds?), and multi-server coordination (what if there are 3 servers behind a load balancer?). That is a lot of infrastructure code that has nothing to do with the product.
Use Cases for Hocuspocus
- WebSocket relay for Yjs document sync
- Server-side persistence of collaborative documents
- Authentication and authorization for document access
- Multi-server Yjs deployments with Redis pub/sub fan-out
Pros of Hocuspocus
- Native Yjs binary sync protocol. No JSON serialization overhead
- Hook-based lifecycle (onConnect, onAuthenticate, onLoadDocument, onStoreDocument) for custom logic
- Persistence adapters for PostgreSQL, SQLite, Redis, and S3
- Horizontal scaling via Redis pub/sub for multi-server coordination
Cons of Hocuspocus
- Node.js only. No Go, Rust, or Python server implementation
- Single-document-per-connection model complicates multi-document UIs
- No built-in rate limiting. Must be added separately
- Debugging sync issues requires understanding Yjs internals and the binary protocol
When to Use Hocuspocus
- Building collaborative features with Yjs and need a server for relay and persistence
- Need authentication before granting document access
- Running multiple server instances that need to stay in sync
- Want debounced persistence without building custom flush logic
When Not to Use Hocuspocus
- Non-Yjs collaboration (use Socket.IO or a custom WebSocket server)
- Simple real-time features like presence indicators (use Supabase Realtime or Pusher)
- Read-only document viewing with no editing
- Teams that need a server in Go, Rust, or Java
Alternatives to Hocuspocus
Yjs, TipTap, Redis
Hugging Face — The de facto open-source registry for ML models, datasets, and the Transformers library
Category: AI & ML
## Why It Exists
Use Cases for Hugging Face
- Pulling pre-trained ML models and running inference fast
- Fine-tuning foundation models on your own data
- Hosting and sharing models, datasets, and ML demos
- Running inference pipelines with the Transformers library
- Building ML workflows around standardized model interfaces
- Benchmarking and comparing model performance
Pros of Hugging Face
- Largest open model registry with 800K+ models and 200K+ datasets
- Transformers library gives you a unified API across PyTorch, TensorFlow, and JAX
- Hub supports versioned model and dataset hosting with Git LFS
- Inference Endpoints let you deploy a model to cloud GPUs in one click
- Active community with model cards, discussion forums, and leaderboards
Cons of Hugging Face
- Transformers abstractions can hide important implementation details from you
- Model quality is all over the place. No curation on community uploads
- Large model downloads eat significant bandwidth and storage
- Free tier rate limits will bite you in CI/CD pipelines
- Auto-classes can pull unexpected model variants if you don't pin versions
When to Use Hugging Face
- Need pre-trained models for NLP, vision, audio, or multimodal tasks
- Fine-tuning open-weight models on domain-specific data
- Sharing models and datasets within a team or with the community
- Rapid prototyping with current model architectures
When Not to Use Hugging Face
- Production inference at scale (use vLLM, TGI, or dedicated serving infrastructure)
- Training models from scratch with custom architectures (use raw PyTorch/JAX)
- Applications requiring proprietary models not on the Hub
- Air-gapped environments where downloading models is not an option
Alternatives to Hugging Face
vLLM, LangChain, RAG
InfluxDB — The purpose-built time series database that owns the IoT and metrics space
Category: Time Series
## How It Works Internally
Use Cases for InfluxDB
- IoT sensor data collection and analytics
- Infrastructure and application monitoring
- Real-time analytics on event streams
- Financial market data tracking
- Energy grid and smart meter telemetry
- DevOps metrics and SLA tracking
Pros of InfluxDB
- Purpose-built for time series from day one, not bolted on as an afterthought
- InfluxQL is SQL-like enough that most engineers pick it up in an afternoon
- Telegraf agent ecosystem covers 300+ integrations out of the box
- Built-in retention policies and continuous queries handle data lifecycle automatically
- Impressive write throughput for a single node, easily 500K+ points/sec
Cons of InfluxDB
- The open-source version (OSS) is single-node only. Clustering requires InfluxDB Cloud or Enterprise
- Flux query language is powerful but has a steep learning curve and not everyone loves it
- High cardinality series can cause memory issues and slow queries significantly
- Schema-on-write means you cannot change tag vs field decisions after the fact without rewriting data
- Delete operations are expensive and discouraged in practice
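The tag-vs-field decision is baked into InfluxDB's line protocol itself: tags are indexed metadata, fields are the measured values, and the split is fixed at write time. A minimal stdlib sketch of the wire format (no escaping of special characters, and integer fields would need an `i` suffix in the real protocol):

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build an InfluxDB line protocol string:
    measurement,tag=val,... field=val,... timestamp"""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"

line = to_line_protocol(
    "cpu",
    {"host": "web01", "region": "eu"},   # tags: indexed, low cardinality
    {"usage_idle": 97.2},                # fields: the actual values
    1700000000000000000,
)
# -> cpu,host=web01,region=eu usage_idle=97.2 1700000000000000000
```

Once `host` is written as a tag, turning it into a field later means rewriting the data, which is the schema-on-write limitation from the cons above.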
When to Use InfluxDB
- You need a dedicated TSDB for metrics, IoT, or sensor data
- Your team wants something up and running fast with minimal configuration
- Write-heavy workloads where ingestion speed matters more than complex queries
- You already use the TICK stack (Telegraf, InfluxDB, Chronograf, Kapacitor)
When Not to Use InfluxDB
- You need distributed clustering on open-source (look at VictoriaMetrics or TimescaleDB)
- Complex relational queries with joins across multiple measurements
- Your workload is more analytical/OLAP than time series (use ClickHouse)
- You need strong consistency guarantees for financial transactions
Alternatives to InfluxDB
Prometheus, TimescaleDB, Grafana
Kafka Streams — A stream processing library that ships inside your JVM app, not a cluster you babysit
Category: Stream Processing
## How It Works Internally
Use Cases for Kafka Streams
- Real-time data transformation
- Event-driven microservices
- Stream-table joins
- Real-time aggregations and windowing
- Data enrichment pipelines
- Lightweight ETL within Kafka
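Tumbling-window aggregation, the bread-and-butter Streams operation, can be sketched in plain Python. This is a conceptual stand-in for illustration, not the JVM API (in real Kafka Streams this is roughly `stream.groupByKey().windowedBy(...).count()` with state kept in RocksDB-backed stores):

```python
from collections import defaultdict

WINDOW_MS = 60_000  # 1-minute tumbling windows

def window_start(ts_ms):
    """Align an event timestamp to the start of its tumbling window."""
    return ts_ms - (ts_ms % WINDOW_MS)

def count_by_key_and_window(events):
    """events: iterable of (key, timestamp_ms) -> {(key, window_start): count}"""
    counts = defaultdict(int)
    for key, ts in events:
        counts[(key, window_start(ts))] += 1
    return dict(counts)

events = [("user-1", 5_000), ("user-1", 59_999), ("user-1", 61_000), ("user-2", 10)]
result = count_by_key_and_window(events)
assert result[("user-1", 0)] == 2       # two events in the first minute
assert result[("user-1", 60_000)] == 1  # the third rolls into the next window
```

The real library adds what this sketch ignores: fault-tolerant state, late-arrival grace periods, and repartitioning when you group by a new key.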
Pros of Kafka Streams
- No separate cluster needed. It runs as a library inside your app.
- Exactly-once processing semantics
- Elastic scaling via Kafka consumer groups
- Interactive queries on local state stores
- Simple deployment (just a JVM application)
Cons of Kafka Streams
- Locked to Kafka for both input and output
- JVM only (Java/Kotlin/Scala)
- Limited to Kafka's partitioning model
- State stores can grow large on disk
- Less capable than Flink for complex processing
When to Use Kafka Streams
- You already run Kafka and need straightforward stream processing
- You want to skip managing a separate processing cluster
- Building event-driven microservices
- You need exactly-once guarantees inside the Kafka ecosystem
When Not to Use Kafka Streams
- Processing data from non-Kafka sources
- You need advanced CEP or event-time processing
- Your team runs Python or another non-JVM stack
- Complex multi-stream joins and windowing beyond what KStreams handles well
Alternatives to Kafka Streams
Kafka, Flink, Spark
Kafka — The distributed commit log that became the backbone of event streaming
Category: Messaging
## How It Works Internally
Use Cases for Kafka
- Event-driven microservices
- Real-time data pipelines
- Log aggregation
- Change data capture (CDC) with Debezium
- Stream processing with Kafka Streams or Flink
- Activity tracking and clickstream analytics
- Audit logging and compliance event trails
- Decoupling producers and consumers across organizational boundaries
Pros of Kafka
- Absurdly high throughput (millions of messages/sec on modest hardware) thanks to sequential I/O and zero-copy transfer
- Durable message storage with configurable retention, from hours to forever
- Scales horizontally by adding brokers and partitions without downtime
- Strong ordering guarantee within partitions, which is enough for most use cases
- Rich ecosystem: Connect for integrations, Streams for processing, Schema Registry for governance
- KRaft mode eliminates ZooKeeper entirely, simplifying operations and reducing the component count
- Tiered storage offloads old data to object storage, decoupling compute from long-term retention costs
Cons of Kafka
- Operationally heavy even with KRaft. You still need to understand brokers, partitions, consumer groups, ISR, and replication.
- Wrong tool for low-latency request-reply patterns. If you need sub-millisecond RPC, look elsewhere.
- Consumer group rebalancing can stall your entire pipeline if not configured carefully
- No built-in message routing or filtering. Every consumer reads from a partition and filters client-side.
- Steep learning curve, especially around offset management, exactly-once semantics, and partition key design
- Partition count is hard to change after the fact. Repartitioning means re-keying all your data.
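Why partition count is so sticky follows from how keys are routed. A simplified partitioner sketch (Kafka's default actually uses murmur2; SHA-256 here is just a deterministic stand-in):

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Deterministically map a message key to a partition."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Same key -> same partition, every time. That is what gives you
# per-key ordering: all events for order-42 land on one partition.
assert partition_for(b"order-42", 12) == partition_for(b"order-42", 12)

# And it is why changing the partition count is painful: the modulus
# changes, keys remap to different partitions, and per-key ordering
# across the resize is lost unless you re-key the topic.
```

This is also the argument for choosing partition keys up front: pick a key that spreads load evenly and keeps the events that must stay ordered together.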
When to Use Kafka
- You need durable, ordered event streaming at scale
- You're building event-driven, CQRS, or event-sourcing architectures
- High-throughput log or data pipeline ingestion (100K+ events/sec)
- Decoupling producers and consumers where replay capability matters
- CDC pipelines that capture database changes and fan them out to downstream systems
When Not to Use Kafka
- Simple task queues with low volume (use SQS or Redis streams instead)
- You need complex message routing, priority queues, or dead-letter exchanges (RabbitMQ is better suited)
- Request-reply messaging patterns where you need synchronous responses
- Small team without the bandwidth to operate distributed infrastructure (consider a managed service or a simpler queue)
- Message ordering across multiple partitions is a hard requirement (Kafka only guarantees order within a partition)
Alternatives to Kafka
RabbitMQ, Kafka Streams, Flink, Pulsar, Amazon SQS, Spark
Kong — Cloud-native API gateway built on NGINX
Category: API Infrastructure
## Why It Exists
Use Cases for Kong
- Centralized API gateway
- Authentication and authorization (OAuth, JWT, API keys)
- Rate limiting and throttling
- Request/response transformation
- API analytics and monitoring
- Service mesh with Kong Mesh
Pros of Kong
- Built on NGINX (via OpenResty), so you inherit its battle-tested performance
- Rich plugin ecosystem (100+ plugins)
- Supports declarative and database-backed configuration
- Kubernetes-native with Ingress Controller
- Open-source core with enterprise features
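Declarative (DB-less) mode boils down to one YAML file. A minimal sketch with hypothetical service names and illustrative values, combining auth and rate limiting on a single route:

```yaml
# kong.yml -- DB-less declarative config (Kong 3.x format)
_format_version: "3.0"
services:
  - name: orders-api
    url: http://orders.internal:8080
    routes:
      - name: orders-route
        paths: ["/orders"]
    plugins:
      - name: key-auth            # require an API key
      - name: rate-limiting
        config:
          minute: 60              # 60 requests/min per consumer
          policy: local
```

Point Kong at the file with `declarative_config` and it runs with no Postgres or Cassandra dependency at all.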
Cons of Kong
- Adds latency compared to running NGINX directly
- Plugin ecosystem quality is all over the place
- Enterprise features require a paid license
- Configuration gets messy at scale
- Database dependency (Postgres/Cassandra) for some modes
When to Use Kong
- You need a centralized API gateway with auth and rate limiting
- You're managing multiple APIs/microservices behind one entry point
- Kubernetes environments needing an ingress controller
- You want plugin-based extensibility without writing custom code
When Not to Use Kong
- Simple reverse proxy needs (just use NGINX directly)
- Ultra-low latency where every microsecond counts
- Tight budget constraints (enterprise features are paid)
- Simple static site serving
Alternatives to Kong
NGINX, Kafka
LangChain — The go-to framework for wiring up LLM apps with chains, agents, and retrieval
Category: AI & ML
## Why It Exists
Use Cases for LangChain
- Building RAG pipelines: loading docs, splitting them, embedding, and retrieving at query time
- Multi-step AI agents that pick tools and act on results
- Chatbots that actually remember previous turns
- Pulling structured data out of messy, unstructured text
- Routing requests across different LLMs based on task type
- Testing and evaluating LLM outputs (harder than it sounds)
Pros of LangChain
- Widest integration ecosystem out there (150+ LLMs, 50+ vector stores, 100+ tools)
- LangGraph lets you build agent workflows with cycles, branching, and persistent state
- LangSmith gives you real observability: tracing, evaluation, and debugging in production
- Good built-in abstractions for RAG, agents, and structured output
- Very active community, ships fast, docs are solid
Cons of LangChain
- Abstraction layers make debugging harder than you'd expect
- Breaking changes between versions happen more than they should
- Overkill for simple use cases. Sometimes a direct API call is all you need
- Performance overhead from chain composition and serialization
- LangGraph's state machine model takes time to click
When to Use LangChain
- Your LLM pipeline has multiple steps, tools, and data sources
- You need pre-built integrations with vector stores, LLMs, or tools
- You want to prototype LLM apps fast using well-known patterns
- You're building multi-agent systems that need state management and tool coordination
When Not to Use LangChain
- You're making a single LLM call with a static prompt. Just use the provider SDK.
- Latency matters so much that framework overhead is a problem
- Your team wants minimal dependencies and full control over every LLM interaction
- You need stability more than you need the latest features
Alternatives to LangChain
RAG, Vector Databases, MCP Server
MCP Server — The open protocol that lets AI models talk to your tools and data without custom glue code
Category: AI & ML
## Why It Exists
Use Cases for MCP Server
- Connecting LLMs to databases, APIs, and file systems
- Building AI-powered IDE extensions and developer tools
- Enterprise AI assistants that need secure access to internal systems
- Multi-tool AI agents that call out to external services
- Giving every LLM provider the same tool interface
- Context-aware AI workflows that pull in data on the fly
Pros of MCP Server
- Write one server, use it with any MCP-compatible client
- Three clean primitives (Tools, Resources, Prompts) cover most integration patterns
- Solid security model with OAuth 2.1 for remote servers
- Official SDKs for TypeScript and Python, plus a growing community ecosystem
- Kills the custom integration code you used to write for each model provider
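Under the hood, MCP is JSON-RPC 2.0 over stdio or HTTP. A sketch of what a `tools/call` request looks like on the wire; the tool name and arguments here are hypothetical:

```python
import json

def tools_call_request(request_id, tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool: the client asks the server to run a database query.
req = tools_call_request(1, "query_database", {"sql": "SELECT 1"})
wire = json.dumps(req)
assert json.loads(wire)["method"] == "tools/call"
```

The server responds with a result payload (or a JSON-RPC error), and the same three method families (`tools/*`, `resources/*`, `prompts/*`) cover the protocol's primitives.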
Cons of MCP Server
- Still a young protocol (launched 2024), so the ecosystem is catching up
- Remote server deployment adds latency compared to in-process tool calls
- Debugging distributed MCP chains gets painful fast
- Not every AI platform supports MCP natively yet
- Schema evolution and versioning need careful planning upfront
When to Use MCP Server
- Your AI tools need to hit external data sources or APIs
- You want a single integration that works across multiple AI clients
- You need authenticated, scoped access to enterprise systems from AI models
- You are building reusable tool servers that multiple AI apps can share
When Not to Use MCP Server
- Simple one-off LLM API calls that do not need tool access
- Latency-critical apps where any middleware overhead is a dealbreaker
- Apps locked into a single AI provider's native tool format
- Trivial tools where the protocol overhead costs more than the implementation
Alternatives to MCP Server
LangChain, RAG, gRPC
Memcached — The simplest distributed cache that actually works at scale
Category: Caching
Most caching debates start with "Redis or Memcached?" and the answer people want is always Redis. But if the requirement is a fast key-value cache, Memcached is the better tool. It's been running at the core of Facebook, Twitter, YouTube, and Wikipedia for over two decades. Brad Fitzpatrick built it for LiveJournal in 2003, and the design philosophy hasn't changed since. Key-value strings with TTL expiration. That's it. No data structures, no scripting, no persistence. That constraint is the whole point. It makes Memcached one of the most predictable and operationally boring (in the best way) components in a stack.
Use Cases for Memcached
- Database query result caching
- HTML fragment caching
- Session storage
- API response caching
- Object caching for web apps
Pros of Memcached
- Dead simple. Just key-value strings, nothing to overthink
- Multi-threaded architecture for high throughput
- Consistent hashing makes horizontal scaling straightforward
- Minimal memory overhead per item
- Mature and battle-tested at massive scale
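The consistent hashing that makes horizontal scaling straightforward lives in the client, not the server. A toy ring (virtual nodes omitted for brevity; node addresses are illustrative):

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring for picking a cache node per key."""
    def __init__(self, nodes):
        self._ring = sorted((self._hash(n), n) for n in nodes)
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first node at or after the key's hash.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["cache-a:11211", "cache-b:11211", "cache-c:11211"])
node = ring.node_for("user:42:profile")
assert node in {"cache-a:11211", "cache-b:11211", "cache-c:11211"}
# Adding or removing one node only remaps the keys on the affected arc,
# instead of reshuffling nearly everything like `hash(key) % n` would.
```

Real clients add ~100+ virtual nodes per server so keys spread evenly, but the routing idea is exactly this.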
Cons of Memcached
- No persistence. Data gone on restart, full stop
- Only supports string values (no rich data structures)
- No built-in replication
- No pub/sub or advanced features
- Limited to key-value operations
When to Use Memcached
- Simple caching of serialized objects or query results
- Need multi-threaded performance for high concurrency
- Want a lightweight, low-overhead cache layer
- Already have a durable primary store
When Not to Use Memcached
- Need data persistence or durability
- Require rich data structures (use Redis)
- Need pub/sub or stream processing
- Want built-in replication and failover
Alternatives to Memcached
Redis, NGINX
MinIO — S3-compatible object storage you can run anywhere in 15 minutes
Category: Storage
## Why It Exists
Use Cases for MinIO
- Private cloud object storage with full S3 API compatibility
- On-prem replacement for AWS S3 (data sovereignty, compliance)
- Backend storage for AI/ML training pipelines (fast local reads)
- Kubernetes-native persistent storage for stateful workloads
- Data lake storage tier for Spark, Presto, Trino queries
- Backup target for databases and application data
Pros of MinIO
- Near-complete S3 API compatibility. Most S3 SDKs and tools work out of the box.
- Single Go binary. No JVM, no dependencies, no complex installation. Deploy in minutes.
- Reed-Solomon erasure coding per object. Configurable data/parity ratio.
- High throughput on NVMe/SSD. Designed for modern hardware, not spinning disks.
- Kubernetes-native via the MinIO Operator. First-class Helm charts and CRDs.
- Built-in bucket replication, versioning, lifecycle management, and encryption.
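The erasure-coding trade-off is plain arithmetic: split an object into data shards plus parity shards, and you can lose any `parity` shards while paying far less overhead than replication. A sketch of the generic Reed-Solomon math (MinIO's actual shard counts depend on the pool layout):

```python
def erasure_profile(data_shards: int, parity_shards: int, object_mb: float):
    """Overhead and failure tolerance for a data+parity erasure layout."""
    total = data_shards + parity_shards
    return {
        "shards": total,
        "tolerates_failures": parity_shards,      # any `parity` shards can be lost
        "storage_overhead": total / data_shards,  # vs 3.0x for triple replication
        "stored_mb": round(object_mb * total / data_shards, 2),
    }

# EC 8+4: a 100 MB object survives 4 drive losses at only 1.5x overhead.
p = erasure_profile(8, 4, 100.0)
assert p["tolerates_failures"] == 4
assert p["storage_overhead"] == 1.5
assert p["stored_mb"] == 150.0
```

Triple replication would store 300 MB for the same durability class, which is why erasure coding dominates at object-storage scale.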
Cons of MinIO
- No topology-aware placement (no CRUSH equivalent). Shards distribute across a server pool, but not rack/AZ-aware by default.
- Scaling requires adding full server pools. Adding a single node to an existing pool is not supported.
- Metadata is co-located with data (no separate metadata tier). At very large scale, this limits flexibility.
- Rebalancing after expansion is manual and can be slow for large datasets.
- Community edition lacks some enterprise features (LDAP, AD integration, audit logging require paid tier).
- Not battle-tested at exabyte scale. Designed for petabytes, not hundreds of petabytes.
When to Use MinIO
- Need S3-compatible storage on your own hardware or in any cloud
- Team wants simplicity over operational flexibility
- Data fits in petabyte range (up to ~10PB comfortably)
- Running on Kubernetes and want native integration
- AI/ML workloads that benefit from local high-throughput storage
When Not to Use MinIO
- Need exabyte-scale storage (Ceph or custom is more appropriate)
- Require fine-grained rack/AZ-aware placement control
- Want to scale incrementally by adding single nodes
- Need a managed service with zero ops (use S3 itself)
Alternatives to MinIO
Ceph, SeaweedFS, etcd
MongoDB — The document database you'll probably use at least once in your career
Category: Databases
MongoDB is everywhere. Anyone who has worked on a modern web stack has either used it or had to justify the alternative choice. The flexible document model and the low friction of getting started are genuinely useful, especially early in a project when the schema is still shifting. Since version 4.0, multi-document ACID transactions filled the biggest gap in the document model story, making Mongo a realistic option for transactional workloads that used to require a relational database. That said, do not confuse "viable" with "ideal." Postgres still wins for heavily relational data, and that is fine.
Use Cases for MongoDB
- Content management systems
- Product catalogs with varied attributes
- Mobile app backends
- Real-time analytics
- User profiles and personalization
- Prototyping and rapid iteration
Pros of MongoDB
- Flexible schema, no migrations needed
- Rich query language with aggregation pipeline
- Horizontal scaling via built-in sharding
- Native JSON document model
- Multi-document ACID transactions
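The aggregation pipeline is worth seeing, because it is just structured data: an ordered list of stage documents. Collection and field names here are hypothetical; the stage operators are real MongoDB:

```python
# Top 10 customers by shipped order value, expressed as pipeline stages.
pipeline = [
    {"$match": {"status": "shipped"}},                                # filter first
    {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},# aggregate
    {"$sort": {"total": -1}},                                         # rank
    {"$limit": 10},                                                   # cut off
]

# With pymongo this would run as: db.orders.aggregate(pipeline)
assert [next(iter(stage)) for stage in pipeline] == [
    "$match", "$group", "$sort", "$limit"
]
```

Stage order matters: putting `$match` first lets the server use an index and shrink the working set before grouping.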
Cons of MongoDB
- Joins are limited ($lookup, added in 3.2, is costly and far short of relational joins)
- WiredTiger's cache consumes significant RAM (roughly half of system memory by default)
- Denormalization leads to data duplication
- Write amplification with large documents
- Sharding requires careful key selection
When to Use MongoDB
- Schema evolves frequently (startups, MVPs)
- Data is naturally document-shaped (JSON)
- Need flexible querying over semi-structured data
- Rapid prototyping with changing requirements
When Not to Use MongoDB
- Highly relational data with many joins
- Need strong multi-row transactions across collections
- Write-heavy append-only workloads (prefer Cassandra)
- Strict schema enforcement is required
Alternatives to MongoDB
PostgreSQL, DynamoDB, Elasticsearch
Monocle (Datadog) — Datadog's shard-per-core Rust TSDB that replaced their legacy storage
Category: Time Series
## How It Works Internally
Use Cases for Monocle (Datadog)
- Trillion-scale metric ingestion
- Multi-tenant SaaS observability
- Sub-second query at petabyte scale
- Real-time anomaly detection backing store
Pros of Monocle (Datadog)
- Shard-per-core: each CPU core owns its own LSM-tree, memory allocator, and I/O queue. Zero cross-core locks.
- Written in Rust for predictable latency and memory safety without GC pauses
- RocksDB-based indexing layer for label lookups at billion-series cardinality
- Designed for multi-tenant isolation from day one (Datadog's SaaS model)
- Handles trillions of data points per day in production
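The shard-per-core idea itself is simple enough to sketch, even if Monocle's implementation is closed. A purely illustrative Python model (not Datadog's code): route each series to a fixed shard, and let each shard own its state privately so nothing is shared across cores:

```python
import zlib

NUM_SHARDS = 8  # stand-in for "one shard per CPU core"
shards = [dict() for _ in range(NUM_SHARDS)]  # each shard owns private state

def shard_for(series_key: str) -> int:
    """Deterministic routing: a series always lands on the same shard."""
    return zlib.crc32(series_key.encode()) % NUM_SHARDS

def write_point(series_key: str, ts: int, value: float):
    # Only the owning shard ever touches this series -> no cross-shard locks.
    shards[shard_for(series_key)].setdefault(series_key, []).append((ts, value))

write_point("cpu.user{host:web01}", 1_700_000_000, 42.0)
write_point("cpu.user{host:web01}", 1_700_000_010, 43.5)
owner = shard_for("cpu.user{host:web01}")
assert len(shards[owner]["cpu.user{host:web01}"]) == 2
```

The cost of the pattern shows up at query time: a query touching many series must fan out across shards and merge, which is where the real engineering effort goes.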
Cons of Monocle (Datadog)
- Proprietary: not available outside Datadog. Cannot be self-hosted or evaluated.
- No public API documentation or query language specification
- Architecture details come only from blog posts and conference talks, not source code
- Tightly coupled to Datadog's infrastructure (custom networking, deployment tooling)
- Not a viable option for build-vs-buy decisions; study-only value
When to Use Monocle (Datadog)
- You are evaluating Datadog as a managed observability vendor
- You want to study shard-per-core TSDB architecture for your own system design
- You need a reference point for what 'beyond open-source scale' looks like
When Not to Use Monocle (Datadog)
- You need a self-hosted or open-source TSDB (use VictoriaMetrics, Mimir, or InfluxDB instead)
- You want to run your own metrics infrastructure
- You need a system you can inspect, fork, or contribute to
Alternatives to Monocle (Datadog)
VictoriaMetrics, Prometheus, InfluxDB, TimescaleDB, RocksDB
NGINX — The web server that actually handles scale, plus reverse proxy and load balancer
Category: API Infrastructure
## Why It Exists
Use Cases for NGINX
- Reverse proxy and load balancing
- SSL/TLS termination
- Static file serving
- API gateway
- Rate limiting and access control
- HTTP caching
Pros of NGINX
- Handles massive concurrency through event-driven, non-blocking I/O
- Low memory footprint
- Battle-tested at serious scale
- Rich module ecosystem
- Supports HTTP, TCP, and UDP load balancing
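Reverse proxy, load balancing, and TLS termination fit in a handful of lines. A minimal sketch with placeholder addresses and cert paths:

```nginx
# Two app servers behind one TLS listener
upstream app_servers {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 443 ssl;
    server_name example.com;
    ssl_certificate     /etc/nginx/tls/example.crt;
    ssl_certificate_key /etc/nginx/tls/example.key;

    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

The default `upstream` balancing is round-robin; `least_conn` or `ip_hash` are one-line changes inside the block.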
Cons of NGINX
- Configuration gets gnarly fast for advanced use cases
- Dynamic reconfiguration requires a reload
- Limited built-in API management features
- Free version is missing several enterprise features
- Lua scripting for advanced logic adds real complexity
When to Use NGINX
- You need a reverse proxy in front of application servers
- SSL termination and HTTP/2 support
- Serving static files alongside dynamic content
- Simple load balancing without a service mesh
When Not to Use NGINX
- You need a full API gateway with auth, rate limiting, analytics
- Service mesh with dynamic service discovery
- Complex traffic routing that needs programmatic control
- GraphQL-specific gateway features
Alternatives to NGINX
Kong, Kafka
OpenTelemetry — The vendor-neutral standard for instrumentation, collection, and export of telemetry data
Category: Observability
## Why It Exists
Use Cases for OpenTelemetry
- Unified instrumentation across metrics, traces, logs, and profiles with a single SDK
- Vendor-neutral telemetry export to any backend (Prometheus, Jaeger, Tempo, Datadog, etc.)
- Distributed tracing with automatic context propagation across service boundaries
- Custom business metrics (orders, revenue, SLA counters) alongside infrastructure metrics
- Fleet-wide telemetry pipeline processing via the OTel Collector (batching, routing, enrichment, sampling)
Pros of OpenTelemetry
- Single SDK for all four signals (metrics, traces, logs, profiles). One dependency instead of four separate libraries
- Vendor-neutral wire protocol (OTLP). Switch backends without changing application code. Export to VictoriaMetrics, Tempo, Datadog, or any OTLP-compatible receiver
- Automatic instrumentation libraries for most frameworks. Spring Boot, Express, Flask, net/http, gRPC -- get spans and metrics with a few lines of setup code
- Context propagation is built in. W3C Trace Context (traceparent/tracestate) propagates trace_id and span_id across HTTP, gRPC, and messaging boundaries automatically
- The OTel Collector decouples applications from backends. Applications export to the local collector, the collector handles batching, retry, routing, and format conversion. Backend changes never touch application code
- CNCF graduated project with contributions from Google, Microsoft, Splunk, Datadog, Grafana Labs, and 1,000+ contributors. The industry standard, not a single-vendor bet
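The W3C `traceparent` header that carries context between services is a fixed-format string: `version-traceid-spanid-flags`. A stdlib-only parser sketch (the sample header is the canonical example from the Trace Context spec):

```python
import re

# version(2 hex)-trace_id(32 hex)-parent_id(16 hex)-flags(2 hex)
TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-"
    r"(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-"
    r"(?P<flags>[0-9a-f]{2})$"
)

def parse_traceparent(header: str):
    m = TRACEPARENT_RE.match(header)
    if not m or m["trace_id"] == "0" * 32 or m["parent_id"] == "0" * 16:
        return None  # malformed or all-zero IDs are invalid per the spec
    return m.groupdict()

ctx = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
assert ctx["trace_id"] == "4bf92f3577b34da6a3ce929d0e0e4736"
assert ctx["flags"] == "01"  # low bit set = sampled
```

The OTel SDKs inject and extract this header for you on every outbound and inbound call, which is what "automatic context propagation" means in practice.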
Cons of OpenTelemetry
- SDK maturity varies by language. Go and Java are production-stable. Python and JavaScript are stable but have rough edges in async context propagation. Rust and C++ are experimental
- Auto-instrumentation adds latency overhead. Java agent adds 1-5% latency to instrumented calls. For latency-critical hot paths, manual instrumentation with selective span creation is better
- Configuration complexity. The Collector alone has 100+ receivers, processors, and exporters. Getting the processor chain right (order matters) takes operational experience
- Log signal is newer than metrics and traces. The logs SDK stabilized later and some language implementations are still catching up. Bridge APIs exist for existing log frameworks but add a translation layer
- Breaking changes between SDK versions still happen in newer signals (logs, profiles). Pin versions carefully and test upgrades in staging
When to Use OpenTelemetry
- Any new service that needs observability. OTel should be the default instrumentation choice for greenfield development
- Migrating off vendor-specific SDKs (Datadog APM, New Relic agents) to avoid vendor lock-in
- Building a multi-signal observability platform where metrics, traces, logs, and profiles share context (trace_id, span_id)
- Custom business metrics that need labels, histograms, or counters beyond what auto-instrumentation provides
When Not to Use OpenTelemetry
- Kernel-level network telemetry without code changes. Use eBPF (Grafana Beyla) instead -- OTel SDK requires application code changes
- Simple Prometheus metric scraping from existing /metrics endpoints. The Prometheus client library is lighter if you only need counters and gauges with no tracing
- Environments where adding a dependency is not possible (embedded systems, bare-metal firmware, legacy COBOL)
Alternatives to OpenTelemetry
OTel Collector, Grafana Beyla, Prometheus, Grafana, Grafana Tempo, VictoriaMetrics, Kafka, Grafana Pyroscope
OTel Collector — The vendor-neutral telemetry pipeline for receiving, processing, and exporting observability data
Category: Observability
## Why It Exists
Use Cases for OTel Collector
- Decoupling applications from observability backends -- SDK exports to localhost, Collector handles everything else
- Fleet-wide telemetry processing: batching, filtering, enrichment, sampling, and routing across all services
- Protocol translation between different telemetry formats (OTLP, Prometheus, Jaeger, Zipkin, Kafka)
- Tail-based trace sampling that keeps 100% of errors while reducing baseline volume by 98-99.5%
- Value-based data routing: full fidelity for SLO-critical signals, sample or drop health checks and debug noise
Pros of OTel Collector
- Vendor-neutral pipeline. The same Collector instance exports to VictoriaMetrics, Tempo, Datadog, or any OTLP-compatible backend. Switch backends without touching application code
- Plugin architecture with 100+ pre-built components. Receivers, processors, and exporters are assembled into a single binary at build time and configured entirely through YAML
- Backpressure-aware. When a backend is slow, the sending queue buffers data, retries with exponential backoff, and only drops data as a last resort. The memory_limiter processor prevents OOM
- Runs anywhere. DaemonSet on Kubernetes, sidecar, gateway, bare-metal binary, Docker container. Same binary, same config format
- Self-monitoring built in. The Collector exposes its own metrics (accepted/dropped/exported counts, queue depth, memory usage) on a Prometheus endpoint for meta-monitoring
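A whole pipeline is assembled in YAML. A minimal sketch with a placeholder backend endpoint, showing the processor-order point from above (guard memory before batching):

```yaml
# OTLP in -> memory guard -> batch -> OTLP out
receivers:
  otlp:
    protocols:
      grpc:
      http:
processors:
  memory_limiter:          # order matters: run the memory guard first
    check_interval: 1s
    limit_mib: 512
  batch:                   # then batch to cut export overhead
exporters:
  otlphttp:
    endpoint: https://backend.example.com:4318
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp]
```

Swapping the backend means editing the `exporters` block; the applications exporting to this Collector never notice.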
Cons of OTel Collector
- Configuration complexity. 100+ components with different config schemas. Getting the processor chain right (order matters) requires operational experience
- Single-binary plugin model means you can't add a custom component at runtime. Adding a new receiver or exporter requires rebuilding the binary with the Collector Builder (ocb)
- Memory overhead for stateful processors. Tail-based sampling buffers all spans for 60 seconds. At high throughput, this consumes several GB of RAM per instance
- No built-in persistent buffering in the default setup. The in-memory sending queue loses data on crash. Persistent queue (disk WAL) is available but adds I/O overhead
- Debug difficulty. When data disappears between SDK and backend, tracing which processor dropped it requires checking multiple internal metrics
When to Use OTel Collector
- Any production observability pipeline. The Collector should sit between your applications and your storage backends
- When you need to process telemetry before storage: filter noise, enrich with metadata, sample traces, route by tenant
- Multi-backend setups where metrics go to VictoriaMetrics, traces to Tempo, and logs to VictoriaLogs from the same collection layer
- When you want to change backends without redeploying applications
When Not to Use OTel Collector
- Simple single-app setups where the SDK can export directly to the backend with no processing needed
- When latency of even a single network hop is unacceptable (though DaemonSet mode uses localhost, so overhead is minimal)
- As a long-term storage buffer. The Collector is a processing pipeline, not a message queue. Use Kafka for durable buffering
Alternatives to OTel Collector
OpenTelemetry, Grafana Beyla, Kafka, Prometheus, VictoriaMetrics, Grafana Tempo
PostGIS — The spatial extension that turned PostgreSQL into a serious GIS database
Category: Geospatial
## What PostGIS Actually Is
Use Cases for PostGIS
- Storing and querying geographic features (points, lines, polygons)
- Location-based search (find all stores within 10 km)
- Route planning and network analysis
- Land parcel management and urban planning
- Geospatial data pipelines for mapping and GIS workflows
- Fleet tracking with real-time spatial queries
Pros of PostGIS
- Full OGC standards compliance (WKT, WKB, GeoJSON, KML, GML). If a GIS tool exists, it probably talks to PostGIS
- R-tree spatial indexing via GiST handles complex polygon queries efficiently
- Raster support for satellite imagery, elevation models, and remote sensing data alongside vector data
- Topology support for network analysis, routing, and connectivity queries
- Massive function library. 800+ spatial functions covering everything from distance calculations to Voronoi diagrams
Cons of PostGIS
- Inherits PostgreSQL's single-writer limitation and vacuum overhead
- R-tree indexes are less parallelizable than cell-based spatial schemes like S2 or H3
- Complex geometry operations (buffer, union on large polygons) can be CPU-intensive
- Scaling horizontally requires Citus or manual sharding. Not straightforward.
- Learning curve for the full GIS stack (SRIDs, projections, coordinate systems) is significant
When to Use PostGIS
- You need a full-featured GIS database with standards compliance
- Complex spatial operations: polygon intersection, buffering, union, Voronoi diagrams, routing
- Your team already runs PostgreSQL and wants to add spatial capabilities
- Integration with the broader GIS ecosystem (QGIS, GeoServer, MapServer, GDAL)
When Not to Use PostGIS
- Simple proximity queries where S2 or H3 cell IDs on a regular B-tree would suffice
- Billions of points with simple spatial lookups (consider a key-value store with S2 indexing)
- Real-time streaming geospatial data at very high write throughput
- Your workload is purely analytics/aggregation by area (H3 in a data warehouse is simpler)
Alternatives to PostGIS
PostgreSQL, Google S2 Geometry, Elasticsearch
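The 10 km store lookup above is a one-liner in PostGIS, something like `SELECT name FROM stores WHERE ST_DWithin(geog, point::geography, 10000)` (table and column names invented here). Under the hood that is a great-circle distance test plus an index scan. A rough sketch of the distance math in Python, using spherical haversine where PostGIS's geography type actually computes on a spheroid:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a sphere (mean Earth radius)."""
    R = 6371000.0
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * R * asin(sqrt(a))

# Toy "stores" table: (name, lat, lon)
stores = [
    ("downtown", 52.5200, 13.4050),   # Berlin center, ~2 km from the query point
    ("airport",  52.3667, 13.5033),   # well outside 10 km
]
query_point = (52.5300, 13.3850)

# Equivalent of: SELECT name FROM stores
#                WHERE ST_DWithin(geog, point::geography, 10000);
within_10km = [name for name, lat, lon in stores
               if haversine_m(lat, lon, *query_point) <= 10_000]
```

In production the point of PostGIS is that the GiST index prunes candidates before any distance is computed; this sketch only shows the predicate.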
PostgreSQL — The relational database that keeps earning its spot in production
Category: Databases
PostgreSQL is the database most teams should start with, and many never need to leave. Apple, Instagram, Spotify, and Reddit all run it in production. That is not a coincidence. After 35+ years of development, Postgres handles everything from basic CRUD apps to analytical workloads on petabytes of data. It is not the fastest option for every access pattern, but the combination of ACID compliance, extensibility, and SQL standards support makes it the safest default for a primary datastore.
Use Cases for PostgreSQL
- OLTP transactional workloads
- Complex queries with joins and aggregations
- Geospatial data (PostGIS)
- Full-text search
- JSON/JSONB document storage
- Time-series data (with TimescaleDB)
Pros of PostgreSQL
- Full ACID compliance with MVCC
- Wildly extensible (custom types, functions, extensions)
- Best-in-class SQL standard compliance
- Rich indexing: B-tree, GIN, GiST, BRIN
- Strong community and ecosystem
Cons of PostgreSQL
- Vertical scaling hits a wall eventually
- Write-heavy workloads need careful tuning
- Replication is async by default
- Sharding requires extensions like Citus
- VACUUM overhead bites you during long-running transactions
When to Use PostgreSQL
- You need strong consistency and ACID transactions
- Complex relational data with joins
- Mixed workloads (relational + JSON + full-text search)
- Geospatial queries
When Not to Use PostgreSQL
- You need automatic horizontal sharding at massive scale
- Your access pattern is purely key-value lookups
- Ultra-low latency requirements (sub-millisecond)
- Append-only time-series at millions of events per second
Alternatives to PostgreSQL
CockroachDB, MongoDB, Cassandra
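The ACID guarantee in practice: either every statement in a transaction lands, or none do. A self-contained sketch using stdlib sqlite3 so it runs anywhere; against Postgres you would run the same pattern through a driver such as psycopg, with identical transaction semantics:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INT)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money atomically: both UPDATEs commit together or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            cur = conn.execute("SELECT balance FROM accounts WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # triggers rollback
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

ok1 = transfer(conn, "alice", "bob", 60)    # succeeds: alice 40, bob 110
ok2 = transfer(conn, "alice", "bob", 500)   # fails: first UPDATE rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```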
Prometheus — The pull-based metrics system that actually works for containers and Kubernetes
Category: Observability
## Why It Exists
Use Cases for Prometheus
- Collecting infrastructure metrics like CPU, memory, disk, and network
- Tracking application performance through custom metrics
- Alerting when SLOs are violated or anomalies appear
- Monitoring Kubernetes clusters and workloads
- Measuring SLIs for reliability engineering
- Capacity planning based on historical metrics
Pros of Prometheus
- Pull model makes service discovery and health detection straightforward
- PromQL is hands-down the best metrics query language available
- Kubernetes integration with automatic service discovery works out of the box
- Huge exporter ecosystem with 500+ integrations
- CNCF graduated project, widely adopted, battle-tested
Cons of Prometheus
- Single-node by default. You need Thanos or Cortex to scale horizontally
- Local storage is not durable. A disk failure wipes your metrics
- High cardinality labels will blow up memory and kill query performance
- Pull model means Prometheus needs network access to every target
- No built-in dashboards. You will need Grafana
When to Use Prometheus
- Cloud-native environments, especially anything running on Kubernetes
- You want a proven, standards-based monitoring stack
- Your team practices SRE and needs SLI/SLO tracking
- You need multi-dimensional metrics with label-based querying
When Not to Use Prometheus
- Log aggregation or distributed tracing (reach for Loki or Jaeger instead)
- Billing or accounting metrics where 100% accuracy matters (Prometheus can drop data)
- Environments where Prometheus cannot reach targets (use a push-based alternative)
- Very long retention (years) without adding Thanos or Cortex for remote storage
Alternatives to Prometheus
Grafana, Kafka, Elasticsearch
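PromQL's workhorse is `rate()`, which turns a monotonically increasing counter into a per-second rate while surviving process restarts. A simplified Python version of the calculation; real `rate()` also extrapolates to the window boundaries, which this skips:

```python
def simple_rate(samples):
    """Per-second rate of a counter from (unix_ts, value) samples.

    Handles counter resets the way PromQL does: a drop in value means
    the process restarted and the counter resumed from roughly zero,
    so the post-reset value itself counts as increase.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur  # reset detected
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# Counter resets at t=30 (process restart); the rate stays sane:
samples = [(0, 100), (15, 160), (30, 10), (45, 70)]
per_second = simple_rate(samples)   # (60 + 10 + 60) / 45
```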
Apache Pulsar — A messaging and streaming system built around separated compute and storage, with multi-tenancy and geo-replication baked in from day one
Category: Messaging
## Why It Exists
Use Cases for Apache Pulsar
- Running both queuing and streaming workloads on one platform instead of stitching Kafka and RabbitMQ together
- Multi-tenant messaging where different teams need real isolation, not just separate topic prefixes
- Geo-replicated event streaming across data centers without bolting on MirrorMaker
- Event sourcing with long-term retention by offloading old data to S3 or GCS
- Lightweight event processing via Pulsar Functions (skip the full Flink deployment for simple transforms)
- High-volume IoT ingestion where you need flexible subscription models
Pros of Apache Pulsar
- Brokers are stateless, BookKeeper handles storage. You can scale them independently, and replacing a dead broker takes seconds, not hours.
- Multi-tenancy is a first-class concept. Tenant, namespace, topic hierarchy gives you per-namespace quotas and policies out of the box.
- Geo-replication is built into the protocol. One admin command to set it up, no external tools needed.
- Tiered storage moves old ledger segments to object storage automatically. Infinite retention without burning SSD budget.
- Supports queuing (shared subscription) and streaming (exclusive/failover) in the same system, so you don't need two different platforms.
Cons of Apache Pulsar
- Operationally heavy. You need ZooKeeper + BookKeeper + Brokers. That is a minimum of 9 processes (three of each for HA) before you even publish a message.
- The ecosystem is smaller than Kafka's. Fewer connectors, fewer managed offerings, fewer Stack Overflow answers.
- Simple streaming workloads see higher tail latency compared to Kafka because of the extra BookKeeper hop.
- Schema registry and exactly-once semantics still lag behind Kafka's implementations in maturity.
- Managed cloud options are limited. Kafka has Confluent, MSK, Aiven, and more. Pulsar has StreamNative and not much else.
When to Use Apache Pulsar
- You actually need multi-tenancy with hard isolation between teams or customers
- Geo-replication is a real requirement, not just a nice-to-have
- You want queues and streaming topics in one system and are tired of running both RabbitMQ and Kafka
- You need to retain messages for months or years without paying for SSD-tier storage the whole time
When Not to Use Apache Pulsar
- Single-cluster streaming where Kafka works fine and has a decade of battle scars to prove it
- Your team wants something simpler to operate. Kafka is fewer moving parts.
- You already have deep Kafka Connect integrations and ecosystem tooling
- Community support matters a lot to you. Kafka's community is 5-10x larger.
Alternatives to Apache Pulsar
Kafka, RabbitMQ, Kafka Streams
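The queuing-plus-streaming claim comes down to subscription modes. A conceptual sketch (not the pulsar-client API) of how exclusive and shared subscriptions dispatch the same topic differently:

```python
def dispatch(messages, consumers, mode):
    """Conceptual sketch of Pulsar subscription modes.

    exclusive: one consumer receives every message, in order (streaming).
    shared:    messages round-robin across consumers (queuing semantics).
    """
    delivered = {c: [] for c in consumers}
    if mode == "exclusive":
        for m in messages:
            delivered[consumers[0]].append(m)
    elif mode == "shared":
        for i, m in enumerate(messages):
            delivered[consumers[i % len(consumers)]].append(m)
    return delivered

msgs = ["m1", "m2", "m3", "m4"]
shared = dispatch(msgs, ["c1", "c2"], "shared")        # work spread across both
exclusive = dispatch(msgs, ["c1", "c2"], "exclusive")  # c2 is a standby
```

Failover mode is exclusive plus automatic promotion of the standby, which this sketch omits.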
Grafana Pyroscope — S3-native continuous profiling with span-to-profile correlation and flame graph visualization
Category: Observability
## Why It Exists
Use Cases for Grafana Pyroscope
- Continuous profiling of CPU, memory, goroutine, and mutex contention in production
- Profile-to-trace correlation: click from a slow trace span to the CPU flame graph showing the exact bottleneck
- Identifying hot functions, excessive allocations, and lock contention without reproducing locally
- Regression detection by comparing flame graphs across deployments
- Cost-effective long-term profile retention via S3 storage classes
- Polyglot profiling across Go, Java, Python, Ruby, Rust, and .NET services
Pros of Grafana Pyroscope
- S3-native storage means profile cost scales with object storage pricing. Petabytes of profiles at a fraction of local-disk cost
- Span-to-profile correlation via shared span_id labels in pprof data. Click from a Tempo trace span to the CPU flame graph for that exact execution window
- Native Grafana integration. Flame graph panel is built-in. Differential flame graphs compare two time ranges to spot regressions
- pprof format is the industry standard. Go, Java (async-profiler), Python (py-spy), Ruby, Rust, and .NET all produce pprof-compatible output
- Same architecture as Tempo: ingesters buffer in memory, flush to S3, compactors merge blocks. One operational model for traces and profiles
- OTel profile signal support (experimental) means profiles flow through the same OTel Collector fleet as metrics, traces, and logs
Cons of Grafana Pyroscope
- Profiles require SDK-level instrumentation. eBPF (Grafana Beyla) does not produce profiles — only RED metrics and basic trace spans
- Per-profile size is large (50-200 KB per snapshot) compared to metrics (2 bytes) or trace spans (1 KB). Storage adds up at scale
- Ingester memory consumption is significant. Buffering 50K profiles/sec at 100 KB average requires careful node sizing
- OTel profile signal is still experimental as of early 2026. Most deployments use the Pyroscope SDK or async-profiler agent directly
- Flame graph interpretation requires performance engineering skills. Without training, teams may struggle to act on profile data
When to Use Grafana Pyroscope
- You already run Grafana + Tempo and want profile-to-trace correlation in the same UI
- Debugging production performance issues requires knowing which function is the bottleneck, not just which service
- S3-native storage with automatic lifecycle tiering is a requirement for cost control
- Multiple language runtimes need profiling under a single system (Go, Java, Python, Ruby)
When Not to Use Grafana Pyroscope
- You only need RED metrics and basic traces. Grafana Beyla covers that without profiles
- Your services run on Windows or platforms without pprof-compatible profilers
- Budget does not allow the incremental S3 storage cost for continuous profiles at scale
- Team lacks performance engineering skills to interpret flame graphs (invest in training first)
Alternatives to Grafana Pyroscope
Grafana, Tempo, Grafana Beyla, Prometheus, VictoriaMetrics
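Flame graphs are just aggregated stack samples. A toy version of the aggregation step, collapsing sampled call stacks into the folded text format that flame-graph tooling consumes; an illustration of the idea, not Pyroscope's storage format:

```python
from collections import Counter

def collapse(stack_samples):
    """Aggregate raw stack samples into folded flame-graph lines.

    Each sample is one call stack, outermost frame first. Identical
    stacks merge into a single "a;b;c count" line; frame width in the
    rendered flame graph is proportional to that count.
    """
    counts = Counter(";".join(stack) for stack in stack_samples)
    return sorted(f"{stack} {n}" for stack, n in counts.items())

# Three CPU samples: parse() was on-CPU twice, render() once
samples = [
    ["main", "handle", "parse"],
    ["main", "handle", "parse"],
    ["main", "handle", "render"],
]
folded = collapse(samples)
```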
QuestDB — The zero-GC time series database built for speed that actually delivers millions of rows per second
Category: Time Series
## Why It Exists
Use Cases for QuestDB
- Financial market data and tick-level analytics
- High-frequency IoT sensor ingestion
- Real-time dashboards over streaming data
- Network telemetry and flow analytics
- Application performance monitoring at high resolution
- Cryptocurrency trading and exchange analytics
Pros of QuestDB
- Ingestion speed is genuinely best-in-class. Millions of rows/sec on commodity hardware.
- Standard SQL with time series extensions, no new query language to learn
- Zero-GC Java implementation avoids the latency spikes that plague other JVM databases
- Built-in support for InfluxDB Line Protocol and PostgreSQL wire protocol
- Column-oriented storage with SIMD-accelerated query execution
Cons of QuestDB
- Younger project with a smaller community compared to InfluxDB or TimescaleDB
- No built-in replication or clustering yet. Single-node only for now.
- Limited support for UPDATE and DELETE operations
- Ecosystem of integrations and connectors is still growing
- Documentation covers the basics but lacks depth on advanced operational topics
When to Use QuestDB
- You need the fastest possible ingestion for high-volume time series data
- Financial or trading applications where microsecond timestamps matter
- Your team wants SQL and does not want to learn InfluxQL, Flux, or PromQL
- Single-node deployment is acceptable and you want maximum performance per node
When Not to Use QuestDB
- You need multi-node clustering or built-in replication for HA
- Heavy UPDATE/DELETE workloads on existing data
- You need a proven, battle-tested solution for mission-critical systems
- Prometheus-compatible monitoring (use VictoriaMetrics instead)
Alternatives to QuestDB
InfluxDB, TimescaleDB, ClickHouse
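The ILP support means you can ingest without a client library: one text line per row over TCP (port 9009 by default). A minimal sketch of building such a line; real clients also handle escaping and string fields, which this skips:

```python
def ilp_line(table, tags, fields, ts_ns):
    """Build one InfluxDB Line Protocol line:
        table,tag1=v1,tag2=v2 field1=1.5,field2=3i timestamp_ns
    Integer fields take an 'i' suffix; the timestamp is nanoseconds.
    No escaping of spaces/commas in names, so keep inputs simple.
    """
    tag_part = ",".join(f"{k}={v}" for k, v in tags.items())
    def fmt(v):
        return f"{v}i" if isinstance(v, int) else str(v)
    field_part = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    return f"{table},{tag_part} {field_part} {ts_ns}"

line = ilp_line("trades",
                {"symbol": "BTC-USD", "side": "buy"},
                {"price": 64250.5, "amount": 3},
                1700000000000000000)
# One line like this per row, newline-terminated, streamed over a socket.
```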
RabbitMQ — The message broker you actually understand on day one
Category: Messaging
## How It Works Internally
Use Cases for RabbitMQ
- Task queues and background job processing
- Request-reply messaging patterns
- Complex message routing (topic, headers, fanout)
- Microservice communication
- Delayed and scheduled message delivery
- Priority queues
Pros of RabbitMQ
- Rich routing with exchanges (direct, topic, fanout, headers)
- Supports multiple protocols (AMQP, MQTT, STOMP)
- Message acknowledgment and dead-letter queues
- Priority queues and TTL support
- Easy to set up and operate for small-medium scale
Cons of RabbitMQ
- Lower throughput than Kafka for streaming workloads
- Messages are deleted after consumption (not replayable)
- Clustering can be complex at large scale
- Memory pressure under high message backlog
- Not designed for event sourcing or log-based systems
When to Use RabbitMQ
- Need flexible message routing patterns
- Task queues with acknowledgment and retries
- Request-reply or RPC patterns over messaging
- Small-to-medium scale with simpler operations
When Not to Use RabbitMQ
- Need event replay or long-term message storage
- Ultra-high throughput streaming (millions/sec)
- Event sourcing or CQRS architectures
- Need strong message ordering across partitions
Alternatives to RabbitMQ
Kafka, Redis
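The routing flexibility lives in binding keys. For topic exchanges, `*` matches exactly one dot-separated word and `#` matches zero or more. A small matcher that mirrors those binding-key semantics (broker-side logic, not the pika client API):

```python
def topic_match(pattern, key):
    """AMQP topic-exchange matching over dot-separated words:
    '*' matches exactly one word, '#' matches zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' can absorb any number of remaining words, including none
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False
    return match(pattern.split("."), key.split("."))
```

A queue bound with `logs.#` sees everything under `logs`, while `logs.*.error` sees only error messages exactly one level deep.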
RAG — Retrieval-Augmented Generation: give your LLM actual facts instead of letting it guess
Category: AI & ML
## Why It Exists
Use Cases for RAG
- Enterprise knowledge base Q&A
- Customer support bots backed by real product docs
- Legal document analysis and contract review
- Medical literature search and clinical decision support
- Code documentation search and developer assistants
- Internal wiki and policy compliance queries
Pros of RAG
- Grounds LLM responses in factual, up-to-date sources
- Cuts hallucinations drastically compared to raw LLM generation
- No fine-tuning needed to add domain-specific knowledge
- Sources are citable, so users can actually verify answers
- You can update the knowledge base without retraining the model
Cons of RAG
- Retrieval quality caps generation quality. Garbage in, garbage out.
- Chunking strategy has a huge impact on relevance, and there is no universal answer
- Latency overhead from embedding, retrieval, and re-ranking adds up
- Evaluation is tricky. You need to measure retrieval precision, context relevance, and answer faithfulness separately.
- Cost scales with corpus size (vector storage, embedding API calls, re-ranking)
When to Use RAG
- Your LLM needs access to private, proprietary, or frequently changing data
- Factual accuracy and source attribution matter
- Fine-tuning is too expensive or the knowledge base changes too often
- Domain-specific Q&A where the LLM's training data falls short
When Not to Use RAG
- Tasks that need pure reasoning or creativity, not factual grounding
- Tiny knowledge bases where the whole corpus fits in the context window
- Real-time applications that cannot tolerate retrieval latency
- Highly structured data that is better served by SQL queries or APIs
Alternatives to RAG
Vector Databases, LangChain, Elasticsearch
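The core loop is embed, retrieve, stuff into the prompt. A toy end-to-end sketch with hand-made 3-d vectors standing in for a real embedding model and vector database:

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def retrieve(query_vec, corpus, k=2):
    """Rank chunks by cosine similarity to the query and keep the top-k."""
    ranked = sorted(corpus, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]

corpus = [
    {"text": "Refunds are processed within 5 business days.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Our office is closed on public holidays.",      "vec": [0.0, 0.2, 0.9]},
    {"text": "Refund requests require an order number.",      "vec": [0.8, 0.3, 0.1]},
]
query = [1.0, 0.2, 0.0]  # pretend embedding of "how do refunds work?"

context = "\n".join(c["text"] for c in retrieve(query, corpus))
prompt = f"Answer using only this context:\n{context}\n\nQ: how do refunds work?"
```

The chunking and re-ranking concerns from the cons list all live inside `retrieve`; the generation step never gets better than what comes out of it.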
Redis — The in-memory store you'll reach for first when latency matters
Category: Caching
Most engineers working on backend systems that need to be fast have already used Redis. It started as a caching layer, but these days it sits at the center of session management, rate limiting, real-time analytics, and a dozen other use cases. The reason it stuck around while other tools came and went is simple: it delivers sub-millisecond latency with data structures that actually match real problems, not just key-value pairs.
Use Cases for Redis
- Session storage
- Rate limiting
- Leaderboards and rankings
- Real-time analytics
- Pub/Sub messaging
- Distributed locks
- Caching hot data
Pros of Redis
- Sub-millisecond latency for reads and writes
- Rich data structures: lists, sets, sorted sets, hashes, streams
- Built-in replication and Lua scripting
- Persistence options (RDB snapshots, AOF)
- Cluster mode for horizontal scaling
Cons of Redis
- RAM-bound, so your entire dataset must fit in memory
- Single-threaded command execution
- Cluster mode adds real operational complexity
- No query language for complex lookups
When to Use Redis
- You need sub-millisecond reads and writes
- Caching hot data in front of a slower database
- Real-time counters, leaderboards, or rate limiters
- Session management across multiple app servers
When Not to Use Redis
- Your dataset is much larger than available RAM
- You need full ACID transactions with joins
- Primary long-term storage for data you cannot lose
- Complex relational queries
Alternatives to Redis
Memcached, DynamoDB, PostgreSQL
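A classic Redis pattern: fixed-window rate limiting with INCR + EXPIRE. The sketch below keeps the same logic but swaps a dict in for Redis so it is self-contained; the equivalent Redis commands are in the comments:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter, the logic usually run against Redis."""
    def __init__(self, limit, window_s):
        self.limit, self.window_s = limit, window_s
        self.counters = {}

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window_s)
        bucket = (key, window)                    # Redis key: f"rl:{key}:{window}"
        count = self.counters.get(bucket, 0) + 1  # Redis: INCR rl:{key}:{window}
        self.counters[bucket] = count             # Redis: EXPIRE on first INCR
        return count <= self.limit

rl = FixedWindowLimiter(limit=3, window_s=60)
results = [rl.allow("user-1", now=t) for t in (0, 1, 2, 3, 61)]
# Fourth call in the same window is rejected; the window at t=61 starts fresh.
```

In Redis the INCR is atomic across all app servers, which is the whole point: the counter lives in one place, not per-process.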
RocksDB — The embedded LSM-tree engine that powers half the databases you already use
Category: Databases
## How It Works Internally
Use Cases for RocksDB
- State backend for stream processors like Flink and Kafka Streams
- Embedded key-value layer inside distributed databases (CockroachDB, TiKV)
- High-write-throughput time-series ingestion
- Metadata storage for distributed file systems
- Persistent local caching when Redis feels like overkill
- State management for stateful microservices
Pros of RocksDB
- Built for fast storage (SSD/NVMe) with write throughput that is hard to beat
- Embeddable. Runs in-process, so no network hop, no serialization tax
- Extremely tunable compaction, compression, and memory settings
- Incremental checkpointing through hard links makes snapshots nearly free
- Column families let you logically separate data inside one instance
Cons of RocksDB
- Not a standalone database. You need to build access patterns on top of it
- Read amplification is real. The LSM-tree structure means checking multiple levels
- Tuning is its own discipline. Dozens of knobs, and they interact in ways that surprise you
- Write amplification from compaction can hit 10-30x in the worst case
- Space amplification during compaction means you need to provision extra disk headroom
When to Use RocksDB
- You are building a system that needs an embedded storage engine (stream processor, distributed DB, etc.)
- Write throughput is your primary bottleneck
- Your data fits on a single node's local disk
- You need fast point lookups and range scans over sorted keys
When Not to Use RocksDB
- You want a standalone database with SQL or some query language
- You need distributed transactions across multiple nodes
- Your workload is read-heavy with random access patterns (a B-tree will likely serve you better)
- Your team has no experience tuning LSM-tree engines and no appetite to learn
Alternatives to RocksDB
Flink, Kafka Streams, CockroachDB
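The read amplification mentioned above falls straight out of the LSM design. A toy sketch of the write and read paths: memtable first, then immutable sorted runs checked newest-to-oldest. Nothing close to RocksDB's real engine, but the shape is the same:

```python
class TinyLSM:
    """Toy LSM tree: writes land in an in-memory memtable, which flushes
    to immutable sorted runs ("SSTables"). Reads check the memtable, then
    every run from newest to oldest: that is read amplification."""
    def __init__(self, memtable_limit=2):
        self.memtable, self.sstables = {}, []
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(sorted(self.memtable.items()))  # flush
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):   # newest run wins
            for k, v in run:
                if k == key:
                    return v
        return None

db = TinyLSM()
db.put("a", 1); db.put("b", 2)   # flush -> run 1
db.put("a", 9); db.put("c", 3)   # flush -> run 2 shadows the old "a"
```

Compaction (absent here) merges runs to cap that read cost, which is exactly where the write and space amplification in the cons list come from.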
ScyllaDB — A shard-per-core NoSQL database built for low latency at serious scale
Category: Databases
## How It Works Internally
Use Cases for ScyllaDB
- High-throughput notification storage
- Real-time user profile and preference stores
- IoT and time-series ingestion
- Ad tech bidding and event logging
- Session and state management
- Write-heavy workloads where you need predictable latency
Pros of ScyllaDB
- Shard-per-core architecture removes cross-CPU contention entirely
- Drop-in Cassandra CQL compatibility (drivers, tools, data model all work)
- Written in C++ on the Seastar framework. No GC pauses. Period.
- Automatic workload-aware scheduling (reads vs compaction vs streaming)
- Speculative execution that genuinely cuts tail latency
Cons of ScyllaDB
- Same query-driven data modeling constraints as Cassandra
- Smaller community and ecosystem than Cassandra
- Enterprise features (CDC, LDAP, encryption at rest) sit behind a paywall
- Fewer managed service options compared to DynamoDB or Cassandra on Astra
- Lightweight transactions (LWT) are noticeably slower than regular writes because of Paxos
When to Use ScyllaDB
- You want the Cassandra data model but 2-10x lower p99 latency
- GC pauses in your JVM-based database are causing tail-latency spikes
- Your workload exceeds 100K ops/sec per node and you want fewer nodes overall
- You are running latency-sensitive reads alongside compaction-heavy tables
When Not to Use ScyllaDB
- Your dataset fits comfortably in a single relational database
- You need complex joins or ad-hoc analytical queries
- Your team has zero experience with Cassandra data modeling
- You need a fully managed serverless setup with zero ops
Alternatives to ScyllaDB
Cassandra, DynamoDB, Redis
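Shard-per-core means a partition key deterministically maps to one core, so the hot path needs no cross-CPU locks. A conceptual sketch of that routing; real ScyllaDB uses murmur3 tokens and vnode-based ownership, and `hash()` here is just a stand-in:

```python
def route(partition_key, num_nodes, cores_per_node):
    """Toy shard-per-core routing: key -> token -> node -> core.
    The same key always lands on the same core, so that core owns the
    partition's data exclusively and never takes a cross-CPU lock."""
    token = hash(partition_key) & 0xFFFFFFFF   # toy 32-bit token
    node = token % num_nodes
    shard = (token // num_nodes) % cores_per_node
    return node, shard

n1, s1 = route("user:42", num_nodes=3, cores_per_node=8)
n2, s2 = route("user:42", num_nodes=3, cores_per_node=8)
# Same key, same node, same core, every time (within one process).
```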
SeaweedFS — Fast, simple distributed storage that just works for small files
Category: Storage
## Why It Exists
Use Cases for SeaweedFS
- High-volume small file storage (images, thumbnails, log files)
- CDN origin storage where most objects are under 10MB
- Lightweight object storage for startups and mid-size teams
- S3 gateway for applications that need basic S3 compatibility
- File storage backend for web applications
- Blob storage for microservices that don't need S3-scale complexity
Pros of SeaweedFS
- Extremely fast for small files. Optimized for the use case where most objects are under 10MB.
- Simple architecture. Master + Volume servers. Easy to understand and deploy.
- Low metadata overhead per object. File ID encodes volume and offset directly.
- Built-in Reed-Solomon erasure coding for volume-level durability
- Filer component provides directory semantics and S3 API on top of the blob store
- Written in Go. Single binary per component. Minimal dependencies.
Cons of SeaweedFS
- Central master server is a coordination bottleneck. All volume assignments go through it.
- Master is a single point of failure (though it can be replicated with Raft, it's still the critical path)
- S3 compatibility is partial. Some advanced S3 features (object lock, complex lifecycle rules) are missing or incomplete.
- Not designed for exabyte scale. Works well up to low petabytes.
- Erasure coding is per-volume (groups of objects), not per-object. A volume failure recovery reconstructs the entire volume.
- Smaller community and ecosystem compared to MinIO or Ceph. Fewer production references.
When to Use SeaweedFS
- Primary workload is lots of small files (images, thumbnails, logs)
- Want something simpler than Ceph with less operational overhead
- Data fits in terabytes to low petabytes
- Team is small and needs a storage system they can understand end to end
When Not to Use SeaweedFS
- Need exabyte scale or rack-aware placement (use Ceph)
- Require full S3 API compatibility (use MinIO)
- Workload is dominated by large objects (>1GB). SeaweedFS's advantage is small files.
- Cannot tolerate a central master in the critical path
- Need enterprise support and a large community
Alternatives to SeaweedFS
MinIO, Ceph, RocksDB
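The low metadata overhead comes from the file ID itself: a fid like `3,01637037d6` names the volume and the needle inside it, so there is no per-object metadata database to consult. A minimal parser, with the read path sketched in comments (URLs follow SeaweedFS's lookup API, simplified):

```python
def parse_fid(fid):
    """Split a SeaweedFS file ID like '3,01637037d6' into its volume ID
    and hex needle ID (key + cookie). The volume ID is all a client needs
    to ask the master which volume server holds the blob."""
    volume_part, needle_hex = fid.split(",", 1)
    return int(volume_part), needle_hex

volume_id, needle = parse_fid("3,01637037d6")
# Read path, roughly:
#   GET http://<master>/dir/lookup?volumeId=3     -> volume server address
#   GET http://<volume-server>/3,01637037d6       -> the blob itself
```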
Google Cloud Spanner — The database that uses atomic clocks to solve distributed consistency
Category: Databases
Most distributed databases force a choice: strong consistency or horizontal scale. Spanner is the rare system that actually delivers both, and the trick behind it is wonderfully weird. Google put atomic clocks and GPS receivers in every data center, built a time API around them, and used bounded clock uncertainty to make globally consistent transactions possible. It has been running internally at Google since before it became a cloud product in 2017, powering AdWords, Google Play, and Photos at scale.
Use Cases for Google Cloud Spanner
- Global financial systems where inconsistency means real money lost
- Multi-region SaaS platforms that need strong consistency, not eventual
- Gaming backends with global leaderboards that must be accurate in real time
- Inventory systems spread across continents
- Ad platforms crunching billions of events per day
- Government and healthcare systems with strict data integrity rules
Pros of Google Cloud Spanner
- External consistency (the strongest guarantee you can get). Linearizable reads and writes, globally.
- Fully managed. You do not deal with replication, sharding, or failover at all.
- Automatic split-based sharding with zero manual partitioning
- 99.999% SLA for multi-region setups (under 5.3 minutes of downtime per year)
- Real SQL support with schemas, secondary indexes, and interleaved tables
Cons of Google Cloud Spanner
- Hard vendor lock-in to Google Cloud. No on-prem, no multi-cloud.
- Expensive. Minimum $0.90/hour per node (~$650/month) before storage and network.
- Custom SQL dialect. The PostgreSQL interface exists but has real limitations.
- Write latency goes up for multi-region instances because Paxos has to cross continents
- No stored procedures, no triggers, no user-defined functions
When to Use Google Cloud Spanner
- You are already on Google Cloud and need globally consistent transactions
- Regulations or business rules require the absolute strongest consistency guarantees
- You need a managed database that scales from 1 node to thousands without rearchitecting
- Your workload needs multi-region writes with automatic conflict resolution
When Not to Use Google Cloud Spanner
- You need multi-cloud or on-prem deployment flexibility
- You are budget-constrained. Spanner's minimum cost is overkill for smaller workloads.
- You depend on the full PostgreSQL extension ecosystem or stored procedures
- Your workload is read-heavy and you can tolerate eventual consistency (much cheaper options exist)
Alternatives to Google Cloud Spanner
CockroachDB, DynamoDB, PostgreSQL
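The atomic-clock trick in miniature: TrueTime returns an interval, not an instant, and commit-wait is what makes timestamps meaningful across the planet. A simulation with a fake clock; the 7 ms uncertainty is the ballpark Google has published:

```python
class ToyTrueTime:
    """Toy TrueTime: now() returns [earliest, latest], an interval whose
    width is the clock uncertainty epsilon. Driven by a fake clock."""
    def __init__(self, epsilon):
        self.epsilon, self.clock = epsilon, 0.0

    def now(self):
        return self.clock - self.epsilon, self.clock + self.epsilon

    def advance(self, dt):
        self.clock += dt

def commit_wait(tt):
    """Spanner's rule: pick commit timestamp s = TT.now().latest, then
    wait until TT.now().earliest > s before acknowledging. After the
    wait, no node anywhere can read a clock value below s, which is
    what external consistency rests on."""
    _, s = tt.now()
    waited = 0.0
    while tt.now()[0] <= s:      # spin until the earliest bound passes s
        tt.advance(0.001)
        waited += 0.001
    return s, waited

tt = ToyTrueTime(epsilon=0.007)  # 7 ms uncertainty
s, waited = commit_wait(tt)      # waits roughly 2 * epsilon
```

The commit-wait cost is why tight clock uncertainty matters: the smaller epsilon is, the less every write has to stall.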
Spark — The Swiss Army knife of distributed data processing, for better or worse
Category: Stream Processing
## Why It Exists
Use Cases for Spark
- Large-scale batch ETL
- Machine learning pipelines (MLlib)
- Interactive SQL analytics (Spark SQL)
- Graph processing (GraphX)
- Micro-batch stream processing (Structured Streaming)
- Data lake processing
Pros of Spark
- One API covers batch, streaming, SQL, and ML instead of stitching four systems together
- In-memory processing makes iterative workloads dramatically faster than MapReduce ever was
- Massive community means most problems already have a Stack Overflow answer
- Pick your language: Scala, Python, Java, R, or SQL
- Catalyst optimizer does genuinely smart things with your SQL queries
Cons of Spark
- Micro-batch streaming adds seconds of latency, not milliseconds. If you need real-time, look elsewhere.
- Memory hungry. Budget for it or watch your executors die.
- Tuning Spark well is practically a full-time job
- Overkill for anything that fits on a single machine
- Shuffle operations will punish you if you are not careful
When to Use Spark
- Large-scale batch data processing
- You want one platform for batch, streaming, and ML instead of maintaining three
- Interactive SQL queries on datasets too big for a single database
- Your team already knows the JVM or Python ecosystem
When Not to Use Spark
- You need true sub-second stream processing (use Flink instead)
- Your ETL fits comfortably on one machine. Just use pandas or DuckDB.
- Real-time event processing where latency actually matters
- OLTP workloads. Spark is not a database.
Alternatives to Spark
Flink, Kafka Streams, ClickHouse
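`reduceByKey` is the mental model for most Spark jobs, and also where the shuffle pain above comes from: every value for a key has to reach one place before it can be folded. The primitive in plain Python, with word count on top:

```python
from itertools import groupby

def reduce_by_key(pairs, fn):
    """Spark's reduceByKey in plain Python: sort by key (in Spark this
    sort-and-move step is the shuffle, sending all values for one key to
    one executor), group, then fold each group's values with fn."""
    out = []
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        values = [v for _, v in group]
        acc = values[0]
        for v in values[1:]:
            acc = fn(acc, v)
        out.append((key, acc))
    return out

# Classic word count: flatMap -> map to (word, 1) -> reduceByKey(add)
lines = ["to be or not", "to be"]
pairs = [(w, 1) for line in lines for w in line.split()]
counts = dict(reduce_by_key(pairs, lambda a, b: a + b))
```

On a cluster the `sorted()` call becomes a network-wide repartition of the data, which is why careless shuffles dominate Spark job cost.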
Grafana Tempo — S3-native distributed tracing with no index to maintain and TraceQL for structural queries
Category: Observability
## Why It Exists
Use Cases for Grafana Tempo
- Distributed tracing across microservice architectures at scale
- S3-native trace storage with automatic tiered lifecycle
- TraceQL queries for attribute-based and structural trace analysis
- Grafana-native observability workflows with metric-to-trace correlation
- High-volume trace ingestion with tail-based sampling
- Cost-effective long-term trace retention via S3 storage classes
- Service dependency map generation from trace data
- Incident investigation with exemplar-linked trace lookups
Pros of Grafana Tempo
- No index to maintain. Bloom filters for trace ID lookup, Parquet columnar format for attribute search. Zero index management overhead.
- S3-native storage means trace cost scales with object storage pricing, not compute. Petabytes of traces for a fraction of Elasticsearch cost.
- TraceQL is the only trace query language with structural operators (find traces where span A is a parent of span B with duration > 2s)
- Native Grafana integration. Tempo datasource is built-in. Trace waterfall, service maps, and exemplar links work out of the box.
- Automatic tiered storage via S3 lifecycle policies. Hot to Glacier with zero application-level logic.
- Stateless query layer scales horizontally. Add querier pods to handle more concurrent queries without touching storage.
- Apache Parquet columnar format enables column pruning — queries that filter on one attribute skip all other columns
Cons of Grafana Tempo
- S3 query latency is higher than local disk. Non-cached attribute searches can take 2-10 seconds depending on block count.
- Uses 4.26 GiB RAM at 10K spans/sec vs VictoriaTraces' 1.15 GiB. Ingesters buffer spans in memory before flushing to S3.
- Compactor is critical infrastructure. If it falls behind, block count grows, bloom filters fragment, and query latency degrades.
- No local-disk-only option. S3 or compatible object storage is required. Cannot run in air-gapped environments without MinIO or similar.
- Attribute-based queries (not by trace ID) can be slow on large time ranges because there is no inverted index — Tempo must scan Parquet blocks.
- Bloom filter false positives increase with block count. Well-compacted blocks are essential for query performance.
When to Use Grafana Tempo
- S3-native storage with automatic lifecycle tiering is a requirement
- TraceQL queries for structural trace analysis are needed
- Already running Grafana and want native datasource integration
- Trace volume is high and object storage economics make more sense than local disk provisioning
- Team prefers operational simplicity over resource efficiency (stateless query, no local state to manage)
When Not to Use Grafana Tempo
- Air-gapped or no-external-dependency environments (VictoriaTraces uses local disk only)
- Resource efficiency is the top priority (VictoriaTraces uses 3.7x less RAM)
- Already running VictoriaMetrics + VictoriaLogs and want the same operational model for traces
- Need sub-second attribute search on large time ranges (Elasticsearch-backed Jaeger has inverted indexes)
- Budget does not allow S3 costs for trace storage
Alternatives to Grafana Tempo
VictoriaTraces, Grafana, Kafka, VictoriaMetrics, Prometheus
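What a structural TraceQL query buys you, in miniature. A query like `{ span.name = "checkout" } > { span.name = "charge" && duration > 2s }` selects traces where the second span is a direct child of the first. The same predicate evaluated over toy span dicts; real Tempo evaluates it over Parquet blocks in S3:

```python
def find_traces(traces, parent_name, child_name, min_child_ms):
    """Return IDs of traces containing a span named child_name that is a
    direct child of a span named parent_name and slower than min_child_ms."""
    hits = []
    for trace in traces:
        by_id = {s["id"]: s for s in trace["spans"]}
        for s in trace["spans"]:
            parent = by_id.get(s.get("parent"))
            if (parent and parent["name"] == parent_name
                    and s["name"] == child_name
                    and s["duration_ms"] > min_child_ms):
                hits.append(trace["trace_id"])
                break
    return hits

traces = [
    {"trace_id": "t1", "spans": [
        {"id": "a", "name": "checkout", "duration_ms": 3000, "parent": None},
        {"id": "b", "name": "charge",   "duration_ms": 2500, "parent": "a"}]},
    {"trace_id": "t2", "spans": [
        {"id": "a", "name": "checkout", "duration_ms": 900, "parent": None},
        {"id": "b", "name": "charge",   "duration_ms": 400, "parent": "a"}]},
]
slow = find_traces(traces, "checkout", "charge", min_child_ms=2000)
```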
TiDB — MySQL-compatible distributed database that actually handles both OLTP and OLAP
Category: Databases
Most teams hit the same wall eventually: the MySQL instance is maxed out on writes, the analytics pipeline is a fragile mess of ETL jobs copying data into a warehouse, and half the reports are hours stale. TiDB exists to solve that specific pain. It is a distributed database that speaks MySQL protocol and can serve both transactional and analytical queries from the same cluster. That sounds like marketing, but it actually works in practice, with tradeoffs worth understanding before committing.
Use Cases for TiDB
- Running analytics directly on transactional data without an ETL pipeline
- Scaling beyond a single MySQL instance while keeping SQL compatibility
- Mixed OLTP/OLAP workloads that currently need two separate databases
- Multi-tenant SaaS platforms hitting MySQL write limits
- Financial systems that need both fast transactions and real-time reporting
- High-volume e-commerce order management with analytics
Pros of TiDB
- Speaks the MySQL wire protocol, so most MySQL drivers, ORMs, and tools just work
- TiFlash columnar replicas let you run analytics without hurting OLTP performance
- Scales horizontally with automatic Region splitting and rebalancing
- Strong consistency through Raft consensus on every data Region
- Open source with a real community and commercial backing from PingCAP
Cons of TiDB
- Write latency is higher than single-node MySQL because of Raft consensus round-trips
- Not every MySQL feature is supported (stored procedures and triggers are limited)
- Three component types to operate (TiDB, TiKV, PD), which adds real operational overhead
- Cross-Region transactions pay for multi-Raft-group coordination
- TiFlash replication doubles your storage cost since it keeps a columnar copy of row data
When to Use TiDB
- You have outgrown a single MySQL instance and need horizontal scale with SQL
- You want real-time analytics on live transactional data, no ETL
- You need distributed ACID transactions with MySQL compatibility
- You are running separate OLTP and OLAP systems and want to merge them
When Not to Use TiDB
- Your workload fits comfortably on a single MySQL or PostgreSQL instance
- You depend heavily on stored procedures and triggers
- You need sub-millisecond latency where single-node databases are just faster
- Your team has never operated a distributed database cluster
Alternatives to TiDB
CockroachDB, PostgreSQL, RocksDB
TiKV — Distributed key-value store that delivers transactions without giving up range scans
Category: Databases
## How It Works Internally
Use Cases for TiKV
- Storage engine for TiDB (distributed MySQL-compatible database)
- Metadata store for large-scale infrastructure
- Ordered key-value layer with distributed transactions
- Backend for systems that need both point lookups and range scans at scale
- Replacement for single-node key-value stores that hit scaling limits
Pros of TiKV
- Distributed ACID transactions with snapshot isolation across shards
- Sorted keys with efficient range scans, not just point lookups
- Raft consensus per region gives strong consistency without a single leader bottleneck
- Automatic region splitting and merging as data grows or shrinks
- Coprocessor pushes computation to storage nodes, cutting network round-trips
- Built on RocksDB. Battle-tested LSM-tree performance underneath
Cons of TiKV
- Operational complexity. You're running Placement Driver (PD) + TiKV nodes + monitoring
- Snapshot isolation, not serializable. Phantom reads are possible in edge cases
- Write latency depends on Raft replication. Cross-AZ deployments add 5-15ms
- Region splitting can cause brief latency spikes during transitions
- Smaller ecosystem than etcd or Redis. Fewer client libraries and community resources
When to Use TiKV
- Need ordered key-value storage with distributed transactions
- Outgrowing a single-node store and need horizontal scaling
- Running TiDB and want its native storage engine
- Workload needs both fast point reads and efficient range scans
When Not to Use TiKV
- Simple coordination or config storage (etcd is simpler and good enough)
- Pure cache workloads (Redis is faster and more appropriate)
- Data fits on one machine (RocksDB embedded avoids all the distributed overhead)
- Need serializable isolation (FoundationDB is a better fit)
- Team doesn't have distributed systems operational experience
Alternatives to TiKV
RocksDB, etcd, FoundationDB, CockroachDB
Tile38 — The real-time geospatial database built for tracking things that move
Category: Geospatial
## Why This Exists
Use Cases for Tile38
- Real-time fleet and vehicle tracking
- Geo-fencing with instant entry/exit notifications
- Live asset tracking for logistics and supply chain
- Location-based push notifications
- Drone and robot position monitoring
- Real-time delivery driver tracking
Pros of Tile38
- Purpose-built for real-time geospatial data. Updates and queries on moving objects are first-class operations
- Built-in geo-fencing with webhook notifications on enter/exit events
- Redis-compatible protocol makes integration trivial for teams that know Redis
- Supports points, polygons, GeoJSON, and geohashes natively
- In-memory with persistence to disk (AOF), so reads are consistently fast
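At its core, a circular geo-fence check like Tile38's enter/exit detection reduces to a great-circle distance test. A pure-Python haversine sketch of that idea (illustrative only, not Tile38's implementation; the depot coordinates are made up):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in meters."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def in_circular_fence(lat, lon, fence_lat, fence_lon, radius_m):
    return haversine_m(lat, lon, fence_lat, fence_lon) <= radius_m

# A delivery driver relative to a 500 m fence around a depot:
depot = (40.7128, -74.0060)
print(in_circular_fence(40.7130, -74.0062, *depot, 500))  # a few dozen meters away
print(in_circular_fence(40.7500, -74.0060, *depot, 500))  # several kilometers away
```

Tile38's value is doing this continuously against millions of moving objects and firing webhooks on state transitions, rather than the distance math itself.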
Cons of Tile38
- Single-node only. No built-in clustering or sharding
- Dataset must fit in memory. Not suitable for historical data at scale
- Smaller community and ecosystem compared to PostGIS or Redis with its GEO commands
- No SQL interface. Query language is custom (RESP-based commands)
- Limited analytical capabilities. It tracks objects, it does not analyze spatial patterns
When to Use Tile38
- You need to track millions of moving objects with sub-second update latency
- Geo-fencing is a core requirement and you need real-time enter/exit events
- Your workload is dominated by frequent location updates, not complex spatial queries
- You want something simpler than PostGIS for real-time tracking
When Not to Use Tile38
- You need complex spatial operations (polygon intersection, buffering, spatial joins)
- Your data does not fit in memory or you need long-term spatial data storage
- You need SQL and standard GIS tool compatibility
- Analytics and aggregation over spatial data are more important than real-time tracking
Alternatives to Tile38
Redis, PostGIS, Google S2 Geometry
TimescaleDB — Full PostgreSQL with time series superpowers bolted in at the storage layer
Category: Time Series
## Why This Exists
Use Cases for TimescaleDB
- IoT and sensor data with complex relational queries
- Financial tick data and market analytics
- Application metrics alongside business data in one database
- Geospatial time series (fleet tracking, weather stations)
- SLA/SLO monitoring with historical trending
- Energy and utilities metering data
Pros of TimescaleDB
- It is PostgreSQL. Full SQL, JOINs, CTEs, window functions, stored procedures, all of it
- Your existing PostgreSQL tooling, ORMs, drivers, and expertise all carry over
- Automatic time-based partitioning via hypertables is genuinely painless
- Continuous aggregates give you materialized rollups that refresh incrementally
- Native compression achieves 90-95% reduction on typical time series data
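The continuous-aggregate idea is a rollup over fixed time buckets. A plain-Python sketch of what TimescaleDB's `time_bucket()` does conceptually, using hypothetical sensor readings (TimescaleDB does this in SQL and refreshes the result incrementally):

```python
from collections import defaultdict
from datetime import datetime, timedelta

def time_bucket(width: timedelta, ts: datetime) -> datetime:
    """Floor a timestamp to the start of its bucket."""
    epoch = datetime(1970, 1, 1)
    seconds = int((ts - epoch).total_seconds())
    bucket = seconds - seconds % int(width.total_seconds())
    return epoch + timedelta(seconds=bucket)

readings = [  # (timestamp, temperature) from a hypothetical sensor
    (datetime(2024, 1, 1, 10, 4), 20.0),
    (datetime(2024, 1, 1, 10, 11), 22.0),
    (datetime(2024, 1, 1, 10, 16), 24.0),
]

sums = defaultdict(lambda: [0.0, 0])
for ts, temp in readings:
    b = time_bucket(timedelta(minutes=15), ts)
    sums[b][0] += temp
    sums[b][1] += 1

rollup = {b: total / n for b, (total, n) in sums.items()}
print(rollup)  # two 15-minute buckets: 10:00 averages 21.0, 10:15 averages 24.0
```

In TimescaleDB the equivalent is a `SELECT time_bucket('15 minutes', ts), avg(temp) ... GROUP BY 1` materialized as a continuous aggregate, so dashboards read precomputed rows instead of rescanning raw data.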
Cons of TimescaleDB
- Single-node performance ceiling for writes compared to purpose-built TSDBs
- Multi-node (distributed hypertables) adds operational complexity
- Compression is columnar but queries on compressed chunks are slower than on uncompressed data
- Still carries the PostgreSQL overhead for MVCC, vacuuming, and WAL management
- License changed from Apache 2.0 to a more restrictive Timescale License for some features
When to Use TimescaleDB
- You already run PostgreSQL and want to add time series without another database
- Your queries need JOINs between time series data and relational tables
- Your team knows SQL well and does not want to learn a new query language
- You need ACID transactions on time series data
When Not to Use TimescaleDB
- Pure metrics collection at massive scale (Prometheus or VictoriaMetrics is simpler)
- You need 1M+ inserts/sec sustained on a single node (look at QuestDB or InfluxDB)
- Your workload is append-only with no relational queries (a purpose-built TSDB will be leaner)
- You want a fully managed experience without any PostgreSQL administration
Alternatives to TimescaleDB
PostgreSQL, InfluxDB, Prometheus
TipTap — The headless rich-text editor built on ProseMirror with first-class Yjs support
Category: Collaboration
ProseMirror is the most powerful editor framework ever built. It is also genuinely hard to use. The API is low-level, the documentation assumes familiarity with document schemas and state machines, and building a basic bold button requires understanding transactions, steps, and marks. TipTap wraps ProseMirror with a developer-friendly API while keeping the full power accessible when needed. Think of it as React to ProseMirror's DOM.
Use Cases for TipTap
- Rich text editors in SaaS products
- CMS content editing interfaces
- Collaborative document editing (Google Docs-style)
- Notion-style block editors with custom node types
- Comments and annotations with inline formatting
Pros of TipTap
- Headless architecture. Zero UI opinions. Full control over every pixel of the editor
- ProseMirror backbone provides a battle-tested document model with schema enforcement
- First-class Yjs collaboration via @tiptap/extension-collaboration
- Extension system for custom nodes, marks, and keyboard shortcuts
- Schema-enforced structure prevents invalid document states at the transaction level
Cons of TipTap
- ProseMirror learning curve is steep. The mental model is not intuitive at first
- Bundle size grows with every extension added. Tree-shaking helps but only so much
- Documentation has gaps for advanced ProseMirror interop and custom node specs
- Mobile support is limited. Touch selection and virtual keyboards are pain points
When to Use TipTap
- Building a rich text editor with full control over the UI
- Need real-time collaboration with multiple simultaneous editors
- Want schema-enforced content structure (headings can only contain inline content, etc.)
- Building a Notion-style block editor with custom block types
When Not to Use TipTap
- Plain text input where a textarea is sufficient
- Static content display with no editing needed
- WYSIWYG email composers (email HTML is a different beast entirely)
- Teams that need a working editor in a day. The learning curve is real
Alternatives to TipTap
Yjs, Hocuspocus, Elasticsearch
Vector Databases — Specialized databases for storing, indexing, and querying high-dimensional vector embeddings at scale
Category: AI & ML
## Why It Exists
Use Cases for Vector Databases
- Semantic search and retrieval for RAG pipelines
- Recommendation engines built on embedding similarity
- Image and video similarity search
- Anomaly detection across high-dimensional feature spaces
- Deduplication and near-duplicate detection
- Multimodal search across text, images, and audio
Pros of Vector Databases
- Sub-millisecond similarity search across billions of vectors
- ANN indexes dramatically outperform brute-force scanning
- Native metadata filtering combined with vector search
- Managed cloud options reduce operational overhead
- Support for hybrid search (dense + sparse vectors)
Cons of Vector Databases
- Results are approximate. ANN algorithms trade recall for speed.
- Index build times grow steeply with dataset size (hours at billion-vector scale)
- Memory-hungry. HNSW indexes hold graph structures in RAM.
- No standard query language. Every database ships its own API.
- Changing your embedding model means re-indexing your entire corpus
When to Use Vector Databases
- Semantic similarity search where keyword matching falls short
- RAG applications that need fast retrieval over large document sets
- Recommendation systems built on learned embeddings
- Any application where items are represented as dense vectors
When Not to Use Vector Databases
- Exact match or structured queries (use a relational database)
- Small datasets under 10K vectors (brute-force cosine similarity works fine)
- Frequently swapping embedding models (re-indexing cost adds up fast)
- Workloads that require ACID transactions on vector data
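The small-dataset point above is easy to demonstrate: exact brute-force cosine similarity needs no index at all and returns perfect recall. A self-contained sketch with tiny made-up 3-dimensional "embeddings" (real embeddings are hundreds to thousands of dimensions, but the mechanics are identical):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2):
    """Exact nearest neighbors by cosine similarity: O(N*d), fine for small N."""
    scored = [(cosine(query, vec), name) for name, vec in corpus.items()]
    return sorted(scored, reverse=True)[:k]

corpus = {  # hypothetical embeddings
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
results = top_k([0.85, 0.15, 0.05], corpus)
print([name for _, name in results])  # animals rank above the vehicle
```

A vector database earns its keep only when N grows large enough that this linear scan per query becomes the bottleneck; below roughly 10K vectors the scan is fast, exact, and operationally free.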
Alternatives to Vector Databases
RAG, Elasticsearch, PostgreSQL
VictoriaLogs — The log database that indexes everything without the Elasticsearch bill
Category: Observability
## Why It Exists
Use Cases for VictoriaLogs
- Centralized log aggregation for Kubernetes clusters
- High-volume operational logging (millions of log lines/sec)
- Full-text log search across high-cardinality fields
- Incident investigation with trace-to-log correlation
- Multi-tenant log storage for platform teams
- Replacing Elasticsearch for log storage at lower cost
- Complementing VictoriaMetrics metrics with the same operational model
Pros of VictoriaLogs
- Indexes all log fields automatically via bloom filters. No schema design or label planning required.
- Uses 30x less RAM and 15x less disk than Elasticsearch on the same workload
- 3x higher ingestion throughput and 87% less RAM than Grafana Loki in benchmarks
- Single binary, zero-config deployment. Same operational simplicity as VictoriaMetrics.
- LogsQL query language with full-text search built in, not bolted on
- Handles high-cardinality log fields natively without performance degradation
- Cluster mode with vlinsert/vlselect/vlstorage for linear horizontal scaling
Cons of VictoriaLogs
- Younger project than Elasticsearch and Loki, smaller community and ecosystem
- No native S3/object storage backend. Uses local disk. Cold tier requires vmbackup to S3.
- Grafana integration via datasource plugin, not native like Loki
- LogsQL is powerful but less widely known than Elasticsearch KQL or Loki LogQL
- Fewer third-party integrations and managed service options compared to Elasticsearch
When to Use VictoriaLogs
- High-volume log ingestion where Elasticsearch cost is prohibitive
- Log queries frequently search content, not just labels (where Loki brute-force grep is slow)
- Already running VictoriaMetrics and want the same operational model for logs
- Need full-text search on logs without the operational overhead of Elasticsearch
- High-cardinality log fields (trace_id, user_id, request_id) are common query targets
When Not to Use VictoriaLogs
- Need S3-native storage with automatic lifecycle tiering (Loki is simpler here)
- Log analytics and dashboards built from log content (Elasticsearch excels at aggregation queries)
- Small team that wants a fully managed solution (Elasticsearch Service, Grafana Cloud Loki)
- Tight Grafana ecosystem integration is a hard requirement (Loki has native support)
Alternatives to VictoriaLogs
VictoriaMetrics, Grafana, Elasticsearch, Kafka, Prometheus
VictoriaMetrics — The Prometheus long-term storage that does more with less hardware
Category: Time Series
## Why Teams Switch To This
Use Cases for VictoriaMetrics
- Long-term Prometheus metrics storage
- Multi-tenant monitoring for platform teams
- High-cardinality metrics that would OOM Prometheus
- Cost-effective replacement for Thanos or Cortex
- Centralized metrics aggregation across Kubernetes clusters
- IoT and sensor data monitoring at scale
- 500M+ metrics/sec ingestion at Datadog-tier scale (cluster mode)
- Gorilla-compressed hot storage tier in tiered TSDB architectures
Pros of VictoriaMetrics
- Dramatically lower resource usage than Prometheus for the same workload, often 5-10x less RAM
- Handles high cardinality far better than Prometheus without falling over
- Drop-in Prometheus replacement with full PromQL compatibility plus MetricsQL extensions
- Compression is exceptional, typically 0.4-0.8 bytes per data point
- Single binary deployment. Download, run, done. Operationally simple.
- Shared-nothing cluster architecture where vminsert, vmselect, and vmstorage scale independently with zero coordination overhead
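Compression ratios like 0.4-0.8 bytes per point come from Gorilla-style techniques such as delta-of-delta encoding of timestamps: regularly scraped series collapse to runs of zeros that a bit-level codec stores in about one bit each. A sketch of the encoding idea only (not VictoriaMetrics' actual codec, which adds variable-bit packing and value compression on top):

```python
def delta_of_delta(timestamps: list[int]) -> list[int]:
    """First value, then first delta, then deltas-of-deltas.
    A fixed scrape interval turns into zeros after the second element."""
    out = [timestamps[0]]
    prev_delta = None
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        out.append(delta if prev_delta is None else delta - prev_delta)
        prev_delta = delta
    return out

# A 15-second scrape interval with one late scrape at the end:
ts = [1000, 1015, 1030, 1045, 1061]
print(delta_of_delta(ts))  # [1000, 15, 0, 0, 1]
```

The zeros are what compress so well; jitter shows up only as small residuals, which still pack into a few bits.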
Cons of VictoriaMetrics
- Smaller community than Prometheus and Thanos, though growing fast
- Cluster version has a different architecture than single-node (not just 'add more nodes')
- MetricsQL extensions are useful but create vendor lock-in if you rely on them heavily
- Alerting is a separate binary (vmalert), not embedded in the storage engine. Extra component to deploy and configure.
- Documentation is functional but not as polished as Prometheus ecosystem docs
When to Use VictoriaMetrics
- You need long-term Prometheus storage without the complexity of Thanos
- Prometheus is running out of memory or disk and you need a more efficient backend
- Multi-cluster or multi-tenant monitoring where each team pushes metrics centrally
- High cardinality workloads that crash Prometheus
When Not to Use VictoriaMetrics
- You want alerting embedded in the same process as storage (Prometheus bundles this; VictoriaMetrics requires the separate vmalert binary)
- Your metrics volume fits comfortably in a single Prometheus instance with local storage
- You need strong ecosystem support and battle-tested integrations right now
- Log aggregation or distributed tracing (this is a metrics-only database)
Alternatives to VictoriaMetrics
Prometheus, Grafana, InfluxDB, TimescaleDB, Thanos
VictoriaTraces — Distributed tracing built on the same engine as VictoriaLogs, without the external storage tax
Category: Observability
## Why It Exists
Use Cases for VictoriaTraces
- Distributed tracing across microservice architectures
- Request latency analysis and bottleneck identification
- End-to-end request flow visualization
- Trace-based alerting via vmalert integration
- Full-stack observability alongside VictoriaMetrics and VictoriaLogs
- High-volume trace ingestion from OpenTelemetry instrumented services
- Service dependency graph generation
- Root cause analysis during production incidents
Pros of VictoriaTraces
- Uses 3.7x less RAM and 2.6x less CPU than Grafana Tempo in benchmarks
- No external storage dependencies. No Elasticsearch, Cassandra, or S3 required for production.
- Same operational model as VictoriaMetrics and VictoriaLogs (vtinsert/vtselect/vtstorage)
- OTLP-native ingestion with custom HTTP/2 server (25% smaller binary, 36% less CPU than gRPC-Go)
- Bloom filter indexed search on all span fields without manual index configuration
- Cluster mode with linear horizontal scaling. Each component scales independently.
- Compatible with Grafana (via Jaeger datasource) and Jaeger UI for visualization
Cons of VictoriaTraces
- Younger project than Jaeger and Grafana Tempo, smaller community
- No S3-native storage. Uses local disk. Cold/archive requires vmbackup to S3.
- No TraceQL equivalent. Querying uses Jaeger APIs and LogsQL, which is less expressive for trace-specific patterns
- Grafana integration via Jaeger datasource plugin, not a native datasource
- Tempo datasource API support is still experimental
- Fewer managed service options and third-party integrations compared to Tempo and Jaeger
When to Use VictoriaTraces
- Already running VictoriaMetrics and VictoriaLogs and want the same operational model for traces
- Resource efficiency is a priority and the infrastructure budget is tight
- Trace volume is high and Tempo's RAM usage (4x higher) is a concern
- Running with zero external storage dependencies is a hard requirement (air-gapped environments, strict compliance)

- Need trace storage with bloom filter indexed search on all span attributes
When Not to Use VictoriaTraces
- Need TraceQL for advanced trace queries (Grafana Tempo is the only option for this)
- S3-native storage with automatic lifecycle tiering is required
- Deep Grafana ecosystem integration is a priority (Tempo has native support)
- Need a mature, battle-tested tracing backend with a large community (Jaeger, Tempo)
- Team is already invested in the Grafana LGTM stack (Loki, Grafana, Tempo, Mimir)
Alternatives to VictoriaTraces
VictoriaMetrics, VictoriaLogs, Grafana, Kafka, Prometheus
vLLM — The go-to LLM inference engine, built around PagedAttention for squeezing real throughput out of your GPUs
Category: AI & ML
## Why It Exists
Use Cases for vLLM
- Running open-weight LLMs in production without paying per-token API fees
- Serving Llama, Mistral, Qwen, and similar models at scale
- Batch processing large queues of LLM requests overnight or on demand
- Hosting multiple models behind one unified API gateway
- Cutting inference costs once your request volume justifies owning GPUs
- Low-latency inference for real-time chat or autocomplete features
Pros of vLLM
- 2-24x higher throughput than naive HuggingFace inference, thanks to PagedAttention
- OpenAI-compatible API, so you can swap it in without changing your client code
- Supports 50+ model architectures: Llama, Mistral, Qwen, Gemma, and more
- Continuous batching keeps GPU utilization high across concurrent requests
- Large, active open-source community shipping features fast
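PagedAttention manages the KV cache in fixed-size blocks, much like virtual-memory pages: memory is reserved one block at a time as a sequence grows, instead of one contiguous buffer sized for the maximum possible length. A toy allocator showing that idea (illustrative only, not vLLM's code; class and parameter names are made up):

```python
class PagedKVCache:
    """Toy block allocator for a paged KV cache."""
    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free = list(range(num_blocks))       # free block ids
        self.tables: dict[str, list[int]] = {}    # sequence id -> its block ids
        self.lengths: dict[str, int] = {}         # tokens generated per sequence

    def append_token(self, seq: str) -> None:
        n = self.lengths.get(seq, 0)
        if n % self.block_size == 0:              # current block full: grab a new one
            if not self.free:
                raise MemoryError("KV cache exhausted")
            self.tables.setdefault(seq, []).append(self.free.pop())
        self.lengths[seq] = n + 1

    def release(self, seq: str) -> None:
        """Sequence finished: its blocks return to the shared pool."""
        self.free.extend(self.tables.pop(seq, []))
        self.lengths.pop(seq, None)

cache = PagedKVCache(num_blocks=4, block_size=16)
for _ in range(20):           # a 20-token sequence needs ceil(20/16) = 2 blocks
    cache.append_token("req-1")
print(len(cache.tables["req-1"]), len(cache.free))  # 2 blocks used, 2 still free
```

Because unused capacity stays in the shared pool, many concurrent sequences can be batched onto one GPU, which is where the throughput gains over naive per-request buffers come from.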
Cons of vLLM
- NVIDIA GPUs (CUDA) are basically required. AMD and Intel support is still rough
- Large models (70B+) need careful memory tuning or you will hit OOM errors
- You will spend time tuning config flags to match your specific hardware
- Inference only. No built-in fine-tuning support
- Multi-node tensor parallelism adds real networking headaches
When to Use vLLM
- You are self-hosting open-weight LLMs and need production-grade serving
- The model's native serving framework cannot keep up with your throughput needs
- You want an OpenAI-compatible endpoint for internal services
- Your request volume is high enough that owning GPUs beats paying per-token API costs
When Not to Use vLLM
- Low request volume where API providers are cheaper than renting or owning GPUs
- You need proprietary models like GPT-4 or Claude (those are not self-hostable)
- You do not have GPU infrastructure and are not ready to set it up
- You are rapidly experimenting across many models and setup time matters more than throughput
Alternatives to vLLM
Hugging Face, RAG, LangChain
Yjs — The CRDT library that makes real-time collaboration work without a central authority
Category: Collaboration
Anyone who has tried to build real-time collaboration into an app knows the hard part is not the WebSocket. The hard part is what happens when two users edit the same paragraph at the same time while one of them is on a flaky train Wi-Fi. Yjs solves that problem. It is a CRDT (Conflict-Free Replicated Data Type) library that makes every character in a document a unique, mergeable item. Two users can edit the same word simultaneously, go offline, come back 20 minutes later, and their changes merge without losing anything. No server needed to decide who wins.
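Yjs's YATA algorithm is involved, but the core CRDT property — replicas converge just by merging state, with no coordinator deciding who wins — can be shown with a much simpler CRDT, a grow-only counter. This is purely illustrative and is not Yjs's text type:

```python
class GCounter:
    """Grow-only counter CRDT: each replica increments only its own slot;
    merge takes the per-replica max, so merges commute and always converge."""
    def __init__(self, replica: str):
        self.replica = replica
        self.counts: dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        self.counts[self.replica] = self.counts.get(self.replica, 0) + n

    def merge(self, other: "GCounter") -> None:
        for rep, n in other.counts.items():
            self.counts[rep] = max(self.counts.get(rep, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())

a, b = GCounter("alice"), GCounter("bob")
a.increment(3)           # alice edits offline
b.increment(2)           # bob edits offline on the train
a.merge(b); b.merge(a)   # they sync later, in either order
print(a.value(), b.value())  # 5 5
```

Yjs applies the same merge-and-converge discipline to sequences of characters, which is a much harder problem, but the guarantee is the same: no update is lost and no server arbitrates.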
Use Cases for Yjs
- Collaborative document editing (Google Docs-style)
- Shared whiteboards and design tools
- Multiplayer coding environments
- Offline-first apps that sync when back online
- Peer-to-peer collaboration without a server
Pros of Yjs
- Works offline. Edits sync automatically when connectivity returns
- No central server required for correctness (peer-to-peer capable)
- Sub-millisecond local operations. Edits feel instant
- Language-agnostic binary sync protocol. Clients in JS, Rust, Swift, Kotlin
- Mature ecosystem: bindings for ProseMirror, Monaco, CodeMirror, Quill
Cons of Yjs
- Document size grows over time from internal metadata (tombstones, clock vectors)
- No built-in access control or permissions. A server layer handles that
- Debugging merge conflicts requires understanding YATA internals
- Garbage collection of deleted content is limited by design
When to Use Yjs
- Building a collaborative editor (text, code, diagrams)
- Offline-first applications where users edit without connectivity
- Peer-to-peer sync without a central server dependency
- Need sub-100ms sync latency for real-time cursor presence
When Not to Use Yjs
- Simple form-based collaboration where last-writer-wins is fine
- Data with strict invariants (inventory counts, bank balances)
- Very large documents (100MB+) where metadata overhead matters
- Teams unfamiliar with CRDTs who need something simpler
Alternatives to Yjs
TipTap, Hocuspocus, Redis
ZooKeeper — The coordination service that half the big data world still depends on
Category: Coordination
## Why It Exists
Use Cases for ZooKeeper
- Leader election
- Distributed configuration management
- Service discovery
- Distributed locking
- Cluster membership tracking
- Barrier synchronization
Pros of ZooKeeper
- Strong consistency through ZAB consensus protocol
- Ordered, sequential operations
- Ephemeral nodes for failure detection
- Watch mechanism for real-time notifications
- Battle-tested in production (Kafka, HBase, Solr)
Cons of ZooKeeper
- Not designed for large data storage
- Write throughput is limited (leader bottleneck)
- Operational complexity with quorum management
- Java-based, which means GC pause headaches
- Being replaced by newer alternatives (etcd, KRaft)
When to Use ZooKeeper
- Need leader election for distributed services
- Existing Hadoop/Kafka ecosystem dependency
- Distributed configuration that must be consistent
- Service coordination requiring strong ordering
When Not to Use ZooKeeper
- General-purpose data storage
- High write throughput requirements
- Greenfield projects (consider etcd instead)
- Simple service discovery (consider Consul)
Alternatives to ZooKeeper
etcd, Kafka