VictoriaLogs
The log database that indexes everything without the Elasticsearch bill
Why It Exists
Log management at scale has historically been a choice between two painful options. Elasticsearch gives full-text search and powerful aggregation, but the infrastructure cost is brutal. Running an Elasticsearch cluster large enough to ingest terabytes of logs per day requires dozens of nodes with NVMe storage, careful shard management, ILM policies, and a dedicated team to keep it running. Grafana Loki takes the opposite approach: index almost nothing, store chunks on S3, and brute-force grep when a query needs content search. This is cheap but slow. A needle-in-haystack query across a broad label set can take 10+ seconds or time out entirely.
VictoriaLogs occupies the middle ground. It indexes every log field automatically using bloom filters, which are dramatically smaller than Elasticsearch's inverted indexes (2 bytes per unique token vs 8+ bytes per token occurrence). This means index-assisted search at a fraction of the cost. It is not as fast as Elasticsearch for complex aggregation queries, but for the query patterns that matter during incident response (find this error message, find logs with this trace_id, find all timeout errors in the payment service), it matches or beats Elasticsearch query speed while using 30x less RAM.
The project comes from the same team that built VictoriaMetrics, and it shows. The single-binary deployment, the zero-config schemaless approach, the cluster architecture (vlinsert/vlselect/vlstorage), and the operational patterns are virtually identical. Teams already running VictoriaMetrics can add VictoriaLogs without learning a new operational model.
How the Storage Engine Works
When a log line arrives at vlstorage, it goes through several processing steps before hitting disk.
Tokenization. Each log field (message, service, level, trace_id, any structured field) gets split into tokens (words). The tokenizer handles common log patterns: JSON field names, IP addresses, UUIDs, stack trace lines. Each unique token is recorded.
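To make this concrete, here is a minimal Go sketch of word-level tokenization. It only illustrates splitting a field value into unique alphanumeric tokens; the real VictoriaLogs tokenizer is internal and handles more cases, so treat the splitting rules below as assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize splits a log field value into unique word tokens, roughly in the
// spirit described above: alphanumeric runs (words, hex IDs, numbers) become
// tokens, everything else acts as a separator. The real tokenizer is internal
// to VictoriaLogs and handles more cases (JSON keys, IPs, UUIDs).
func tokenize(fieldValue string) []string {
	words := strings.FieldsFunc(fieldValue, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	seen := make(map[string]struct{}, len(words))
	tokens := make([]string, 0, len(words))
	for _, w := range words {
		w = strings.ToLower(w)
		if _, ok := seen[w]; ok {
			continue
		}
		seen[w] = struct{}{}
		tokens = append(tokens, w)
	}
	return tokens
}

func main() {
	fmt.Println(tokenize(`level=error msg="connection refused" trace_id=4bf92f3577b34da6`))
	// [level error msg connection refused trace id 4bf92f3577b34da6]
}
```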
Bloom filter construction. For each data block (a batch of log lines within a time partition), VictoriaLogs builds a bloom filter from all tokens found in that block's log fields. A bloom filter is a probabilistic data structure that answers "is this token in this block?" with either "definitely not" or "possibly yes." False positives are possible but false negatives are not. At 2 bytes per unique token, a block with 10,000 unique tokens needs only 20 KB of bloom filter. Compare this to Elasticsearch, where an inverted index for the same block stores every token occurrence with position data, easily reaching megabytes.
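The "definitely not / possibly yes" behavior can be illustrated with a toy per-block bloom filter in Go. The bit sizing (roughly 16 bits, i.e. 2 bytes, per expected token), the hash scheme, and the type names are illustrative assumptions, not the actual VictoriaLogs implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// blockBloom is a toy per-block bloom filter: k hash functions set k bits per
// token at write time. Lookups answer "definitely not present" or "possibly present".
type blockBloom struct {
	bits []uint64 // bit set
	m    uint64   // number of bits
	k    uint64   // hash functions per token
}

func newBlockBloom(expectedTokens int) *blockBloom {
	// ~16 bits (2 bytes) per expected unique token, mirroring the budget above.
	m := uint64(expectedTokens * 16)
	return &blockBloom{bits: make([]uint64, (m+63)/64), m: m, k: 4}
}

func (b *blockBloom) hash(token string, i uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(token))
	// Derive k positions from one FNV hash; real implementations use sturdier schemes.
	return (h.Sum64() + i*0x9e3779b97f4a7c15) % b.m
}

func (b *blockBloom) add(token string) {
	for i := uint64(0); i < b.k; i++ {
		pos := b.hash(token, i)
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// maybeContains returns false only if the token is definitely absent from the block.
func (b *blockBloom) maybeContains(token string) bool {
	for i := uint64(0); i < b.k; i++ {
		pos := b.hash(token, i)
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false // definitely not in this block
		}
	}
	return true // possibly in this block (false positives allowed)
}

func main() {
	bf := newBlockBloom(10000)
	bf.add("4bf92f3577b34da6a3ce929d0e0e4736")
	fmt.Println(bf.maybeContains("4bf92f3577b34da6a3ce929d0e0e4736")) // true
	fmt.Println(bf.maybeContains("deadbeef"))                         // almost certainly false
}
```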
Columnar compression. Log fields are stored column-by-column, not row-by-row. All service values for a block are stored together, all message values together, all timestamps together. This enables much better compression because similar values cluster together. Timestamps compress extremely well (delta-of-delta encoding, similar to VictoriaMetrics). Repeated field values (the same service name appearing thousands of times) compress to nearly nothing.
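As an illustration of why timestamps compress so well, here is a plain delta-of-delta pass in Go. The actual on-disk encoding layers varint/bit packing and block compression on top; this sketch only shows the second-order deltas that make regularly spaced timestamps collapse to runs of zeros.

```go
package main

import "fmt"

// deltaOfDelta converts monotonically increasing timestamps into second-order
// deltas. Regularly spaced timestamps collapse to long runs of zeros, which
// the downstream block compressor shrinks to almost nothing.
func deltaOfDelta(ts []int64) []int64 {
	if len(ts) == 0 {
		return nil
	}
	out := make([]int64, len(ts))
	out[0] = ts[0] // first value kept as-is
	prevDelta := int64(0)
	for i := 1; i < len(ts); i++ {
		delta := ts[i] - ts[i-1]
		out[i] = delta - prevDelta
		prevDelta = delta
	}
	return out
}

func main() {
	// Log timestamps arriving roughly once per second (Unix millis).
	ts := []int64{1700000000000, 1700000001000, 1700000002000, 1700000003007, 1700000004014}
	fmt.Println(deltaOfDelta(ts)) // [1700000000000 1000 0 7 0]
}
```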
Daily partitions. Data is organized into partitions based on calendar day (UTC). Each partition contains all logs for that day. When the retention period expires, the entire partition directory gets deleted. No scanning, no garbage collection, no tombstones. This makes retention cleanup instant regardless of data volume.
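A sketch of what partition-level retention looks like, assuming a directory-per-day layout named YYYYMMDD (the exact on-disk layout is an assumption; the point is that cleanup is a directory delete, independent of how many rows the partition holds):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// dropExpiredPartitions removes daily partition directories (assumed to be
// named YYYYMMDD) that fall entirely outside the retention window. Because
// the unit of deletion is a whole directory, cleanup cost does not depend on
// how many log entries the partition contains.
func dropExpiredPartitions(dataDir string, retention time.Duration, now time.Time) error {
	cutoff := now.UTC().Add(-retention)
	entries, err := os.ReadDir(dataDir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		day, err := time.Parse("20060102", e.Name())
		if err != nil {
			continue // not a daily partition directory
		}
		// The partition covers [day, day+24h); drop it only once the whole day has expired.
		if day.Add(24 * time.Hour).Before(cutoff) {
			fmt.Println("dropping expired partition:", e.Name())
			if err := os.RemoveAll(filepath.Join(dataDir, e.Name())); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	_ = dropExpiredPartitions("/var/lib/victorialogs/partitions", 30*24*time.Hour, time.Now())
}
```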
Query execution. When a LogsQL query arrives, vlselect parses it and fans it out to all vlstorage nodes. On each node, the query executor:
- Identifies which daily partitions fall within the time range
- For each partition, checks bloom filters to find blocks that might contain the search terms
- Reads only the matching blocks from disk
- Applies the full query filter to the actual log lines
- Returns results to vlselect for merging
The bloom filter step is critical. It skips the vast majority of blocks without reading them from disk. For a query searching for a specific trace_id (a unique string), bloom filters eliminate 99%+ of blocks immediately. This is why VictoriaLogs achieves sub-second needle-in-haystack queries while Loki, which lacks this index, takes 12 seconds on the same dataset.
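Schematically, the per-node execution path boils down to a bloom-filter-guarded scan over candidate blocks. The types below (tokenFilter, block, partition) are invented for illustration; only blocks whose filters cannot rule out every search token are read from disk and filtered for real.

```go
// Schematic per-vlstorage query path; all type names are illustrative.
package vlquery

// tokenFilter answers "possibly contains" (true) or "definitely not" (false),
// as a per-block bloom filter does.
type tokenFilter interface {
	maybeContains(token string) bool
}

type block struct {
	bloom tokenFilter // built at write time from the block's tokens
	path  string      // on-disk location of the block's column data
}

type partition struct {
	day    string // e.g. "20240115" (daily UTC partition)
	blocks []block
}

// candidateBlocks keeps only the blocks that might contain every search token.
// Everything else is skipped without any disk I/O; the surviving blocks are
// then read and the full query filter is applied to their log lines.
func candidateBlocks(parts []partition, searchTokens []string) []block {
	var out []block
	for _, p := range parts {
		for _, b := range p.blocks {
			keep := true
			for _, tok := range searchTokens {
				if !b.bloom.maybeContains(tok) {
					keep = false // bloom filter says "definitely not in this block"
					break
				}
			}
			if keep {
				out = append(out, b)
			}
		}
	}
	return out
}
```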
Single-Node vs Cluster
Single-node is a single binary that handles ingestion, storage, and queries. For most deployments up to a few hundred GB of logs per day, this is sufficient. A single node on modern hardware (8 cores, 32GB RAM, NVMe) handles ingestion rates that would require a multi-node Loki or Elasticsearch deployment. The schemaless, zero-config nature means deployment is download, configure retention, start.
Cluster mode splits into three components, mirroring VictoriaMetrics cluster architecture:
- vlinsert: Accepts incoming logs via multiple protocols (Elasticsearch bulk API, Loki push API, Syslog, OTLP, Journald, and others). Distributes logs across vlstorage nodes. Stateless and horizontally scalable behind a load balancer.
- vlstorage: Stores log data on local disk. Handles tokenization, bloom filter construction, compression, and local query execution. Each node is essentially an independent single-node VictoriaLogs. This shared-nothing design means adding a vlstorage node adds capacity linearly with no rebalancing.
- vlselect: Accepts queries, fans them out to all vlstorage nodes, merges results. Stateless. Scale based on query concurrency.
The ingestion protocol flexibility is worth highlighting. vlinsert accepts the Elasticsearch bulk API, which means existing Filebeat, Fluentd, Logstash, and Vector deployments can point at VictoriaLogs with a URL change. It also accepts Loki push API (Promtail, Grafana Agent), Syslog, OTLP (OpenTelemetry), and Journald. This makes migration from either Elasticsearch or Loki straightforward.
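As a concrete example of how thin the ingestion contract is, a shipper can POST newline-delimited JSON straight to vlinsert. The sketch below assumes the commonly documented defaults (port 9428, the /insert/jsonline path, the _msg/_time special fields, and the _stream_fields parameter); verify them against your VictoriaLogs version.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

func main() {
	// One JSON object per line; _msg holds the log message, _time the timestamp.
	line := fmt.Sprintf(
		`{"_time":%q,"_msg":"connection refused","service":"checkout","level":"error","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736"}`+"\n",
		time.Now().UTC().Format(time.RFC3339Nano),
	)

	// _stream_fields tells vlinsert which fields identify the log stream.
	url := "http://vlinsert:9428/insert/jsonline?_stream_fields=service,level"
	resp, err := http.Post(url, "application/stream+json", bytes.NewBufferString(line))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("ingest status:", resp.Status)
}
```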
LogsQL
LogsQL is VictoriaLogs' query language. It combines structured field filtering with full-text search in a single syntax.
Basic structure: A query is a chain of filters separated by AND. Each filter can be a field match, a time range, a text search, or a regex.
# Find error logs from checkout service in the last 5 minutes
_time:5m AND service:checkout AND level:error
# Full-text search across all fields
_time:1h AND "connection refused"
# Regex match on a field
_time:1d AND message:~"timeout after [0-9]+ ms"
# Search by trace ID (bloom filter makes this fast)
trace_id:"4bf92f3577b34da6a3ce929d0e0e4736"
# Combine structured and unstructured search
_time:30m AND service:payment AND _stream:{env="prod"} AND "stripe" AND status_code:5*
Pipes transform results after filtering:
# Count errors by service
_time:1h AND level:error | stats by(service) count() as error_count
# Find top 10 slowest requests
_time:1h AND _stream:{level="info"} | unpack_json | filter duration:>2000 | sort by(duration) desc | limit 10
# Unique values of a field
_time:1d AND service:checkout | uniq by(status_code)
The key difference from Loki's LogQL: in LogsQL, full-text search is a first-class operation backed by bloom filter indexes. In LogQL, content filtering (|= "pattern") happens after label selection and is a brute-force line-by-line scan. At small scale this difference is invisible. At the 500 GB benchmark scale with broad queries, it determines whether a needle-in-haystack search takes 900ms or 12 seconds.
Production Architecture
In production on Kubernetes, deploy vlinsert and vlselect as Deployments (stateless, scale horizontally) and vlstorage as a StatefulSet (stateful, needs persistent volumes).
Storage planning: vlstorage nodes need local NVMe or SSD. Plan capacity for the full hot+warm retention window. If retaining all logs for 30 days at 260 TB/day compressed, the total vlstorage capacity needs to be roughly 7.8 PB across all nodes. Use instance types with large local NVMe (AWS i3en, i4i, or d3en for cost-optimized HDD warm tier).
Ingestion protocols: Configure vlinsert to accept OTLP (from the OTel Collector fleet), Elasticsearch bulk API (for legacy Filebeat/Logstash pipelines), and Loki push API (for Promtail agents). This multi-protocol support means migration from Elasticsearch or Loki can happen incrementally, one team at a time.
Grafana integration: Install the VictoriaLogs Grafana datasource plugin. This enables log panels in Grafana dashboards alongside VictoriaMetrics metric panels and Grafana Tempo trace panels. The plugin supports LogsQL syntax with autocomplete. For trace-to-log correlation, configure the datasource to link on trace_id field.
Backups and cold storage: Use vmbackup to snapshot vlstorage data to S3. Schedule daily backups for the cold tier. For compliance/audit retention beyond the vlstorage retention window, vmbackup snapshots on S3 Glacier Deep Archive provide long-term storage at ~$1/TB/month. These snapshots can be restored to a temporary vlstorage instance for querying when needed.
Monitoring VictoriaLogs itself: Key metrics to track on vlstorage nodes:
- vl_rows_ingested_total: ingestion rate. Drops indicate upstream collection issues.
- vl_disk_usage_bytes: local disk consumption. Alert before disk fills.
- vl_slow_queries_total: queries exceeding the slow threshold. Often caused by overly broad time ranges or missing filters.
- vl_bloom_filter_cache_hit_rate: bloom filter cache effectiveness. Low hit rates mean too many unique tokens are evicting cached filters.
Decision Criteria
| Criteria | VictoriaLogs | Grafana Loki | Elasticsearch |
|---|---|---|---|
| Indexing | Bloom filters on all tokens (2 bytes/token) | Labels only. Content not indexed. | Full inverted index on all tokens |
| Content search speed | Fast (index-assisted bloom filter skip) | Slow (brute-force grep after label filter) | Fastest (inverted index direct lookup) |
| Ingestion throughput | 66 MB/s on 4 vCPU benchmark | 20 MB/s on same hardware | High but CPU-intensive (index building) |
| RAM usage | 1.3 GiB (500 GB benchmark) | 6-7 GiB (same benchmark) | 30-40 GiB (same scale, inverted index) |
| Storage backend | Local NVMe/SSD | S3-native (object storage) | Local NVMe/SSD |
| Storage efficiency | 40% less disk than Loki | S3 chunk storage with label index | Largest footprint (inverted index overhead) |
| Schema requirements | None (schemaless, auto-indexes all) | Label design required upfront | Index templates, mappings |
| High cardinality | Handles natively | Struggles with high-cardinality labels | Handles well, at high resource cost |
| Query language | LogsQL (SQL-like + full-text) | LogQL (grep-like, label-first) | KQL / Lucene (powerful, complex) |
| Tiered storage | Local disk + vmbackup to S3 | S3-native with lifecycle policies | Local disk + snapshot to S3 |
| Cluster scaling | Linear (vlinsert/vlselect/vlstorage) | Horizontal (distributor/ingester/querier) | Shard-based (rebalancing on scale) |
| Operational complexity | Low (single binary, zero-config) | Medium (S3 simplifies storage) | High (shard mgmt, ILM, JVM tuning) |
| Grafana integration | Datasource plugin | Native | Datasource plugin |
| Best for | High-volume logs with content search | Label-filtered logs, cost-sensitive | Full-text analytics, log dashboards |
Capacity Planning
Single-node sizing guidelines:
| Daily Log Volume | Retention | RAM | CPU | Local Disk |
|---|---|---|---|---|
| 50 GB/day | 30 days | 4 GB | 4 cores | 1 TB SSD |
| 500 GB/day | 30 days | 8 GB | 8 cores | 10 TB NVMe |
| 5 TB/day | 15 days | 32 GB | 16 cores | 50 TB NVMe |
For the cluster version, each component has different bottlenecks:
| Component | Instance | Throughput | Bottleneck |
|---|---|---|---|
| vlinsert | c6g.xlarge (4 vCPU, 8GB) | ~66 MB/s ingestion (benchmark) | CPU (parsing, protocol handling) |
| vlstorage | i3en.6xlarge (24 vCPU, 96GB, 2x7.5TB NVMe) | ~200 MB/s ingestion + storage | Disk I/O (bloom filter writes, block scans) |
| vlselect | r6g.xlarge (4 vCPU, 32GB) | ~50 concurrent queries | Memory (result merging from fan-out) |
Disk I/O matters. Bloom filter lookups and block scans are sequential reads. NVMe provides the throughput needed for fast queries. HDD can work for warm data where query latency requirements are relaxed (2-5s acceptable), but hot data should always be on NVMe/SSD.
Network bandwidth: vlselect fans queries to all vlstorage nodes. High-concurrency query workloads generate significant internal network traffic. Budget 10 Gbps between vlselect and vlstorage nodes. If co-located in the same availability zone, this is rarely a bottleneck. Cross-AZ query fan-out adds latency and costs.
Failure Scenarios
Scenario 1: vlstorage Disk Full from Uncontrolled Log Volume
Trigger: A deployment bug in a service causes it to emit 10,000 log lines per second instead of the normal 100. With 500 instances of that service, the service's aggregate log volume spikes 100x. vlstorage local NVMe fills up before the retention period cleans old partitions.
Impact: vlstorage stops accepting new data. vlinsert buffers briefly, then starts returning errors to Kafka consumers. Consumer lag grows. If the burst continues, vlinsert memory usage climbs from buffering. Other services' logs are also affected because all data flows through the same vlstorage nodes.
Detection: Monitor vl_disk_usage_bytes and vl_rows_ingested_total per source service. Alert when disk usage exceeds 80% or when a single service's ingestion rate spikes more than 5x above its 24-hour average.
Recovery: Identify the noisy service from ingestion metrics. Add a rate limit at the OTel Collector or vlinsert level to cap that service's ingestion. If disk is critically full, manually trigger early partition cleanup by temporarily reducing -retentionPeriod. Once the spike subsides, restore the original retention and investigate the root cause in the offending service. Long-term: implement per-tenant or per-service ingestion rate limits at vlinsert to prevent one service from consuming disproportionate storage.
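The per-service rate limiting mentioned as the long-term fix can be a simple keyed token bucket in front of the push path (for example in a shipper or proxy ahead of vlinsert). The limits, burst size, and placement below are assumptions to adapt; the sketch uses golang.org/x/time/rate.

```go
package main

import (
	"sync"

	"golang.org/x/time/rate"
)

// serviceLimiter caps log lines per second per service so one misbehaving
// deployment cannot fill vlstorage disks for everyone else.
type serviceLimiter struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
	perSvc   rate.Limit // allowed lines/sec per service
	burst    int        // short bursts tolerated before dropping
}

func newServiceLimiter(perSvc rate.Limit, burst int) *serviceLimiter {
	return &serviceLimiter{limiters: make(map[string]*rate.Limiter), perSvc: perSvc, burst: burst}
}

// allow reports whether one more log line from the given service may be
// forwarded now; callers can drop or divert rejected lines to a dead-letter topic.
func (s *serviceLimiter) allow(service string) bool {
	s.mu.Lock()
	l, ok := s.limiters[service]
	if !ok {
		l = rate.NewLimiter(s.perSvc, s.burst)
		s.limiters[service] = l
	}
	s.mu.Unlock()
	return l.Allow()
}
```

A reasonable starting limit is a small multiple of each service's observed steady-state rate, in line with the 5x spike threshold used for alerting above.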
Scenario 2: Bloom Filter Cache Thrashing Under High-Cardinality Queries
Trigger: An operator runs a broad query with a time range of 7 days across all services, searching for a specific UUID. The query touches every daily partition on every vlstorage node. Each node loads bloom filters for all partitions into the cache, evicting bloom filters for the current day's data.
Impact: Subsequent queries for recent data (the most common query pattern) miss the bloom filter cache and hit disk for bloom filter reads. Query latency for all concurrent users spikes from sub-second to 3-5 seconds. If multiple operators run similar broad queries simultaneously, the cache thrash compounds and query performance degrades across the cluster.
Detection: Monitor vl_bloom_filter_cache_hit_rate. Alert when it drops below 80%. Track vl_slow_queries_total for correlated spikes.
Recovery: The cache recovers automatically as normal query patterns re-warm it. For immediate relief, cancel the broad queries. Long-term: set query time range limits on vlselect to prevent queries spanning more than 3 days by default. Increase the bloom filter cache size on vlstorage nodes if memory permits. For forensic queries that genuinely need long time ranges, direct them to a dedicated vlselect instance that does not share cache pressure with operational queries.
Scenario 3: vlinsert Overwhelmed During Kafka Consumer Lag Recovery
Trigger: A vlinsert restart or brief network issue causes Kafka consumer lag to accumulate on the logs-raw topic. When vlinsert reconnects, it attempts to consume the backlog at maximum speed. The ingestion burst exceeds vlstorage's sustainable write rate.
Impact: vlstorage write latency increases as the local NVMe I/O saturates from the burst. vlinsert buffers in memory while waiting for vlstorage acknowledgments. If the backlog is large enough, vlinsert hits its memory limit and starts dropping data or crashing with OOM.
Detection: Monitor Kafka consumer lag on the logs-raw topic. Alert when lag exceeds 5 minutes of data. Track vlinsert memory usage and vlstorage write latency.
Recovery: Configure vlinsert with a rate limiter that caps ingestion at 120% of the steady-state rate. This ensures the backlog gets consumed (faster than production rate) without overwhelming vlstorage. The Kafka consumer lag will drain gradually over hours instead of minutes, but no data is lost and operational queries remain fast throughout. Set Kafka retention on logs-raw to at least 24 hours so the backlog survives even extended vlinsert outages.
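A minimal sketch of the backlog drain described above: a single global token bucket set to 1.2x the steady-state rate in front of the writes toward vlstorage. The steady-state figure, batch handling, and function shape are assumptions; the sketch uses golang.org/x/time/rate.

```go
package main

import (
	"context"

	"golang.org/x/time/rate"
)

// drainBacklog consumes a queued backlog no faster than 1.2x the steady-state
// ingestion rate, so vlstorage never sees more than a modest overload while
// the Kafka lag slowly works itself off.
// next pulls the next batch from the Kafka consumer (ok=false when drained);
// write forwards a batch toward vlstorage.
func drainBacklog(ctx context.Context, steadyStateLinesPerSec float64, next func() ([]string, bool), write func([]string) error) error {
	// Burst equals one second of steady-state traffic; batches must not exceed it.
	limiter := rate.NewLimiter(rate.Limit(steadyStateLinesPerSec*1.2), int(steadyStateLinesPerSec))
	for {
		batch, ok := next()
		if !ok {
			return nil // backlog drained
		}
		// Block until the limiter grants tokens for every line in the batch.
		if err := limiter.WaitN(ctx, len(batch)); err != nil {
			return err
		}
		if err := write(batch); err != nil {
			return err
		}
	}
}
```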
Pros
- Indexes all log fields automatically via bloom filters. No schema design or label planning required.
- Uses 30x less RAM and 15x less disk than Elasticsearch on the same workload
- 3x higher ingestion throughput and 87% less RAM than Grafana Loki in benchmarks
- Single binary, zero-config deployment. Same operational simplicity as VictoriaMetrics.
- LogsQL query language with full-text search built in, not bolted on
- Handles high-cardinality log fields natively without performance degradation
- Cluster mode with vlinsert/vlselect/vlstorage for linear horizontal scaling
Cons
- Younger project than Elasticsearch and Loki, smaller community and ecosystem
- No native S3/object storage backend. Uses local disk. Cold tier requires vmbackup to S3.
- Grafana integration via datasource plugin, not native like Loki
- LogsQL is powerful but less widely known than Elasticsearch KQL or Loki LogQL
- Fewer third-party integrations and managed service options compared to Elasticsearch
When to use
- High-volume log ingestion where Elasticsearch cost is prohibitive
- Log queries frequently search content, not just labels (where Loki brute-force grep is slow)
- Already running VictoriaMetrics and want the same operational model for logs
- Need full-text search on logs without the operational overhead of Elasticsearch
- High-cardinality log fields (trace_id, user_id, request_id) are common query targets
When NOT to use
- Need S3-native storage with automatic lifecycle tiering (Loki is simpler here)
- Log analytics and dashboards built from log content (Elasticsearch excels at aggregation queries)
- Small team that wants a fully managed solution (Elasticsearch Service, Grafana Cloud Loki)
- Tight Grafana ecosystem integration is a hard requirement (Loki has native support)
Key Points
- VictoriaLogs splits every log field into words (tokens) and creates bloom filters from them. Each unique token costs roughly 2 bytes of bloom filter storage. When a query searches for a specific string, the bloom filter checks which data blocks might contain it and skips everything else. This is fundamentally different from Loki (which indexes labels only and brute-force scans content) and Elasticsearch (which builds a full inverted index on every token). Bloom filters sit in a sweet spot: much smaller than inverted indexes but still enable index-assisted search.
- Storage uses daily UTC partitions. All logs for a single calendar day go into one partition. Retention is enforced by dropping entire partitions, not by scanning and deleting individual records. This makes retention cleanup fast and predictable, regardless of data volume. The -retentionPeriod flag on vlstorage controls this.
- The cluster architecture mirrors VictoriaMetrics exactly. vlinsert accepts log data and distributes it across vlstorage nodes. vlstorage tokenizes fields, builds bloom filters, compresses data, and writes to local NVMe. vlselect fans queries out to all vlstorage nodes, each checks its bloom filters, scans only matching blocks, and returns results. Each component scales independently.
- LogsQL combines label-based filtering with full-text search in a single query language. A query like _time:5m AND service:checkout AND "connection refused" first narrows by time range and service label (fast index lookup), then searches for the phrase in matching blocks using bloom filters. This avoids the Loki pattern of brute-force grep after label filtering.
- VictoriaLogs is schemaless. There is no index template, no mapping, no label schema to design upfront. Throw any JSON log at it and every field gets indexed automatically. This eliminates the Elasticsearch operational tax of managing index templates, and avoids the Loki constraint of carefully choosing which fields become labels.
- In benchmarks on a 500 GB / 7-day workload (4 vCPU, 8 GiB RAM): VictoriaLogs achieved 3x higher ingestion throughput than Loki (66 MB/s vs 20 MB/s), 94% faster query latency, 87% less RAM (1.3 GiB vs 6-7 GiB), and 40% less disk (318 GiB vs 501 GiB). Needle-in-haystack search: 900ms vs 12s.
Common Mistakes
- ✗ Treating VictoriaLogs like Elasticsearch and designing complex index templates. VictoriaLogs is schemaless. It auto-indexes every field. Defining schemas adds no benefit and wastes setup time.
- ✗ Expecting S3-native storage like Loki or Tempo. VictoriaLogs uses local disk. For cold/archive tiers, set up vmbackup to S3. Plan local NVMe/SSD capacity for the hot and warm retention window.
- ✗ Not separating log streams by retention needs. If ERROR logs need 365-day retention but DEBUG logs only need 7 days, route them to separate vlstorage instances (or use stream-based routing at vlinsert) with different -retentionPeriod values. Storing everything at the longest retention wastes disk.
- ✗ Running vlstorage on HDD instead of SSD. Bloom filter lookups and block scanning are I/O-intensive. HDDs bottleneck query performance. Use NVMe for hot data and SSD for warm data.
- ✗ Ignoring the vlinsert buffer during Kafka consumer lag. If vlinsert cannot keep up with the Kafka ingestion rate, consumer lag grows. Monitor vlinsert ingestion rate and scale horizontally before lag compounds.