Tile38
The real-time geospatial database built for tracking things that move
Why This Exists
Redis has GEO commands (GEOADD, GEOSEARCH, GEODIST), and they work fine for simple use cases. But Redis GEO is built on sorted sets with geohash encoding. It handles points only, no polygons. There are no geo-fences. No enter/exit notifications. No support for GeoJSON or complex regions. For tracking a handful of delivery drivers with simple proximity queries, Redis GEO is enough. For anything more sophisticated, the limits show up fast.
PostGIS handles complex spatial queries beautifully, but it is a disk-based relational database. Updating 50,000 vehicle positions per second means each UPDATE creates a dead tuple that needs vacuuming. The write-heavy, update-heavy nature of real-time tracking works against PostgreSQL's MVCC design. Careful tuning helps, but the mismatch never goes away.
Tile38 fills the gap. It is an in-memory geospatial server built specifically for objects that move. Updates are in-place (no MVCC, no dead tuples, no vacuum). The R-tree index updates on every write. Geo-fences fire webhooks in real time. It speaks the Redis wire protocol so integration is trivial. For the specific use case of "track things that move and react when they enter/leave areas," Tile38 is the right tool.
How It Works
Tile38 organizes data into keys and IDs. A key is like a collection (e.g., "fleet" or "deliveries"). An ID is a unique identifier within that key (e.g., "truck-42"). Each object has a position (point, polygon, or GeoJSON feature) and optional metadata fields.
SET fleet truck-42 FIELD speed 65 FIELD fuel 0.82 POINT 33.5123 -112.2693
This sets the position of truck-42 in the "fleet" key, with speed and fuel level as metadata fields. The R-tree index is updated immediately.
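Because Tile38 speaks the Redis wire protocol, the same command can be sent from any Redis client library. A minimal sketch using redis-py, assuming a local Tile38 server on its default port 9851:

```python
# Minimal sketch: sending the SET command above through redis-py.
# Assumes a local Tile38 server on its default port 9851; any RESP
# client works because Tile38 speaks the Redis wire protocol.
import redis

t38 = redis.Redis(host="localhost", port=9851)

# Position plus metadata fields for truck-42 in the "fleet" key.
t38.execute_command(
    "SET", "fleet", "truck-42",
    "FIELD", "speed", 65,
    "FIELD", "fuel", 0.82,
    "POINT", 33.5123, -112.2693,
)

# Fetch the object back (returned as a GeoJSON point).
print(t38.execute_command("GET", "fleet", "truck-42"))
```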
The R-tree is the primary spatial index. Unlike PostGIS where indexes must be explicitly created, Tile38's R-tree is always active and always up to date. When a new position is SET for an object, the old entry is removed from the tree and the new one is inserted. This is an O(log n) operation.
Queries use the R-tree for fast spatial lookups:
- • NEARBY fleet POINT 33.51 -112.27 5000 - find all objects in "fleet" within 5000 meters of the point
- • WITHIN fleet BOUNDS 33.0 -113.0 34.0 -112.0 - find all objects within a bounding box
- • INTERSECTS fleet OBJECT {"type":"Polygon",...} - find all objects that intersect a GeoJSON polygon
NEARBY returns results ordered by distance from the query point and can include each object's distance in the reply, so no client-side sorting is needed.
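The same queries can be issued over the Redis protocol. A small sketch using redis-py, under the same assumptions as above; the raw replies are printed rather than parsed:

```python
# Sketch: running spatial queries over the Redis protocol.
# Assumes the same local Tile38 instance and redis-py client as above.
import redis

t38 = redis.Redis(host="localhost", port=9851)

# All objects in "fleet" within 5000 meters of a point.
nearby = t38.execute_command(
    "NEARBY", "fleet", "POINT", 33.51, -112.27, 5000
)

# All objects inside a bounding box.
within = t38.execute_command(
    "WITHIN", "fleet", "BOUNDS", 33.0, -113.0, 34.0, -112.0
)

print(nearby)
print(within)
```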
Geo-Fencing Deep Dive
Geo-fencing is where Tile38 really separates itself. Define a fence on a key, and Tile38 continuously evaluates every position update against that fence.
SETHOOK warehouse-entry http://myapp.com/hooks/fence WITHIN fleet FENCE OBJECT {"type":"Polygon",...}
This creates a persistent geo-fence. Whenever any object in "fleet" enters, exits, or crosses the polygon boundary, Tile38 sends a POST request to the webhook URL with a JSON payload containing the object ID, its position, whether it entered or exited, and all metadata fields.
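A minimal receiver sketch for those POSTs, using only the Python standard library. The payload field names read here (id, detect, key) follow Tile38's fence notification format, but treat them as assumptions and verify against a real event before relying on them:

```python
# Sketch: a bare-bones endpoint for Tile38 fence webhooks.
# The payload field names ("id", "detect", "key") are assumptions
# based on Tile38's fence notification format; inspect a real event
# before depending on them.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class FenceHook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length))
        # e.g. "truck-42 enter fleet"
        print(event.get("id"), event.get("detect"), event.get("key"))
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), FenceHook).serve_forever()
```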
Under the hood, Tile38 maintains a list of active fences. On every SET command, it checks whether the object's old position and new position have different fence states. If the object was outside and is now inside, that is an "enter" event. If it was inside and is now outside, that is an "exit" event. If both reported positions are outside but the path between them passed through the fenced area, that is a "cross" event.
The fence check is not brute force. Tile38 uses the R-tree to quickly identify which fences could possibly be affected by a position update (only fences whose bounding box overlaps with the object's movement). For a system with 1000 fences and 100,000 objects, most position updates only need to check 1-2 fences, not all 1000.
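The transition logic itself is easy to picture. A toy sketch (not Tile38's actual implementation) mapping the old and new fence states to an event type; the real server also derives "cross" events by testing the movement path against the fence geometry:

```python
# Toy illustration of the fence-state transition described above;
# not Tile38's implementation. "cross" is omitted because it requires
# testing the path between positions against the fence geometry.
def fence_event(was_inside, is_inside):
    if not was_inside and is_inside:
        return "enter"
    if was_inside and not is_inside:
        return "exit"
    return None  # no boundary change between the two updates

print(fence_event(False, True))   # enter
print(fence_event(True, False))   # exit
```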
Webhook delivery is asynchronous. If the endpoint is slow or down, Tile38 buffers events in memory and retries. MQTT endpoints or gRPC streams are also available instead of HTTP webhooks for lower latency.
Production Architecture
For a fleet tracking system with 100,000 vehicles updating every 5 seconds:
Write throughput: 20,000 SET commands per second. Each SET updates the R-tree index and checks geo-fences. On a 4-core machine, Tile38 handles this comfortably with CPU utilization around 30%.
Memory: 100,000 objects with 5 metadata fields each uses about 100-200 MB. Add another 50 MB for the R-tree index and internal structures. A 4 GB machine has plenty of headroom.
Persistence: AOF writes every SET command to disk. At 20,000 commands/sec with ~100 bytes per command, the AOF grows at about 2 MB/sec or 170 GB/day. Configure AOF rewriting to compact the file periodically (Tile38 replays and rewrites the AOF, keeping only the latest state for each object). After rewriting, the AOF is roughly the size of the in-memory dataset.
Replication: Set up a follower for read scaling and as a warm standby. The follower receives the command stream from the leader and maintains its own R-tree. Route NEARBY and WITHIN queries to the follower to offload the leader. Route SET commands to the leader only.
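A sketch of the wiring, with placeholder hostnames: the follower is pointed at the leader once with the FOLLOW command, after which writes go to the leader and reads to the follower.

```python
# Sketch: leader/follower routing. Hostnames are placeholders.
import redis

leader = redis.Redis(host="tile38-leader", port=9851)
follower = redis.Redis(host="tile38-follower", port=9851)

# One-time setup: tell the follower to replicate from the leader.
follower.execute_command("FOLLOW", "tile38-leader", 9851)

# Writes go to the leader...
leader.execute_command("SET", "fleet", "truck-42", "POINT", 33.5123, -112.2693)

# ...reads go to the follower.
print(follower.execute_command("NEARBY", "fleet", "POINT", 33.51, -112.27, 5000))
```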
Integration: Forward geo-fence events to Kafka or NATS for downstream processing. The webhook endpoint should be a lightweight receiver that publishes to the message queue, not a full processing pipeline. This decouples fence event generation from event processing and provides backpressure handling for free.
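A sketch of that lightweight receiver, assuming Flask, kafka-python, and a broker on localhost:9092; it forwards the raw payload untouched and leaves all processing to downstream consumers:

```python
# Sketch: a thin webhook receiver that only forwards fence events to
# Kafka. Assumes Flask, kafka-python, and a broker on localhost:9092.
from flask import Flask, request
from kafka import KafkaProducer

app = Flask(__name__)
producer = KafkaProducer(bootstrap_servers="localhost:9092")

@app.route("/hooks/fence", methods=["POST"])
def fence():
    # Forward the raw JSON payload; consumers decide what to do with it.
    producer.send("tile38-fence-events", request.get_data())
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)
```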
Capacity Planning
| Objects | Updates/sec | RAM | CPU Cores | AOF Size (daily) |
|---|---|---|---|---|
| 10,000 | 2,000 | 50 MB | 2 | 17 GB |
| 100,000 | 20,000 | 200 MB | 4 | 170 GB |
| 1,000,000 | 200,000 | 2 GB | 8 | 1.7 TB |
| 10,000,000 | 500,000 | 20 GB | 16 | 4.3 TB |
The bottleneck at 10 million objects shifts from CPU to memory and disk I/O. The R-tree at 10 million entries is still fast for queries (20-50 microseconds for NEARBY), but AOF writes at 500K/sec stress the disk. Use NVMe for the AOF at this scale. Also consider increasing the fsync interval from "every second" (default) to "every 5 seconds" if a slightly larger data loss window on crash is tolerable.
Failure Scenarios
Scenario 1: AOF grows unbounded and fills the disk. The team never configured AOF rewriting. After 30 days of tracking 100,000 vehicles, the AOF is 5 TB. The disk fills up and Tile38 stops accepting writes. All tracking goes dark. Detection: monitor disk usage and AOF file size. Alert when disk usage exceeds 70%. Recovery: trigger a manual AOF rewrite with AOFSHRINK. This compacts the file to only contain the current state, typically reducing it from terabytes to megabytes. Prevention: configure automatic AOF rewriting or run AOFSHRINK on a cron schedule.
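A hedged sketch of that detection-and-recovery loop, assuming the AOF lives at /data/appendonly.aof (a placeholder path) and that running AOFSHRINK automatically is acceptable in your environment:

```python
# Sketch: watch disk usage and compact the AOF when it gets tight.
# The AOF path and the 70% threshold are assumptions from the
# scenario above; run this from cron or a small supervisor loop.
import os
import shutil
import redis

AOF_PATH = "/data/appendonly.aof"  # placeholder path
t38 = redis.Redis(host="localhost", port=9851)

usage = shutil.disk_usage(os.path.dirname(AOF_PATH))
percent_used = usage.used / usage.total * 100
aof_gb = os.path.getsize(AOF_PATH) / 1e9

if percent_used > 70:
    # AOFSHRINK rewrites the AOF so it only holds the current state.
    t38.execute_command("AOFSHRINK")
    print(f"disk at {percent_used:.0f}%, AOF was {aof_gb:.1f} GB, ran AOFSHRINK")
```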
Scenario 2: Webhook endpoint goes down and fence events pile up. The service receiving geo-fence webhooks crashes. Tile38 buffers pending webhooks in memory. With 50 active fences and 100,000 objects, events accumulate at thousands per second. After a few minutes, the buffer grows large enough to impact Tile38's memory usage and response times. Detection: monitor the webhook delivery queue size (Tile38 exposes this in SERVER output) and track webhook delivery latency. Recovery: restart the webhook endpoint. Tile38 replays buffered events. If the buffer is too large, consider discarding stale events (events older than 60 seconds are probably not actionable anyway). Prevention: use MQTT or gRPC for high-volume fence events instead of HTTP webhooks. They handle backpressure more gracefully.
Pros
- • Purpose-built for real-time geospatial data. Updates and queries on moving objects are first-class operations
- • Built-in geo-fencing with webhook notifications on enter/exit events
- • Redis-compatible protocol makes integration trivial for teams that know Redis
- • Supports points, polygons, GeoJSON, and geohashes natively
- • In-memory with persistence to disk (AOF), so reads are consistently fast
Cons
- • Single-node only. No built-in clustering or sharding
- • Dataset must fit in memory. Not suitable for historical data at scale
- • Smaller community and ecosystem compared to PostGIS or Redis with its GEO commands
- • No SQL interface. Query language is custom (RESP-based commands)
- • Limited analytical capabilities. It tracks objects, it does not analyze spatial patterns
When to use
- • You need to track millions of moving objects with sub-second update latency
- • Geo-fencing is a core requirement and you need real-time enter/exit events
- • Your workload is dominated by frequent location updates, not complex spatial queries
- • You want something simpler than PostGIS for real-time tracking
When NOT to use
- • You need complex spatial operations (polygon intersection, buffering, spatial joins)
- • Your data does not fit in memory or you need long-term spatial data storage
- • You need SQL and standard GIS tool compatibility
- • Analytics and aggregation over spatial data are more important than real-time tracking
Key Points
- •Tile38 uses an in-memory R-tree index that updates on every SET command. When a vehicle sends its new GPS coordinates, the R-tree is updated in microseconds. This makes Tile38 fundamentally different from PostGIS, where frequent updates create dead tuples and require vacuuming. Tile38 is designed for data that changes constantly.
- •Geo-fencing is built into the query layer, not bolted on. Define a fence with the SETHOOK command (webhook delivery) or the SETCHAN command (pub/sub channel), specifying a geographic region (circle, polygon, or bounding box). When any tracked object enters, exits, or crosses the fence boundary, Tile38 fires the notification in real time. No polling, no batch processing, no external service needed.
- •The NEARBY command finds objects within a radius of a point, ordered by distance. WITHIN finds objects inside a polygon or bounding box. INTERSECTS finds objects that overlap a region. These three commands cover 90% of real-time spatial query needs. Each uses the R-tree index, so query cost scales as O(log n + k) and stays fast even with millions of tracked objects.
- •Tile38 speaks the Redis RESP (REdis Serialization Protocol) wire protocol. Any Redis client library can connect. SET fleet truck1 POINT 33.5123 -112.2693 works like a Redis SET command but stores a geospatial point. This makes the learning curve minimal for teams already familiar with Redis.
- •Persistence uses an append-only file (AOF), similar to Redis. Every write command gets appended to the AOF. On restart, Tile38 replays the AOF to rebuild the in-memory state, and the AOFSHRINK command compacts the file down to the current state. The tradeoff is the same as Redis: if the process crashes, writes since the last fsync are lost.
- •Tile38 supports leader-follower replication. The follower connects to the leader and receives a continuous stream of commands. This provides read scaling and a warm standby for failover. It is not automatic failover though. An external tool is needed (like Sentinel-style monitoring) to promote the follower if the leader goes down.
Common Mistakes
- ✗Trying to use Tile38 as a general-purpose spatial database. It does not support polygon intersection between stored objects, spatial joins, or analytical aggregation. For questions like 'which delivery zones overlap with flood zones,' use PostGIS. Tile38 answers 'where is this truck right now' and 'did it just enter this zone.'
- ✗Not setting TTL on objects. If tracked devices go offline (phone battery dies, vehicle breaks down), their last known position stays in the index forever. Over time, dead objects accumulate and waste memory. Use the EX flag (SET fleet truck1 EX 3600 POINT ...) to auto-expire objects that have not been updated in the specified number of seconds.
- ✗Overloading webhooks with too many fence events. In a fleet of 100,000 vehicles with 50 active geo-fences, the system can generate millions of enter/exit events per hour. If the webhook endpoint cannot keep up, events queue up in memory and eventually cause backpressure. Use MQTT or gRPC endpoints for high-throughput fence events, and batch-process rather than handling each event individually.
- ✗Running without persistence and losing everything on restart. By default Tile38 enables AOF persistence, but some teams disable it for performance and forget to re-enable it. Without the AOF, a restart means every tracked object is gone and has to be re-reported. In a fleet tracking scenario, that means 30-60 seconds of no data until all devices check in again.
- ✗Ignoring memory limits. Each tracked object takes roughly 200-500 bytes in memory depending on the amount of metadata (fields). One million objects use about 200-500 MB. Ten million objects need 2-5 GB. Trying to track 100 million objects on a 16 GB machine is going to end badly. Plan memory based on peak object count, not average.