Redis Features — PE Trade-Offs Analysis

Every Redis capability you listed, analyzed at Principal Engineer depth: Pub/Sub, Streams, Task Queues, Atomic Counters, Sorted Sets, Transactions, Lua, Bloom Filters, Geospatial, Vector Sets, HyperLogLog, Caching, plus the four "role" uses (primary NoSQL DB, streaming engine, message broker, cache). The headline you asked for is the "When To Use Which Feature" decision matrix up top. Then the eight mandatory deep tables.

In-Memory Datastore · Data Structures · L6/L7 Depth

As of 2026-05-21 · Redis 8.x (8.0 GA May 2025, modules merged into core, Vector Sets native/beta, tri-licensed AGPLv3 / RSALv2 / SSPLv1). Behavior also applies to Valkey 8.x except Vector Sets and the other merged module types.

PE Verdict — the one thing to internalize

Redis features are not interchangeable; each is a different data structure with a different cost curve, and choosing wrong shows up as either silent memory blowup or a blocked main thread. The recurring trap: people reach for Pub/Sub when they need durable delivery (use Streams), use Lists as a queue when they need consumer groups and acks (use Streams), and treat Redis as a primary database when its async replication and lossy failover make it a fast cache with persistence, not a system of record. Match the structure to the access pattern, and never forget every command runs on one shared thread.

Best default choices

Streams: Durable queues, replay, consumer groups, and at-least-once delivery
Sorted Sets: Leaderboards, priority queues, sliding windows, and rank/range access
Atomic Counters: Rate limits, inventory decrement, IDs, and race-free increments
Caching: Hot reads, TTLs, eviction policy, and backing-store offload

01. Overview — What Redis Is and Why It Exists

Redis (REmote DIctionary Server) is an in-memory data structure server, not a cache, not a database, not a queue, though it is used as all three. Created by Salvatore Sanfilippo in 2009 to solve a specific pain (a real-time web analytics dashboard hitting MySQL too hard), it has become the default substrate for sub-millisecond data access across the industry. The defining design choice: the server holds the data structures, not just the bytes, so clients push operations to the data instead of round-tripping for every step.

What it is (technical)

In-memory key/value store where values are rich data structures: strings, hashes, lists, sets, sorted sets, streams, bitmaps, HyperLogLogs, geospatial indexes, and (Redis 8) vectors and JSON.
Single-threaded command execution on an event loop (epoll/kqueue), with I/O threads (Redis 6+) parallelizing socket reads/writes and protocol parsing.
RESP protocol over TCP: a simple, human-readable, binary-safe text protocol designed for fast parsing and pipelining.
Optional durability via RDB snapshots, AOF append-log, or hybrid; optional HA via replication + Sentinel; optional scale via Redis Cluster (16384 hash slots).
Tri-licensed as of Redis 8 (May 2025): AGPLv3, RSALv2, or SSPLv1. Valkey is the BSD-licensed Linux Foundation fork from Redis 7.2.

What it is for

Caching in front of slower stores (the original and still-dominant use).
Session and ephemeral state at web scale: presence, rate limits, feature flags.
Real-time computation: leaderboards, counters, dedup, top-K, cardinality, geospatial proximity.
Lightweight messaging: Pub/Sub for broadcast, Streams for durable consumer groups.
Coordination primitives: distributed locks, atomic counters, sequence generators.
Vector similarity + RAG: Vector Sets and Redis Query Engine for AI/ML workloads alongside operational data.

Why it is needed (the problem it solves)

Disk-based databases are too slow for the read path of modern apps. A Postgres query at ~5–50 ms is fine for transactional writes but unacceptable for the 50–200 reads per user request that a modern web app makes (session lookup, rate check, feature flags, personalization, recommendations). Caching solved the latency side, but a flat key-value cache forces every multi-step operation to round-trip: read the score, increment it, write it back, with a race window in the middle. Concurrency bugs followed.

Redis collapses that round trip by moving the data structures into the server itself. "Increment this counter and check the result" is one network call (INCR) instead of three. "Add this member to a leaderboard and return its rank" is one round trip (ZADD and ZRANK sent together in a pipeline or a MULTI/EXEC block). Because every command runs single-threaded against in-memory state, there is no race window inside a command, no lock to acquire, no transaction to coordinate for single-command operations. Atomicity is the byproduct of how the engine is built, not a feature you opt into.

The second problem Redis solves is that not all data fits the row-and-column model. A leaderboard is a sorted set. A presence list is a set. An event log is a stream. A "have we seen this device ID" check is a Bloom filter. Forcing these into SQL tables either wastes memory or makes queries slow. Redis exposes each as a first-class server-side type with operations matched to its shape.

The defining choice: single thread + RAM

Every other Redis property descends from this. Single-threaded execution means atomicity without locks. RAM means commands finish in microseconds, so the single thread rarely becomes the bottleneck and rarely starves clients. The two reinforce each other: a single-threaded disk database would be absurd; a single-threaded memory database is the simplest design that gives you both speed and correctness. Everything Redis "cannot do" (working set must fit RAM, one slow command blocks all clients, async replication can lose writes on failover) is a direct cost of this choice. Everything Redis is loved for (sub-ms p99, race-free counters, predictable behavior) is a direct benefit.

02. Core Concepts — The Mental Model

The vocabulary you need to speak Redis fluently in design review. Each concept here is something the engine actually does at runtime; understanding them is the difference between using Redis and reasoning about how it will behave under load.

Keyspace and databases

A Redis instance has a flat keyspace: every key is a string in a single global namespace. Logical separation comes from numbered databases (0-15 by default) via SELECT, but this is a legacy feature. Cluster mode supports only DB 0, and even in standalone mode the community guidance is to use key prefixes (user:123:cart) rather than separate DBs. The keyspace itself is a hash table with incremental rehashing: when load factor crosses a threshold, Redis allocates a new bigger table and moves keys gradually over many operations so no single command pays the full rehash cost.

The RESP protocol (Redis Serialization Protocol)

The wire format is deliberately simple: requests are arrays of bulk strings, responses are one of five types (simple string, error, integer, bulk string, array). It is text-based but binary-safe (bulk strings carry an explicit length). Why it matters: parsing cost is tiny, which lets the I/O threads handle millions of small messages with little CPU. RESP3 (Redis 6+) added attribute types and out-of-band push frames for client-side caching and richer pipelining.
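To make the request framing concrete, here is a minimal Python sketch of how a client might serialize a command as a RESP array of bulk strings. The function name encode_command is hypothetical and is not part of any Redis client library; real clients also parse replies, which this sketch leaves out.

```python
def encode_command(*parts: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    # Array header: "*<number of elements>\r\n"
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode("utf-8")
        # Bulk string: "$<byte length>\r\n<bytes>\r\n". The explicit length
        # prefix is what makes the format binary-safe.
        out.append(b"$%d\r\n" % len(data))
        out.append(data + b"\r\n")
    return b"".join(out)


if __name__ == "__main__":
    # Prints the raw bytes a client would write to the socket for SET.
    print(encode_command("SET", "user:42", "Alice"))
```

Because every element carries its own length, a parser never has to scan for delimiters inside the payload, which is part of why parsing stays cheap.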

The event loop and single-threaded execution

One thread, one event loop using epoll (Linux) or kqueue (BSD). On each iteration: poll ready file descriptors, read commands from ready client sockets, execute them one at a time against the in-memory data structures, write responses back. No two commands ever interleave. This is why INCR is atomic without a lock, why SETNX is the canonical distributed-lock acquire, and why a single long-running command (KEYS *, a huge ZRANGE, an unbounded Lua script) stalls every other client for its entire duration.

Objects, types, and encodings

Every value is wrapped in a robj (Redis object) header that carries its logical type (string, hash, list, etc.) and its physical encoding. Small collections use compact encodings (listpack for small hashes/lists/zsets, intset for small integer-only sets) that pack data into a single contiguous buffer for cache efficiency. When a structure grows past size/element thresholds it transparently upgrades to a full hash table or skip list. Knowing this is how you understand why grouping fields into small hashes is dramatically cheaper than storing each field as its own top-level key.

Expiration: lazy + active

TTLs are stored in a separate expires dict keyed by the same key. Two reclamation mechanisms run together: passive (when a client accesses a key, check expiry then delete if past TTL) and active (a background cycle samples keys from the expires dict, deletes the expired ones, and if the sampled expired fraction is high, repeats). No per-key timers, no full keyspace scan. The cost: a key past its TTL but never read again sits in RAM until the active cycle reaches it; under heavy short-TTL workloads this lag can push you toward maxmemory.
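The two reclamation paths can be modeled in a few lines of Python. This is a toy sketch, not Redis's implementation: the class ExpiringStore, the sample size of 20, and the 25% repeat threshold are illustrative assumptions, and time is passed in explicitly so the behavior is easy to follow.

```python
import random


class ExpiringStore:
    """Toy model: a data dict plus a separate expires dict keyed by the same key."""

    def __init__(self):
        self.data = {}
        self.expires = {}  # key -> absolute expiry time

    def set(self, key, value, now, ttl=None):
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = now + ttl

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key, now):
        # Passive (lazy) expiry: check on access, delete if past the TTL.
        deadline = self.expires.get(key)
        if deadline is not None and deadline <= now:
            self._delete(key)
            return None
        return self.data.get(key)

    def active_cycle(self, now, sample_size=20, repeat_ratio=0.25, rng=random):
        # Active expiry: sample only keys that carry a TTL, delete the expired
        # ones, and repeat while the expired fraction of the sample stays high.
        removed = 0
        while self.expires:
            k = min(sample_size, len(self.expires))
            sample = rng.sample(list(self.expires), k)
            expired = [key for key in sample if self.expires[key] <= now]
            for key in expired:
                self._delete(key)
            removed += len(expired)
            if len(expired) / len(sample) <= repeat_ratio:
                break
        return removed
```

A key that is past its TTL but neither read nor sampled stays in data, which is the memory lag the paragraph above describes.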

Persistence: RDB, AOF, hybrid

RDB is a point-in-time snapshot written by a forked child (copy-on-write), small and fast to load, but loses everything since the last save. AOF is an append-only log of write commands, replayed on restart, with appendfsync tunable from always (per-write fsync, slow) to everysec (~1s loss window, the practical default) to no (OS-flushed). Hybrid (modern default for AOF rewrites) writes an RDB preamble plus AOF tail: fast restart with bounded loss. None of this gives you the durability of a true ACID database, even with appendfsync always, because failover is async-replicated and lossy.

Replication: leader-follower with PSYNC

Async replication. On first connect, the master forks an RDB and ships it; subsequent writes stream over the replication link. PSYNC (partial resync) lets a briefly-disconnected replica catch up from a circular replication backlog buffer on the master, avoiding a full RDB transfer on every network blip. Replicas are read-only by default and serve reads for scale, but they are eventually consistent: a write to the master is visible on replicas after the link latency, and on failover the unshipped tail is lost.

Sentinel vs Cluster

Two failover models. Sentinel is a separate quorum of monitor processes that watch a master-replica set, agree it has failed (avoiding one bad observer triggering a false failover), promote a replica, and reconfigure the others. Redis Cluster partitions the keyspace into 16384 hash slots distributed across master nodes, with replicas attached to each master; failover is handled by the cluster itself via gossip and master-quorum voting. Sentinel is simpler when one node holds the dataset; Cluster is required when you need to exceed one node's RAM or CPU.

Hash slots and the same-slot rule

In Cluster mode, slot = CRC16(key) mod 16384. Each master owns a subset of the slots, typically assigned as one or more ranges. Multi-key commands (MGET, MULTI, Lua touching multiple keys) only work if all keys land in the same slot. The escape hatch is hash tags: when a key contains a non-empty {...} section, only the substring between the first { and the next } is hashed, so user:{42}:cart and user:{42}:profile co-locate. Designing your key space for co-location is a one-time cost that pays back forever; retrofitting it after cluster migration is painful.
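The slot calculation and the hash-tag rule can be sketched in plain Python. The helper names crc16 and key_slot are hypothetical; the CRC variant assumed here is CRC-16/XMODEM (polynomial 0x1021, initial value 0), and cluster-aware clients ship their own implementation.

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of 16384 slots, honoring a non-empty {hash tag}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" falls back to hashing the whole key.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16(raw) % 16384


if __name__ == "__main__":
    for k in ("user:{42}:cart", "user:{42}:profile", "user:42:cart"):
        print(k, key_slot(k))
```

The two tagged keys hash only the substring 42 and therefore always share a slot; the untagged key hashes its full name and usually lands elsewhere.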

Memory ceiling and eviction

maxmemory caps total RAM. When the cap is hit, the configured maxmemory-policy decides: reject writes (noeviction, correct for a primary store), evict approximate LRU/LFU across all keys (allkeys-*, correct for a cache), or evict only TTL-bearing keys (volatile-*, correct for mixed workloads). Eviction is approximate: Redis samples N keys (default 5) and evicts the best candidate from the sample, trading a little accuracy for not maintaining a globally sorted structure on the hot path.
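Sampled eviction is easy to model. The sketch below is an assumption-laden simplification: it uses a key-count limit (max_keys) as a stand-in for maxmemory, tracks last-access times in a plain dict, and omits refinements a real server may apply (such as keeping good candidates across sampling rounds). The function names are hypothetical.

```python
import random


def pick_eviction_victim(last_access, samples=5, rng=random):
    """Return the least recently used key within a random sample.

    last_access maps key -> last access timestamp (larger = more recent).
    """
    candidates = rng.sample(list(last_access), min(samples, len(last_access)))
    return min(candidates, key=lambda k: last_access[k])


def write_with_eviction(store, last_access, key, value, now,
                        max_keys, samples=5, rng=random):
    """allkeys-lru flavored write: evict sampled victims until there is room."""
    while key not in store and len(store) >= max_keys:
        victim = pick_eviction_victim(last_access, samples, rng)
        del store[victim]
        del last_access[victim]
    store[key] = value
    last_access[key] = now
```

Raising samples makes the victim more likely to be the true oldest key at the cost of more work per eviction, which mirrors the accuracy-versus-CPU trade of maxmemory-samples.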

03. Architecture — How the Pieces Fit

A Redis instance is a single process. One thread executes commands against in-memory structures. I/O threads parallelize the kernel network stack. A forked child handles persistence. Replicas receive a write stream. In Cluster mode, every node also runs a gossip protocol on a separate bus. This is the picture an L7 candidate should be able to draw in under two minutes.

Main Thread (red). The sacred one. All command execution, all mutations, all atomicity guarantees. One slow command here stalls every client. The entire engine is designed around keeping this thread fast and never blocked.

I/O Threads (blue). Added in Redis 6, expanded in 8. Parallelize the expensive kernel work (socket I/O, RESP parsing) while leaving command execution serial. This is how you get multi-core network throughput without losing single-thread atomicity.

Keyspace (gray). The in-RAM hash table holding all keys, with a parallel expires dict for TTLs. Small structures use compact listpack/intset encodings; large ones promote to full hash tables and skip lists. Memory is the ceiling.

Modules (teal, dashed). Once-separate Stack modules (Search, JSON, TimeSeries, Bloom, Cuckoo, TopK, CMS, TDigest, Vector Sets) are merged into the Redis 8 core as native data types. No more loadmodule directives for these features.

Fork child (teal). Persistence runs in a forked subprocess that sees a copy-on-write view of memory. The fork itself stalls the main thread proportional to heap size (page table copy), and COW can spike RSS during the save.

Replicas (purple). Async PSYNC stream. Partial resync from a circular backlog buffer avoids a full RDB transfer on every brief disconnect. Replicas are read-only; failover loses the unshipped tail.

Cluster Bus (gray). A separate TCP port for node-to-node gossip in Cluster mode: PING/PONG carrying topology, slot ownership, and failure suspicions. Failure detection by majority vote, no central coordinator.

Failover layer (dashed). Either Sentinel (standalone topologies) or the cluster's own gossip + master-quorum vote (cluster mode). Detects master failure, promotes a replica, reconfigures clients.

The architectural principle to internalize: the main thread is sacred, and everything that can be moved off its hot path has been. I/O off the main thread (I/O threads). Persistence serialization off the main thread (forked child). Big-key freeing off the main thread (lazy free, UNLINK). Cluster topology gossip off the client port (separate bus). Any time you propose a feature that would do work on the main thread, the Redis design instinct is to ask "can this be deferred, sampled, or moved to a background helper?" That instinct is what keeps p99 in microseconds.

04. Features — The Thirteen That Define Redis

Each feature explained as a PE would explain it on a whiteboard: what it is for (the access pattern it serves), why it is needed (the problem it solves that no plain key-value can), a real command example, and the gotcha that separates "I read the docs" from "I have run this in production."

01 In-Memory Storage: The foundation

What it is for

Holding the entire working dataset in RAM, so every read and write completes in microseconds without touching disk. The dataset is structured by type (string, hash, list, set, zset, stream, etc.) and accessed through type-specific commands, not generic blob reads.

Why it is needed

Disk I/O is the latency wall. A spinning disk is milliseconds per random seek; even NVMe is tens of microseconds. RAM is hundreds of nanoseconds. For workloads that read the same data many times per request (sessions, rate limits, feature flags, leaderboards, top-K lookups), holding the working set in RAM is the only path to sub-ms p99. The trade is real: working set must fit in memory, and RAM is roughly 100× the cost of disk per byte, so you size aggressively and lean on eviction or sharding when the set outgrows one node.

# sub-millisecond point reads
SET user:42:session "<token>" EX 3600     # 1 hour TTL
GET user:42:session                       # ~50µs round-trip

# inspect memory layout
MEMORY USAGE user:42:session
INFO memory                              # used_memory, fragmentation

PE nuance

Per-key overhead is real: ~50–90 bytes of metadata (dict entry, robj header, SDS header, expire entry) before your value. A billion tiny keys is ~80 GB of overhead alone. The PE move is to model data as a few large structures (hashes, zsets, streams) instead of many tiny keys; small ones get the listpack/intset compact encoding and pay almost no overhead.

02 Flexible Persistence: RDB · AOF · Hybrid

What it is for

Surviving a process restart or crash without losing the working set. Three modes you mix to taste: RDB (periodic point-in-time snapshot), AOF (append-only command log), and hybrid (RDB preamble + AOF tail, the modern default during AOF rewrites).

Why it is needed

RAM is volatile. Without persistence, a restart starts from an empty dataset, which means a cold cache stampede on your backing store and minutes-to-hours of pain for sessions, rate limits, and any "ephemeral" data your app secretly depends on. Persistence is what turns Redis from "a cache that forgets" into "a cache that recovers." Critically, this is disaster recovery, not durability: even with appendfsync always, async replication can lose unshipped writes on failover.

# RDB: snapshot if 1000+ keys changed in 60s, or 10+ keys changed in 300s
save 60 1000
save 300 10

# AOF: append every write, fsync once per second
appendonly yes
appendfsync everysec        # sane default; ~1s loss window

# Modern hybrid rewrite (RDB preamble + AOF tail)
aof-use-rdb-preamble yes

# Trigger ops
BGSAVE                          # non-blocking RDB
BGREWRITEAOF                    # compact AOF
LASTSAVE                        # unix timestamp of last save

PE nuance

Both RDB and AOF rewrite call fork(). Copy-on-write means the child sees a frozen view, but on a large heap under heavy writes COW can balloon RSS by tens of GB during the save, and the fork itself stalls the main thread for ms-to-seconds (page table copy). On a memory-pressured box, the fork can trigger the OOM killer during what was supposed to be a safety operation. Watch latest_fork_usec, leave RAM headroom, and prefer running saves on a replica.

03 Diverse Data Structures: The product

What it is for

Server-side data types matched to the shape of the problem: strings for blobs and counters, hashes for objects, lists for queues and timelines, sets for unique membership, sorted sets for ranking, streams for event logs, plus bitmaps, HyperLogLogs, geospatial indexes, and (Redis 8) JSON, TimeSeries, and Vector Sets. Commands are matched to each type's natural operations.

Why it is needed

A plain key-value store forces every multi-step operation through the network. "Add to leaderboard and return rank" becomes read-update-write, with a race window. "Get top-10 nearby drivers" becomes scan-all-then-sort in the client. By moving the structure into the server, Redis turns these into single atomic commands. This is why Redis is called a data structure server, not a cache. The structures are the product; the in-memory storage is just what makes them fast.

# Hash: store a user object compactly
HSET user:42 name "Alice" tier "gold" joined 1717
HGETALL user:42

# Sorted Set: real-time leaderboard
ZADD leaderboard 9500 alice 8200 bob 9800 carol
ZREVRANGE leaderboard 0 9 WITHSCORES     # top 10, highest score first
ZREVRANK leaderboard alice               # 0-based rank, highest score first

# Stream: durable event log with consumer groups
XADD events * type purchase user 42 amount 99
XGROUP CREATE events analytics $ MKSTREAM
XREADGROUP GROUP analytics worker-1 COUNT 10 STREAMS events >

PE nuance

Each type has a small-size encoding (listpack, intset) that flips to a "real" structure past a threshold. Tune those thresholds (hash-max-listpack-entries, zset-max-listpack-entries) for memory efficiency. The other instinct to develop: composition. Real PE designs combine 2-3 structures (a Geo set for location, a Stream for the event log, a Sorted Set for ranking) rather than forcing one structure to do everything.

04 Atomic Operations: Free correctness

What it is for

Every single Redis command runs atomically against the keyspace. INCR is a race-free read-modify-write. SETNX is an atomic acquire-if-absent. ZADD is an atomic insert-or-update. No client-side locking, no compare-and-swap loop, no transaction overhead for single-key ops.

Why it is needed

The hardest bugs in distributed systems are races on shared counters and shared flags: two clients reading the same value, both incrementing locally, both writing back, one increment lost. The fix in a normal database is a row lock or an explicit CAS retry loop, both of which add latency and can deadlock. Redis sidesteps all of it because the single-threaded engine serializes every command. Atomicity is the byproduct of the architecture, not a feature you opt into. This is why Redis is the default rate-limiter, sequence generator, and distributed-lock primitive across the industry.

# atomic counter — no lock, no race
INCR page:home:views                  # 1
INCR page:home:views                  # 2
INCRBY wallet:42 -150                # atomic debit

# distributed lock acquire (single-instance pattern; Redlock extends it across instances)
SET lock:order:99 "<uuid>" NX EX 10        # OK if not held

# atomic conditional update via ZADD flags
ZADD leaderboard GT 9500 alice           # only update if higher

PE nuance

The single-thread serialization is per-instance, not global. A single hot key (a viral counter, a celebrity leaderboard) cannot be spread across cluster nodes; it concentrates load on one shard's single thread. Mitigation: shard hot counters into N sub-keys client-side (counter:0 through counter:15), sum on read. Trades read cost for write spread.
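The sub-key mitigation can be sketched as follows. This is a minimal in-memory model: the class ShardedCounter is hypothetical, and a plain dictionary stands in for Redis so the example runs without a server. Against a real deployment each increment would be an INCRBY on the chosen sub-key and the read would fetch all N sub-keys.

```python
import random
from collections import defaultdict


class ShardedCounter:
    """Spread one hot counter over N sub-keys; sum them on read."""

    def __init__(self, name, shards=16, store=None, rng=None):
        self.name = name
        self.shards = shards
        # `store` stands in for Redis here; INCRBY becomes `store[key] += n`.
        self.store = store if store is not None else defaultdict(int)
        self.rng = rng or random.Random()

    def _subkey(self, shard):
        return f"{self.name}:{shard}"

    def incr(self, amount=1):
        # Each write picks a random sub-key, so in a cluster the writes would
        # hash to different slots instead of hammering a single key.
        key = self._subkey(self.rng.randrange(self.shards))
        self.store[key] += amount

    def value(self):
        # Reads pay for the spread: N lookups instead of one.
        return sum(self.store.get(self._subkey(i), 0) for i in range(self.shards))
```

The total is only a consistent snapshot if the sub-keys are read together; for a view counter that approximation is usually acceptable, for an inventory count it may not be.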

05 High Availability: Replication · Sentinel · Cluster

What it is for

Surviving a node failure with bounded outage and bounded data loss. Three layers: asynchronous replication from master to one or more replicas (read scaling + warm standby), Sentinel for automatic failover in non-cluster topologies (quorum of monitors agrees on failure, promotes a replica), and Redis Cluster's built-in failover via gossip and master-quorum voting in sharded deployments.

Why it is needed

A single Redis instance is one process on one box. A process crash, box reboot, network partition, or AZ failure makes the dataset unavailable for the duration. Modern systems demand seconds-level RTO. Replication gives you a warm copy to promote; Sentinel or Cluster makes the promotion automatic so a human is not the bottleneck at 3am. The honest framing: this gets you availability, not durability. Async replication means a promoted replica may be behind the dead master, so acknowledged writes can be lost on failover.

# replica config
replicaof master.internal 6379
replica-read-only yes

# refuse writes if too few replicas connected (shrink loss window)
min-replicas-to-write 1
min-replicas-max-lag 10            # seconds

# synchronous-ish: block until N replicas ack
SET critical:key "value"
WAIT 2 100                         # wait up to 100ms for 2 replicas

# Sentinel monitors a master
sentinel monitor mymaster 10.0.0.5 6379 2     # quorum=2
sentinel down-after-milliseconds mymaster 5000

PE nuance

Size the replication backlog buffer for realistic disconnect durations. The default is small (1 MB); if the master writes more than the backlog holds while a replica is disconnected or restarting, partial resync is impossible and the replica triggers a full resync: the master forks a new RDB and ships the whole dataset. On a busy 50 GB instance, this is a multi-minute event that spikes load right when you can least afford it. Set repl-backlog-size to cover your worst-case disconnect, typically tens to hundreds of MB.
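The sizing rule is simple arithmetic: bytes written per second times the longest disconnect you want to survive, with some headroom. The sketch below uses hypothetical helper names, an assumed safety factor of 2, and made-up traffic numbers; measure your own replication write rate before picking a value.

```python
def backlog_bytes(write_bytes_per_sec: float,
                  max_disconnect_sec: float,
                  safety_factor: float = 2.0) -> int:
    """Backlog size needed to cover a disconnect of the given length."""
    return int(write_bytes_per_sec * max_disconnect_sec * safety_factor)


def needs_full_resync(backlog_size: int,
                      write_bytes_per_sec: float,
                      disconnect_sec: float) -> bool:
    """True when more data was written during the gap than the backlog holds."""
    return write_bytes_per_sec * disconnect_sec > backlog_size


if __name__ == "__main__":
    # Assumed workload: 2 MB/s of replicated writes, 60 s worst-case disconnect.
    size = backlog_bytes(2_000_000, 60)
    print(size, needs_full_resync(1024 * 1024, 2_000_000, 60))
```

Under these assumed numbers the 1 MB default is overrun in well under a second of disconnect, which is why a brief network blip can escalate into a full resync.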

06 Horizontal Scalability: Redis Cluster · 16384 slots

What it is for

Scaling beyond one node's RAM and one core's command throughput. Redis Cluster partitions the keyspace into a fixed 16384 hash slots and distributes them across master nodes. Each slot is an integer from 0 to 16383 owned by exactly one master, with replicas attached to that master. Clients are cluster-aware and route directly to the owning node.

Why it is needed

A single Redis instance caps at the largest box you can buy: tens to hundreds of GB of RAM and one core for command logic. Past that, you must shard. Cluster gives you online resharding (move slots between nodes without downtime), automatic failover per shard, and partition tolerance per shard. The fixed 16384-slot count is a deliberate compromise: small enough to gossip a slot ownership bitmap cheaply (~2 KB per heartbeat), large enough to balance finely across hundreds of nodes.

# slot = CRC16(key) mod 16384
CLUSTER KEYSLOT user:42                # → e.g. 12539
CLUSTER NODES                          # topology + ownership
CLUSTER COUNTKEYSINSLOT 12539

# Hash tags force co-location into the same slot
MSET user:{42}:name Alice user:{42}:tier gold
# both keys hash on {42} → same slot → MGET/MULTI work

MGET user:{42}:name user:{42}:tier      # OK
MGET user:42:name user:42:tier          # may fail with CROSSSLOT

# Online slot migration (run by orchestrator)
CLUSTER SETSLOT 12539 MIGRATING <target-node>
CLUSTER SETSLOT 12539 IMPORTING <source-node>

PE nuance

Cluster gives you more capacity, not magic. A single hot key still pins one shard. Design the key space for co-location up front using hash tags for keys that need to be touched together; retrofitting after cluster migration means rewriting client code and migrating data. And know that some multi-key features (transactions, Lua across keys) require same-slot keys, so cluster mode imposes data-modeling discipline you do not pay on a single instance.

07 Pub/Sub Messaging: At-most-once broadcast

What it is for

Real-time broadcast of messages to all currently-connected subscribers of a channel. Publishers send to channels; subscribers receive everything published while their connection is open. Patterns (PSUBSCRIBE) allow wildcard subscriptions. Sharded Pub/Sub (Redis 7+) spreads channels across cluster shards by hash slot for fan-out beyond one node.

Why it is needed

Two classic patterns demand instant fan-out with zero storage: cache invalidation (a write happens, broadcast "key X changed" so every app instance can drop its local copy) and live notifications ("user is typing," presence updates, real-time dashboards). A persistent queue is overkill here because the consumers either receive the message live or it was never relevant. Pub/Sub gives you microsecond fan-out with effectively zero memory cost since nothing is stored.

# subscriber holds a long-lived connection
SUBSCRIBE cache:invalidate

# publisher
PUBLISH cache:invalidate "user:42"     # returns # subscribers reached

# pattern subscription
PSUBSCRIBE news.*                       # news.sports, news.tech, ...

# Sharded Pub/Sub (cluster mode, scales fan-out across shards)
SSUBSCRIBE orders:{shard1}
SPUBLISH orders:{shard1} "<event>"

PE nuance

Pub/Sub is at-most-once. If a subscriber is disconnected, slow, or its output buffer fills, messages are lost forever. There is no durable per-subscriber queue, no replay, no offset. The trap teams fall into: using Pub/Sub for anything where missing a message is a correctness bug. The instant you say "but what if the consumer was down," switch to Streams.
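The difference in delivery semantics can be shown with two toy classes. Both are hypothetical in-memory models written for illustration, not wrappers around Redis commands: PubSubChannel delivers only to whoever is connected at publish time, while StreamLog keeps entries so a consumer can resume from its last offset.

```python
class PubSubChannel:
    """Fire-and-forget: deliver only to subscribers connected right now."""

    def __init__(self):
        self.connected = {}  # subscriber name -> inbox (list of messages)

    def subscribe(self, name):
        self.connected[name] = []
        return self.connected[name]

    def disconnect(self, name):
        self.connected.pop(name, None)

    def publish(self, message):
        for inbox in self.connected.values():
            inbox.append(message)
        # Like PUBLISH, report how many subscribers received the message.
        return len(self.connected)


class StreamLog:
    """Append-only log: consumers track an offset and can catch up later."""

    def __init__(self):
        self.entries = []

    def add(self, message):
        self.entries.append(message)
        return len(self.entries)

    def read_from(self, offset):
        # Return everything after `offset` plus the new offset to remember.
        return self.entries[offset:], len(self.entries)
```

A subscriber that disconnects between two publishes never sees what it missed, whereas a log reader that remembers its offset picks up exactly where it stopped.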

08 Lua Scripting & Functions: Server-side atomic compute

What it is for

Running multi-step read-compute-write logic atomically on the server in one round trip. EVAL runs an inline Lua script; SCRIPT LOAD+EVALSHA caches it; Functions (Redis 7+) register named, versioned, persistent script libraries that survive restarts and replicate naturally.

Why it is needed

MULTI/EXEC is isolation but no logic: it cannot read a value and decide what to write next. Application-side "read then write" has a race window. Lua collapses both: the entire script runs atomically on the main thread, with full access to read intermediate values and branch on them. This is how you implement correct distributed locks (release only if the token matches), conditional rate limiters (reset on hour boundary, decrement counter, reject if zero), and atomic transfer-with-balance-check, all in one network call.

# Atomic decrement-if-positive (rate limit)
EVAL "local v = redis.call('GET', KEYS[1])
       if not v or tonumber(v) <= 0 then return 0 end
       return redis.call('DECR', KEYS[1])" 1 quota:user:42

# Cached: load once, invoke by sha
SCRIPT LOAD "<same script>"            # → sha1
EVALSHA <sha1> 1 quota:user:42

# Functions: registered, named, versioned
FUNCTION LOAD "#!lua name=mylib
       redis.register_function('decr_if_pos', function(keys, args)
         local v = redis.call('GET', keys[1])
         if not v or tonumber(v) <= 0 then return 0 end
         return redis.call('DECR', keys[1])
       end)"
FCALL decr_if_pos 1 quota:user:42

PE nuance

A Lua script blocks the entire server for its full duration. A script that loops over a growing collection, calls into a slow command, or has an accidental infinite loop will freeze every client. Keep scripts short and O(small). On versions that replicate scripts verbatim (before Redis 7), also keep them deterministic (no math.random without seeding, no system time without passing it in), or replication and AOF replay can diverge; Redis 7+ replicates a script's effects rather than the script itself. Prefer Functions over inline EVAL for anything you maintain: named, versioned, easier to operate.

09 Geospatial Processing: GEO commands on ZSET

What it is for

Storing 2D points (lon, lat) keyed by name, and answering radius and bounding-box queries fast: "find all drivers within 5 km of this rider," "show stores in this map viewport," "rank nearby restaurants by distance." Backed under the hood by a sorted set keyed on a 52-bit geohash, which is why a query costs roughly O(log N) to locate the search area plus the number of members examined inside it.

Why it is needed

Doing geo proximity in a generic database means either a quadtree/R-tree extension (PostGIS), a brute-force distance calculation over every row (does not scale), or a custom spatial index you maintain yourself. Redis bakes a good-enough spatial index into a familiar primitive (the sorted set) and exposes it through simple commands. For the very common case of "points and circles," this beats running a separate GIS database and the network hop to reach it.

# add drivers' current locations
GEOADD drivers -122.4194 37.7749 driver:1
GEOADD drivers -122.4783 37.8199 driver:2

# find drivers within 3km of a rider (modern unified command)
GEOSEARCH drivers FROMLONLAT -122.42 37.78 BYRADIUS 3 km ASC COUNT 10

# bounding box (rectangle in a map viewport)
GEOSEARCH drivers FROMLONLAT -122.42 37.78 BYBOX 5 5 km ASC

# distance and position lookups
GEODIST drivers driver:1 driver:2 km    # km between two members
GEOPOS drivers driver:1                 # back to lon/lat

PE nuance

It is a sorted set with geohash scores; every ZSET property applies (per-member memory cost, O(log N) operations, no polygon support). For "point in circle" and "point in box" it is excellent. For polygons, route networks, full GIS, or millions of moving points with complex filters, you want PostGIS or a dedicated geo store. Updating frequently-moving points (ride-hail drivers every few seconds) is fine; the underlying ZSET handles repeated GEOADD as score updates.
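Why a sorted set can serve as a spatial index becomes clearer with a sketch of the score encoding. The code below quantizes longitude and latitude to 26 bits each and interleaves them into one 52-bit integer. The coordinate limits, the bit order, and the name geo_score are assumptions made for illustration; the result is not guaranteed to match the scores a Redis server stores.

```python
LAT_MIN, LAT_MAX = -85.05112878, 85.05112878   # assumed latitude limits
LON_MIN, LON_MAX = -180.0, 180.0
STEP_BITS = 26                                  # 26 bits per axis -> 52 total


def _quantize(value, lo, hi):
    """Map a coordinate to an integer cell index in [0, 2**STEP_BITS)."""
    if not lo <= value <= hi:
        raise ValueError("coordinate out of range")
    cell = int((value - lo) / (hi - lo) * (1 << STEP_BITS))
    return min(cell, (1 << STEP_BITS) - 1)


def geo_score(lon, lat):
    """Interleave quantized lat/lon bits into a single 52-bit integer."""
    x = _quantize(lon, LON_MIN, LON_MAX)
    y = _quantize(lat, LAT_MIN, LAT_MAX)
    score = 0
    for bit in range(STEP_BITS):
        score |= ((y >> bit) & 1) << (2 * bit)       # latitude on even bits
        score |= ((x >> bit) & 1) << (2 * bit + 1)   # longitude on odd bits
    return score


if __name__ == "__main__":
    print(geo_score(-122.4194, 37.7749))
    print(geo_score(-122.4783, 37.8199))
```

Points that are close on the map tend to share a long common bit prefix, so they sit near each other in score order and a radius query reduces to a handful of score-range scans over the sorted set.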

10 Pluggable Modules: Native types in Redis 8

What it is for

Extending Redis with new data types and commands implemented as dynamically-loaded shared libraries. Historically the way Redis Stack delivered Search, JSON, TimeSeries, Bloom/Cuckoo/TopK/CMS/TDigest, and Graph as separate modules. In Redis 8 (8.0 GA May 2025), these, with the exception of the discontinued Graph module, were merged into the open-source core as native data structures, so no loadmodule directives are needed for them.

Why it is needed

The core Redis API can model many problems, but not all. Full-text search needs an inverted index. Semantic search needs an HNSW vector index. Time-series needs compressed downsampled storage. Probabilistic types need their own backing layout. Modules let these live inside Redis with first-class command sets, instead of forcing a separate system + network hop + operational team. The Redis 8 consolidation effectively says: these capabilities are core enough that everyone should have them by default.

# JSON (native in Redis 8)
JSON.SET user:42 $ '{"name":"Alice","tags":["gold","vip"]}'
JSON.GET user:42 $.tags                   # ["gold","vip"]
JSON.ARRAPPEND user:42 $.tags '"founder"'

# Time-series (native in Redis 8)
TS.CREATE temp:room:1 RETENTION 86400000       # 24h
TS.ADD temp:room:1 * 21.4
TS.RANGE temp:room:1 - + AGGREGATION avg 60000     # 1-min avg

# Search + Query Engine (native in Redis 8)
FT.CREATE idx:users ON HASH PREFIX 1 user: SCHEMA name TEXT tier TAG
FT.SEARCH idx:users "@tier:{gold}" LIMIT 0 10

# If running older Redis with modules
loadmodule /path/to/redisearch.so       # pre-Redis 8

PE nuance

Modules run inside the Redis process and share its single thread, so a buggy or slow module command stalls everyone. Vet third-party modules carefully and watch their SLOWLOG behavior. The Redis 8 merge is a net positive: vendor lock-in via "you need Redis Stack" is gone, and the former modules now ship under the same tri-license as the core (AGPLv3 / RSALv2 / SSPLv1). Still, verify your deployment surface: Valkey, for instance, does not ship these, so you would use the standalone module repos or stay on Redis.

11 · Smart Cache Eviction · 8 policies · approximate LRU/LFU

What it is for

Defining what happens when memory hits maxmemory. Eight policies: noeviction (reject writes), allkeys-lru / allkeys-lfu / allkeys-random (evict from any key), volatile-lru / volatile-lfu / volatile-random / volatile-ttl (evict only TTL-bearing keys). The volatile family lets you mix "evictable cache" keys (with TTL) and "protected persistent" keys (no TTL) in one instance.

Why it is needed

RAM is finite. Without a defined ceiling behavior, an unbounded write workload will eventually OOM the process, taking everything down. Eviction turns that catastrophe into a graceful behavior you chose: silently drop cold cache entries (LRU/LFU for caches), refuse new writes (noeviction for a primary store), or selectively expire (volatile-* for mixed). It is the difference between a cache that degrades gracefully under pressure and one that falls over.

# cap memory and pick a policy
maxmemory 10gb
maxmemory-policy allkeys-lru        # pure cache
maxmemory-samples 10                  # better accuracy, more CPU

# LFU mode tracks access frequency over time
maxmemory-policy allkeys-lfu
lfu-log-factor 10                    # counter compression
lfu-decay-time 1                     # minutes between decays

# mixed workload — evict only keys with TTL set
maxmemory-policy volatile-lru
SET session:abc "<data>" EX 3600        # evictable
SET config:limits "<data>"             # protected (no TTL)

# inspect what's happening
INFO stats                          # evicted_keys, keyspace_misses
CONFIG GET maxmemory-policy

PE nuance

LRU/LFU are approximate, not exact. Redis samples N keys (default 5) and evicts the best candidate from the sample. True LRU would cost memory and main-thread time on every access. Raising maxmemory-samples to 10 gives near-true LRU at modest CPU cost; this is the right tuning for most caches. And: noeviction with a writing client = errors. If you set noeviction and the cap is hit, your app starts getting OOM errors from Redis. Match the policy to the use case explicitly.
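
The sampling idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Redis's implementation: the keyspace is a plain dict of key to last-access time, and the function name `evict_one` is hypothetical. Real Redis additionally keeps a small pool of candidates across eviction rounds, which this sketch omits.

```python
import random

def evict_one(last_access, samples, rng):
    """Sample up to `samples` keys and evict the least recently used among them."""
    pool = rng.sample(sorted(last_access), min(samples, len(last_access)))
    victim = min(pool, key=last_access.get)  # oldest access time within the sample
    del last_access[victim]
    return victim

# Toy keyspace: key i was last touched at time i, so key:0 is the true LRU.
rng = random.Random(7)
keyspace = {f"key:{i}": i for i in range(1000)}
victim = evict_one(keyspace, samples=5, rng=rng)
print(victim, "evicted;", len(keyspace), "keys remain")
```

With a small sample the victim is merely "old among the sampled keys"; as `samples` approaches the keyspace size the choice converges on the true LRU key, which is the accuracy-for-CPU trade that maxmemory-samples exposes.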

12 · Transactions (MULTI/EXEC/WATCH) · Isolation, not ACID

What it is for

Grouping multiple commands so they execute as one atomic batch with no other client's commands interleaved. MULTI opens the batch, queued commands accumulate, EXEC runs them all in order. WATCH adds optimistic concurrency: if any watched key changes between WATCH and EXEC, the EXEC aborts with nil, letting the client retry (check-and-set).

Why it is needed

Even with single-threaded execution, a sequence of commands sent by one client can be interleaved with commands from other clients. If you read a balance, compute a new value, and write it back as three separate commands, another client can write between your read and your write. MULTI/EXEC closes that gap by deferring execution until all the commands have arrived and running them as one indivisible block. WATCH adds the conditional layer for true read-then-write atomicity.

# Optimistic transfer: check balance, then debit + credit atomically
WATCH account:42
GET account:42                         # >= 150?
MULTI
DECRBY account:42 150
INCRBY account:99 150
EXEC                                  # nil if account:42 changed since WATCH

# Cancel a pending transaction
MULTI
SET foo bar
DISCARD                               # clears the queue

PE nuance

This is the single most misunderstood Redis feature. MULTI/EXEC has no rollback. If command 3 of 4 fails at runtime (e.g., type error: INCR on a list), commands 1, 2, and 4 still execute. "Atomic" here means "no interleaving," not "all-or-nothing." For true conditional atomic logic, use Lua or Functions — the script can read, branch, and write with no race window and abort cleanly. Use WATCH+MULTI only for optimistic CAS, not for rollback semantics you do not have.
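
The no-rollback behavior is easy to model. The sketch below is a toy in-memory stand-in, not a Redis client: `store` is a plain dict, and `set_`, `incr`, and `exec_batch` are hypothetical helpers that mimic how EXEC runs every queued command in order and reports a runtime error in place of that command's reply without undoing anything.

```python
def set_(key, value):
    """Queueable SET: overwrite the key."""
    def run(store):
        store[key] = value
        return "OK"
    return run

def incr(key):
    """Queueable INCR: fails at runtime if the key holds a non-integer."""
    def run(store):
        value = store.get(key, 0)
        if not isinstance(value, int):
            raise TypeError(f"WRONGTYPE {key} does not hold an integer")
        store[key] = value + 1
        return store[key]
    return run

def exec_batch(store, queued):
    """Run all queued commands in order; collect errors, never roll back."""
    replies = []
    for command in queued:
        try:
            replies.append(command(store))
        except TypeError as exc:
            replies.append(exc)  # the error becomes that command's reply
    return replies

store = {"jobs": ["a", "b"]}  # "jobs" plays the role of a list key
replies = exec_batch(store, [set_("hits", 10), incr("hits"), incr("jobs"), set_("done", 1)])
print(replies)
print(store)
```

Command 3 fails on the list, yet `hits` and `done` are both written: the batch was isolated from other clients, but it was not all-or-nothing.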

13 · Bitmaps and HyperLogLogs · Massive memory savings

What it is for

Two specialized space-saving structures. Bitmaps are strings treated as bit arrays for exact-membership flags ("did user N visit on day D?"). HyperLogLog (HLL) is a probabilistic cardinality counter: approximate "how many distinct items have I seen" in fixed ~12 KB regardless of true cardinality, with ~0.81% standard error and mergeable across keys.

Why it is needed

The naive way to track "which of 100M users visited today" is a Set of user IDs, which costs gigabytes per day. A Bitmap keyed by user ID stores one bit per user — 12.5 MB for 100M users, regardless of how many actually visited — and gives you exact answers via BITCOUNT and BITOP AND/OR/XOR across days. The naive way to count "how many unique users searched this term across a billion events" is a Set of IDs, which is gigabytes. HLL gives you the count in 12 KB with sub-1% error, and you can PFMERGE across time windows or shards trivially. The right tool when you need answers, not members.

# Bitmap: per-user daily-active flags
SETBIT dau:2026-06-07 42 1                # user 42 active today
GETBIT dau:2026-06-07 42
BITCOUNT dau:2026-06-07                # exact active-user count

# 7-day retention: bitwise AND across day bitmaps
BITOP AND dau:7day dau:2026-06-01 dau:2026-06-02 ... dau:2026-06-07
BITCOUNT dau:7day                      # users active every day

# HyperLogLog: approximate distinct count, ~12 KB fixed
PFADD uniq:visitors:2026-06-07 user-42 user-99 user-7
PFCOUNT uniq:visitors:2026-06-07       # approx cardinality

# Merge across days (or shards) without losing accuracy
PFMERGE uniq:visitors:weekly uniq:visitors:2026-06-01 uniq:visitors:2026-06-02 ...
PFCOUNT uniq:visitors:weekly

PE nuance

Bitmaps are exact but require a dense ID space (user IDs as small integers). If your IDs are UUIDs you must maintain a separate "UUID → small int" map. HLL is approximate and gives you a count, never the members. The trap is using HLL then later being asked "which users were unique" — that data is gone by design. Decide up front whether you will ever need enumeration; if maybe, keep an exact structure alongside, which defeats the memory savings.
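
A small Python model shows both the sizing arithmetic and the bitwise-AND retention query. It is a sketch under assumptions: bitmaps are modelled as Python integers, which ignores Redis's byte layout, and the helper names are hypothetical.

```python
def setbit(bitmap, offset):
    """Return the bitmap with the bit at `offset` set (model of SETBIT key offset 1)."""
    return bitmap | (1 << offset)

def bitcount(bitmap):
    """Number of set bits (model of BITCOUNT)."""
    return bin(bitmap).count("1")

def bitmap_bytes(max_offset):
    """Bytes needed to hold bits 0..max_offset; size tracks the highest ID, not the count."""
    return max_offset // 8 + 1

# Three days of activity, keyed by small-integer user ID.
days = []
for active_users in ([1, 42, 99], [42, 99], [7, 42]):
    bitmap = 0
    for user_id in active_users:
        bitmap = setbit(bitmap, user_id)
    days.append(bitmap)

every_day = days[0] & days[1] & days[2]   # model of BITOP AND across the day keys
print(bitcount(every_day))                # users active on all three days
print(bitmap_bytes(99_999_999))           # bytes for IDs 0..99,999,999
```

The sizing helper also makes the dense-ID requirement visible: one user with ID 4,000,000,000 would force a roughly 500 MB bitmap on its own.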

05. When To Use Which Redis Feature (the headline)

The decision table you actually came for. For each feature: the access pattern it is built for, what to pick instead when it does not fit, and the PE verdict on where the line is. Sorted by how often teams misuse it.

Feature	Reach For It When	Do NOT Use It When (use instead)	Cost / Complexity (Big-O + memory)	PE Verdict
Pub/Sub	Fire-and-forget fan-out to currently-connected subscribers: live notifications, cache-invalidation broadcast, presence	You need durability, replay, or guaranteed delivery → Streams. Offline subscribers must get missed messages → Streams or Kafka	O(N) per publish over N subscribers; near-zero memory (nothing stored)	It is a broadcast bus with at-most-once delivery. If a subscriber is down or slow, the message is gone. Default to Streams unless loss is genuinely acceptable.
Lists as Task Queue	Simple producer/consumer work queue, one worker pool, loss-on-crash tolerable: `LPUSH` + `BRPOP`	You need acks, retries, consumer groups, dead-letter, or replay → Streams. You need delayed/scheduled jobs → Sorted Set by timestamp	O(1) push/pop; memory = queue depth × item size	Lists are a great simple queue. The moment you say "but what if the worker crashes mid-job," you have outgrown them. Streams add the reliability layer; do not bolt acks onto Lists by hand.
Streams	Durable, replayable event log with consumer groups, per-message acks, and at-least-once delivery: event sourcing, reliable job queues, activity feeds	You need petabyte retention, long-term log storage, or massive multi-DC throughput → Kafka. You only need ephemeral broadcast → Pub/Sub	O(1) append, O(log N) range; memory grows with retention, must cap via `MAXLEN`/`MINID`	The most under-used Redis feature and the right answer to most "I'm using Lists/Pub-Sub for messaging" problems. It is Kafka-lite: same mental model (offsets, groups, acks) at far smaller scale and far lower ops cost.
Atomic Counters	High-throughput counts with no race conditions: rate limiting, view counts, inventory decrement, ID generation (`INCR`/`INCRBY`)	You need approximate distinct counts across billions of items cheaply → HyperLogLog (for cardinality). You need per-key counts that must survive any loss → durable DB	O(1); 8 bytes per counter value (plus key overhead)	The canonical "Redis is atomic for free" win. One thread means `INCR` needs no lock. This is why Redis is the default rate-limiter and sequence generator.
Sorted Sets (ZSET)	Anything ranked or range-queried by a score: leaderboards, priority queues, time-series windows, rate limiting (sliding window via score=timestamp), secondary indexes	You just need membership → Set. You need full-text or multi-field query → Redis Query Engine	O(log N) add/rank via skip list; ~64+ bytes per member overhead	The Swiss-army structure. If a problem has "top N," "by rank," "within score range," or "expire oldest," it is a ZSET. Also the substrate Geospatial is built on and the design inspiration for Vector Sets.
Transactions (MULTI/EXEC)	Grouping commands to run with no other command interleaved, with optimistic concurrency via `WATCH` (check-and-set)	You need rollback on logic error → there is none, use Lua for conditional logic. You need cross-shard atomicity → not supported in cluster	O(sum of queued commands); buffers commands until EXEC	Misnamed. No rollback: if command 3 fails, 1, 2, 4 still execute. It is "isolation + batching," not ACID transactions. For real conditional atomic logic, reach for Lua or Functions.
Lua Scripting / Functions	Multi-step read-compute-write that must be atomic and avoid round trips: conditional updates, complex rate limiters, atomic check-and-act	The script is long-running or loops over huge data → blocks the whole server. You need maintainable logic → keep it tiny	Runs atomically on the main thread; O(script). Blocks all clients for its duration	Powerful and dangerous: a Lua script is the easiest way to stall the entire server. Keep scripts short, deterministic, and O(small). Prefer Functions (named, versioned) over inline EVAL for anything you maintain.
Caching	Reducing load on a slower backing store, sub-ms reads on hot data, with TTL and an eviction policy (`allkeys-lru`)	The data is the only copy → that is a primary store, set `noeviction` and accept Redis's durability limits, or use a real DB	O(1) get/set; memory bounded by maxmemory + eviction	The original and still dominant use. The trap is forgetting the failure modes: stampede on expiry (add jitter), eviction storms near the ceiling, and treating a cache miss as an outage when it should be a fallback.
Bloom Filter	Probabilistic "have I seen this?" at massive scale with tiny memory and tolerable false positives: dedup, "already shown," cache-penetration guard	You cannot tolerate false positives → use a Set (costs real memory). You need to remove items → Bloom can't delete (use Cuckoo filter)	O(k) per op; bits ≈ -n·ln(p)/(ln2)², e.g. ~1.2 KB for 1000 items at 1% FPR	The right tool when "definitely not seen" is cheap and valuable and an occasional false "seen" is harmless. A Set storing 100M IDs is gigabytes; a Bloom filter is megabytes. Know it cannot delete and false-positive rate rises as it fills.
Geospatial	Radius/box queries over points: "nearby drivers," "stores within 5km," proximity ranking (`GEOADD`/`GEOSEARCH`)	You need polygons, routing, or true GIS → PostGIS. Millions of points with complex filters → dedicated geo DB	O(log N) add, O(N+log M) radius; built on a ZSET with geohash scores	It is a ZSET with geohash-encoded scores, which is why it is fast and why it inherits ZSET memory cost. Great for point-in-radius at moderate scale; not a GIS replacement.
Vector Sets (beta)	Native vector similarity (KNN) inside Redis for recommendations and semantic search, when you already run Redis and want one fewer system (`VADD`/`VSIM`)	You need a battle-tested, billion-vector ANN store with rich filtering → dedicated vector DB or Redis Query Engine. API stability matters → it is beta	HNSW index, O(log N)-ish search; memory heavy (vectors + graph), quantization (int8) helps	New in Redis 8 (antirez), inspired by sorted sets. Compelling for "I already have Redis, add similarity search." For serious vector workloads at scale, the Redis Query Engine or a purpose-built vector DB is the safer call until Vector Sets exit beta.
HyperLogLog	Approximate distinct-count of huge sets in fixed ~12 KB: unique visitors, distinct search terms, cardinality across billions (`PFADD`/`PFCOUNT`)	You need exact counts → Set or counter. You need the actual members → it stores none	O(1) add/count; fixed ~12 KB regardless of cardinality; ~0.81% standard error	Magical memory profile: count a billion uniques in 12 KB with sub-1% error. The catch people miss: it gives you a number, never the members, and merges (`PFMERGE`) are how you do unions. Wrong tool the instant someone asks "which users."
Geo + Streams + ZSET combo	(pattern) Real-time location feed with replay: GEO for position, Stream for the event log, ZSET for ranking	N/A — composition pattern, not a single feature	Sum of the parts	The PE move is composing primitives, not forcing one structure to do everything. Most sophisticated Redis designs are 2-3 structures wired together.
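
The Bloom filter sizing formula in the table can be checked with a short Python sketch. The function name is hypothetical, and this computes the textbook optimum only; the sizing Redis applies internally when a filter is reserved may differ.

```python
import math

def bloom_size(n_items, fpr):
    """Textbook optimal Bloom sizing: (bits, hash_count) for n items at target FPR."""
    bits = math.ceil(-n_items * math.log(fpr) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes

bits, hashes = bloom_size(1000, 0.01)
print(bits, "bits =", round(bits / 8 / 1000, 2), "KB,", hashes, "hash functions")

big_bits, _ = bloom_size(100_000_000, 0.01)
print(round(big_bits / 8 / 1_000_000), "MB for 100M items at 1% FPR")
```

For 1000 items at 1% this lands at about 1.2 KB, matching the table, and 100M items come out on the order of 120 MB, which is the "megabytes, not gigabytes" claim in the verdict column.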

Role-level: which "Redis-as-X" claim actually holds up

Redis as…	Holds up when	Breaks when	PE Verdict
Cache	Always. This is the home-turf use case	Rarely; only via misconfig (no eviction policy, no TTL jitter)	Unambiguous yes. Everything else is "Redis can also…"
Primary NoSQL DB	Working set fits in RAM, you accept async-replication loss windows, persistence is DR not a guarantee, and the data is not money	You need durability guarantees, the dataset exceeds RAM economically, or a dropped write is a correctness bug	Conditional. Viable for session stores, ephemeral state, feature flags. Dangerous for ledgers/orders. Say the durability caveat out loud.
Streaming engine	Moderate throughput, short-to-medium retention, you want consumer groups without running Kafka, latency matters more than retention	You need long retention, replay over weeks, multi-DC at high volume, or exactly-once across a big pipeline	Yes for "Kafka-lite." Redis Streams covers a huge middle ground at a fraction of Kafka's operational weight. It is not Kafka at Kafka scale.
Message broker	Low-latency messaging between services, simple routing, you accept Redis delivery semantics (Pub/Sub at-most-once, Streams at-least-once)	You need complex routing, transactions across queues, or AMQP guarantees → RabbitMQ. Massive durable log → Kafka	Fine for lightweight messaging. For rich broker semantics (exchanges, complex routing, transactional messaging), a real broker wins.

The honest hierarchy: Redis is a cache that grew the ability to also be a fast ephemeral database, a lightweight stream engine, and a simple broker. Each "also" comes with a caveat a PE states up front, not after the incident.

06. Trade-Offs (what you give up to get what)

A trade-off is a deliberate exchange: gain X, surrender Y, and it bites at a specific moment. Grouped by the feature families you listed. These are the ones that matter on call and in design review.


Feature	Trade-Off	What You Gain	What You Give Up	When It Bites	PE Nuance
Pub/Sub	No persistence for delivery speed	Lowest-latency fan-out, zero storage cost	At-most-once: any disconnected/slow subscriber loses messages forever	A subscriber restarts during a deploy and silently misses every message in that window	Redis does not buffer per-subscriber. When a slow consumer's output buffer exceeds `client-output-buffer-limit`, the server disconnects that client and its pending messages are lost. Sharded Pub/Sub (cluster) helps fan-out but not durability.
Streams	Durability + replay for bounded memory growth	At-least-once, consumer groups, acks, replay from any offset	Memory grows until you cap it; consumers can double-process on redelivery	You forget `MAXLEN` and the stream silently eats RAM until eviction or OOM	At-least-once means consumers must be idempotent. Pending Entries List (PEL) tracks unacked; you must `XCLAIM` abandoned messages or they sit forever.
Lists (queue)	Simplicity for reliability	Trivial O(1) queue with blocking pop	No ack/retry: a crash between pop and process loses the job	Worker OOMs after `BRPOP` but before finishing; the job is gone with no trace	`BLMOVE` into a processing list gives crash-safety by hand, but at that point Streams give it for free with groups and acks.
Atomic Counter	Lock-free correctness for single-node bound	Race-free increment with no client coordination	Throughput on one key is capped by the single thread; cross-shard counters need merging	One ultra-hot counter (viral post) becomes a single-key hot spot that cluster can't spread	Shard a hot counter into N sub-keys (`counter:{0..N}`), sum on read. Trades read cost for write spread.
Sorted Set	Ordering for memory and op cost	O(log N) ranked queries, range-by-score, rank lookup	High per-member memory (skip list + dict), O(log N) not O(1)	A leaderboard of 50M members balloons RAM; `ZRANGE` over huge ranges blocks the loop	Always bound the range you fetch. Use `ZADD GT/LT` for conditional score updates instead of read-then-write.
Transactions (MULTI)	Isolation without locks, but no rollback	Commands run with nothing interleaved; `WATCH` gives optimistic CAS	No atomic rollback: a command failing mid-batch does not undo prior ones	Command 2 of 4 errors; commands 1, 3, 4 still applied, leaving partial state	This is the single most misunderstood Redis feature. Use Lua for true conditional atomicity; use WATCH for optimistic concurrency, not for rollback semantics.
Lua / Functions	Atomic compute for blocking risk	Multi-step logic runs atomically, no round trips, no race window	Runs on the main thread; a slow/looping script freezes every client	A script iterates a growing collection and p99 for the whole instance spikes to seconds	Since Redis 7, scripts replicate by their effects (the resulting writes) rather than by re-execution, so determinism is no longer a replication requirement; deterministic scripts are still easier to test and reason about. Keep them O(small). An effective lock-and-do primitive.
Caching	Speed for staleness and a new failure surface	Sub-ms reads, massive backing-store offload	Stale data, cache-coherence complexity, a new dependency that can fail	Mass-expiry stampede hammers the database; or eviction near maxmemory craters hit rate	TTL jitter, request coalescing (single-flight), and treating misses as graceful fallback (not errors) separate robust caches from fragile ones.
Bloom Filter	Tiny memory for false positives + no delete	Membership test for 100M items in MB, not GB	Configurable false-positive rate; cannot remove items; cannot enumerate	The filter fills past its sizing and FPR climbs, silently letting dupes through	Size for peak cardinality up front; FPR is fixed at creation. Cuckoo filter if you need deletes. Never use when a false "yes" causes a correctness bug.
Geospatial	Fast radius for GIS limitations	O(log N) point-in-radius using geohash on a ZSET	No polygons, no routing, inherits ZSET memory cost	Product asks for "inside this delivery polygon" and the geohash model can't express it	It is a ZSET underneath; you can mix geo and score ops. For anything beyond circles/boxes, you need real GIS.
Vector Sets	Co-located similarity for maturity risk	Native KNN in Redis, one fewer system, quantization for memory	Beta API (may break), memory-heavy index, fewer features than dedicated vector DBs	You build on the beta API and a Redis 8.x point release changes it	HNSW recall/latency tunable via build params. Great for "already on Redis." Redis Query Engine is the more enterprise-grade in-Redis vector path.
HyperLogLog	Fixed tiny memory for approximation + no members	~12 KB counts a billion uniques at ~0.81% error	Approximate only; stores no members; cannot answer "which"	Someone needs the list of unique users, not the count, and the data simply isn't there	Mergeable across shards/time windows via `PFMERGE`, which is its superpower for distributed cardinality. Wrong tool whenever exactness or membership is required.
Redis (all)	In-memory speed for durability + RAM cost	Microsecond ops, rich structures	Working set must fit RAM; async replication is lossy on failover	Dataset outgrows economical RAM, or a failover drops acknowledged writes for a payment	The meta trade-off behind every feature. Speed comes from RAM and single-threading; durability and unbounded scale are what you surrender.
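
The hot-counter sharding trade from the Atomic Counter row can be sketched as follows. This is a minimal model under assumptions: `store` is a dict standing in for Redis, the key scheme `base:<shard>` and the helper names are hypothetical, and in a real deployment the write would be an `INCRBY` on the chosen sub-key and the read a pipelined set of `GET`s.

```python
import random

def shard_keys(base, shards):
    """All sub-keys for one logical counter, e.g. views:post:1:0 .. views:post:1:7."""
    return [f"{base}:{i}" for i in range(shards)]

def incr_sharded(store, base, shards, rng, amount=1):
    """Write path: bump one randomly chosen sub-key so writes spread across keys."""
    key = rng.choice(shard_keys(base, shards))
    store[key] = store.get(key, 0) + amount
    return key

def read_sharded(store, base, shards):
    """Read path: the logical value is the sum of all sub-keys."""
    return sum(store.get(key, 0) for key in shard_keys(base, shards))

store = {}
rng = random.Random(0)
for _ in range(1000):
    incr_sharded(store, "views:post:1", 8, rng)
print(read_sharded(store, "views:post:1", 8))
```

Writes become cheap and spreadable, while every read now costs N lookups; that asymmetry is the "trades read cost for write spread" noted in the table.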

07. Use Cases (who runs this in production and why)

Real patterns with the specific property driving the choice. The "Why Not Alternative" column is where PE judgment lives.

Use Case	Feature	Driving Property	Scale Dimension	Why Not the Alternative
API rate limiting	Atomic counter / ZSET sliding window	Race-free increment + TTL applied atomically (via Lua or MULTI), sub-ms decision	Millions of checks/sec across keys	A SQL row lock per request would add ms and contend; Redis decides in microseconds with no lock
Gaming leaderboard	Sorted Set	O(log N) rank + range queries, "top 100" and "my rank" instantly	10M+ players, real-time updates	SQL `ORDER BY ... LIMIT` with a rank window is expensive and slow at write rate; ZSET keeps it sorted incrementally
Reliable job queue	Streams + consumer groups	At-least-once delivery, acks, redelivery of crashed-worker jobs	100K+ jobs/sec, multiple worker pools	Lists lose in-flight jobs on crash; Kafka is heavier ops for this scale and latency target
Live notification fan-out	Pub/Sub (sharded in cluster)	Lowest-latency broadcast to connected clients, no storage	Fan-out to thousands of subscribers	Streams add storage you don't need for ephemeral "user is typing"; Pub/Sub is leaner when loss is fine
Session / cache store	Strings/Hashes + TTL + LRU	Sub-ms reads, automatic expiry, backing-store offload	100K+ req/sec, sub-ms p99	Hitting Postgres for every session read would not survive the QPS; this is the canonical cache
Unique visitor counting	HyperLogLog	Billion-cardinality distinct count in fixed 12 KB	Billions of events, thousands of dimensions	A Set of all visitor IDs is gigabytes per dimension; HLL is 12 KB at <1% error
Dedup / "already processed"	Bloom filter	Membership test for huge ID space in MB with tolerable FPR	100M+ distinct IDs	A Set costs GB; a DB lookup per item adds latency; Bloom answers in microseconds in MB
Nearby search (ride-hail, delivery)	Geospatial	Point-in-radius ranking in O(log N) via geohash	Millions of moving points, frequent updates	PostGIS is overkill for circle queries and slower to update at this churn; geo-ZSET updates are O(log N)
Recommendations / semantic search	Vector Sets / Query Engine	KNN over embeddings co-located with app data	Millions of vectors, low-latency similarity	A separate vector DB adds a system and a network hop; if you already run Redis, co-location wins for moderate scale
Distributed lock	SET NX PX / Lua (Redlock pattern)	Atomic acquire-with-expiry, single-threaded guarantees	Coordination across many app nodes	ZooKeeper/etcd give stronger guarantees but heavier ops; Redis locks are fast and "good enough" with care
Event sourcing / audit log	Streams	Append-only, ordered, replayable with consumer offsets	Medium-retention event history	Kafka for long retention/multi-DC; Streams when retention is bounded and you want it in your existing Redis
Atomic multi-step business logic	Lua / Functions	Read-compute-write with zero race window, no round trips	High-contention single-key logic	Application-side transactions race; MULTI can't do conditional logic; Lua runs it all atomically server-side
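
The ZSET sliding-window limiter from the first row follows a trim, count, then add sequence. The Python sketch below models that logic in memory; the class and its names are hypothetical, and against Redis the three steps would map roughly to `ZREMRANGEBYSCORE`, `ZCARD`, and `ZADD`, wrapped in Lua or MULTI so they run as one unit.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` events per key within the trailing `window_s` seconds."""

    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self.events = {}  # key -> deque of event timestamps, oldest first

    def allow(self, key, now):
        window = self.events.setdefault(key, deque())
        # Trim: drop timestamps that have slid out of the window.
        while window and window[0] <= now - self.window_s:
            window.popleft()
        # Count: reject if the window is already full.
        if len(window) >= self.limit:
            return False
        # Add: record this request.
        window.append(now)
        return True

limiter = SlidingWindowLimiter(limit=3, window_s=10)
decisions = [limiter.allow("user:42", t) for t in (0, 1, 2, 3, 10.5)]
print(decisions)
```

The fourth request is rejected because three events already sit in the window; by t=10.5 the event from t=0 has aged out, so the fifth is admitted.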

08. Limitations (severity, workaround, and what the workaround costs)

Honest constraints, ranked by how badly they hurt and what you pay to dodge them.


Limitation	Affects	Severity	Workaround	Workaround Cost
Single slow command blocks all clients	Lua, big-collection ops, `KEYS`	Critical	SCAN instead of KEYS, bound ranges, UNLINK for big deletes, keep Lua O(small)	Discipline + code review; SCAN is cursor-based and more complex than KEYS
Async replication loses writes on failover	Replication, durability	Critical	`WAIT`, `min-replicas-to-write`, accept Redis isn't a system of record for money	WAIT adds replica RTT to writes; never fully closes the loss window
Working set must fit in RAM	All features	High	Cluster to add nodes, Redis-on-Flash (enterprise), aggressive TTLs/eviction	Cluster ops complexity; flash trades latency; eviction means data loss by design
Pub/Sub at-most-once, no buffering	Pub/Sub	High	Switch to Streams for any delivery guarantee	Streams cost memory and require ack/PEL management
MULTI/EXEC has no rollback	Transactions	High	Use Lua for conditional atomic logic; validate before queueing	Lua is harder to maintain and can block; validation adds round trips
Fork (RDB/AOF rewrite) COW spike + main-thread stall	Persistence	High	Headroom for COW, replica-side saves, smaller instances, schedule off-peak	Wastes RAM headroom; replica-side saves complicate topology
Cross-shard multi-key ops unsupported in cluster	Cluster, MULTI, Lua, MGET	High	Hash tags to co-locate related keys in one slot	Forces key-space design up front; hot-slot risk if over-co-located
Eviction (LRU/LFU) is approximate	Caching, memory	Medium	Raise `maxmemory-samples` for better accuracy	More samples = more CPU on the hot path per eviction
Bloom filter can't delete; FPR rises as it fills	Bloom filter	Medium	Size for peak up front; Cuckoo filter for deletes; scalable Bloom variants	Over-sizing wastes memory; Cuckoo has its own quirks
Vector Sets are beta (API may change)	Vector Sets	Medium	Pin version, use Redis Query Engine for stable vector search	Query Engine is heavier; pinning delays upgrades
Hot key/slot cannot be spread by cluster	Cluster, counters, ZSET	Medium	Sub-key sharding, client-side caching, read replicas	Read-time aggregation cost; cache coherence complexity
HyperLogLog/Bloom give no members, only answers	Probabilistic types	Medium	Keep a separate exact structure if you ever need enumeration	Defeats the memory saving that motivated the probabilistic type

09. Fault Tolerance

How Redis behaves when things break, by deployment topology. Single instance vs replicated (Sentinel) vs Cluster.

Dimension	Single Instance	Replicated (Sentinel)	Cluster
Replication model	None	Async leader-follower	Async per-shard leader-follower
Failure detection	External monitoring only	Sentinel quorum agreement	Gossip PFAIL→FAIL by node majority
Failover mechanism	Manual restart	Sentinel promotes replica, reconfigures	Replica wins master-quorum vote, takes slots
RTO (typical)	Minutes (manual)	Seconds to tens of seconds	Bounded by node-timeout + vote, seconds
RPO (typical)	Last persistence point (minutes)	> 0, unshipped writes lost	> 0, localized to failed shard
Split-brain behavior	N/A (one node)	Isolated old master serves stale writes briefly; discarded on rejoin	Minority side stops serving its slots after losing quorum
Blast radius, single node	Total outage	Write outage during failover; reads survive on replicas	Only that shard's slots; rest of cluster serves
Cross-region failover	None	Async cross-region replica, lossy manual promotion	Enterprise Active-Active (CRDT) or async; vanilla is single-region
Data loss scenarios	Crash before persist, disk loss	Failover window, fork-OOM, split-brain writes	Same + slot-migration edges + hot-slot overload

10. Sharding

How Redis partitions data across nodes, and the constraints that surprise teams migrating from a single instance.

Dimension	Single / Replicated	Redis Cluster
Sharding model	None, single keyspace	Hash slots: `CRC16(key) mod 16384`, fixed slot count
Shard key constraints	N/A	Multi-key ops require same slot; hash tags `{tag}` co-locate
Rebalancing mechanism	N/A	Reassign slots between nodes; only moved slots' keys migrate
Rebalancing cost/impact	N/A	Real data movement + ASK redirection latency during migration
Hot-shard behavior	Single thread caps one key	Cluster spreads keys, but a single hot key/slot still bottlenecks one node
Maximum shards (practical)	1 master	Hundreds of masters; gossip overhead grows with node count
Resharding without downtime?	N/A	Yes, online; clients follow MOVED/ASK during migration
Cross-shard query support	Full (one node)	Single-key always; multi-key only within a slot, else CROSSSLOT error

Why 16384 slots: the slot bitmap is gossiped in every heartbeat (2 KB), cheap enough to exchange constantly across a large cluster while still fine-grained for balancing. Decoupling key→slot→node is what lets you add a node by moving slots instead of rehashing the world.
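
The key-to-slot mapping, including the hash-tag rule, fits in a few lines of Python. This is a sketch: the function name is hypothetical, and it relies on the standard library's `binascii.crc_hqx` with an initial value of 0, which computes the CRC16 XMODEM variant used for Redis Cluster slots.

```python
import binascii

SLOT_COUNT = 16384

def key_hash_slot(key: str) -> int:
    """Slot for a key: CRC16 of the hash tag if one is present, else of the whole key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} counts as a hash tag; "{}" hashes the whole key.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOT_COUNT

# Keys sharing a tag land in one slot, so multi-key ops on them are allowed.
print(key_hash_slot("{user:1}:cart"), key_hash_slot("{user:1}:profile"))
print(key_hash_slot("user:1:cart"), key_hash_slot("user:1:profile"))
```

The first pair shares a slot because only `user:1` is hashed; the second pair is hashed on the full key and will generally land in different slots, which is what produces CROSSSLOT errors on multi-key commands.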

11. Replication

The data-copy story, including the consistency knobs people forget exist.

Dimension	Redis Replication
Topology	Leader-follower; replicas can have sub-replicas (replication tree) to cap master egress
Sync vs async	Asynchronous by default; writes ack before replicas confirm
Replication factor	Default 0 (standalone); typical 1-2 replicas per master; no hard max but egress grows
Consistency level options	`WAIT n ms` blocks for N replica acks; `min-replicas-to-write` refuses writes below a threshold; never linearizable
Replication lag (typical)	Sub-ms to low-ms intra-region under normal load; spikes on big-key writes and full resync
Conflict resolution	None needed for single-leader; Enterprise Active-Active uses CRDTs for multi-master merge
Cross-region replication	Async replica or Active-Active (Enterprise, CRDT-based); cross-region lag is real and lossy on failover
Replication during partition	Master keeps serving (AP-leaning); isolated writes on a minority master are discarded after failover unless min-replicas blocks them
Resync efficiency	PSYNC partial resync from the replication backlog buffer avoids full RDB transfer on brief disconnects

12. Better Usage Patterns (where PE depth shows)

The patterns most teams get wrong, the better way, and why it compounds at scale. This is the table to read twice.

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Listing keys	Run `KEYS pattern*` in production	Use `SCAN` with a cursor, or maintain an index Set	KEYS is O(N) over the whole keyspace and blocks every client; one KEYS on a big DB is an instant p99 incident
Messaging	Reach for Pub/Sub by default	Default to Streams; use Pub/Sub only when loss is truly fine	Pub/Sub silently drops on disconnect; teams discover the gap only when a consumer was down during an important event
Queues	Lists + manual ack hacks	Streams with consumer groups, PEL, and `XCLAIM` for abandoned work	Hand-rolled ack/retry on Lists reimplements Streams badly and loses jobs on crash
Cache expiry	Identical TTL on a batch of keys	Add random jitter to TTLs (e.g. base ± 10%)	Synchronized expiry causes a thundering herd on the backing store at the TTL boundary
Cache miss handling	Every miss hits the DB independently	Single-flight / request coalescing so one miss populates for all waiters	A hot key expiring under load sends thousands of concurrent DB queries (stampede)
Big deletes	`DEL bigkey` on a 10M-element collection	`UNLINK` for lazy background free	DEL frees memory synchronously on the main thread, blocking everyone for the duration
Conditional updates	GET, decide in app, SET (race window)	Lua/Functions or `WATCH`+MULTI for atomic check-and-set	The read-modify-write gap lets concurrent clients clobber each other; classic lost-update bug
Memory modeling	One key per data point (1M keys)	Group into hashes; small ones use compact listpack encoding	Per-key overhead (~80 bytes) dominates for tiny values; hashes cut memory by an order of magnitude
Hot counters	Single `INCR` key for a viral object	Shard into N sub-counters, sum on read	One hot key can't be spread by cluster; sub-key sharding distributes the write load across slots
Persistence as durability	Assume RDB/AOF means "we never lose writes"	Treat persistence as DR/fast-restart; use replication for availability, `WAIT` to narrow (not close) the loss window, and a real DB for true durability	Conflating the two leads to using Redis as a system of record for money, then losing acknowledged writes on failover
Lua scripts	Long inline EVAL with loops over data	Tiny deterministic Functions, named and versioned; offload heavy work	A slow script freezes the whole instance; before effects replication became the only mode (Redis 7.0), non-deterministic scripts could also diverge replicas and AOF replay
Probabilistic types	Use HLL/Bloom then later need the actual members	Decide up front whether you'll ever need enumeration; if maybe, keep an exact structure too	HLL/Bloom discard members by design; retrofitting exactness means you lost the data you summarized
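The "Cache expiry" and "Cache miss handling" rows can be sketched together in a few lines. This is a minimal illustration, not a definitive implementation: `jittered_ttl` and `SingleFlight` are hypothetical names, the coalescing is per-process only (it does not coordinate across application servers), and the loader is whatever function reads your backing store.

```python
import random
import threading
from typing import Any, Callable, Dict, Optional


def jittered_ttl(base_seconds: int, spread: float = 0.10) -> int:
    """Scale base_seconds by a random factor in [1 - spread, 1 + spread]."""
    factor = 1.0 + random.uniform(-spread, spread)
    return max(1, round(base_seconds * factor))


class _Call:
    """State shared between the one loading thread and its waiters."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.value: Any = None
        self.error: Optional[BaseException] = None


class SingleFlight:
    """Coalesce concurrent loads of one key into a single loader call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def do(self, key: str, loader: Callable[[], Any]) -> Any:
        # Decide under the lock whether this thread leads or follows.
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call

        if leader:
            try:
                call.value = loader()  # only the leader touches the backing store
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()
        else:
            call.done.wait()  # followers block until the leader finishes

        if call.error is not None:
            raise call.error
        return call.value
```

Assuming a redis-py client `r`, the leader's loader would typically end with something like `r.set(key, value, ex=jittered_ttl(3600))`, so the repopulated keys do not all expire on the same second. Waiters that arrive after the leader has finished start a new flight, which is acceptable because by then the cache is expected to be warm.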

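The "Hot counters" row above can be sketched as follows. This is a hedged illustration under stated assumptions: the key layout `base:<shard index>` and the function names are hypothetical, and `FakeRedis` is an in-memory stand-in that only mimics the two calls used here (`incrby` and `get`, which a redis-py client also exposes) so the example runs without a server.

```python
import random
from typing import Dict, List, Optional


class FakeRedis:
    """In-memory stand-in exposing only the two calls used below."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}

    def incrby(self, key: str, amount: int = 1) -> int:
        self.data[key] = self.data.get(key, 0) + amount
        return self.data[key]

    def get(self, key: str) -> Optional[int]:
        return self.data.get(key)


def shard_keys(base: str, shards: int) -> List[str]:
    """All sub-counter keys. No hash tag is used, so a cluster can
    place the sub-keys in different slots."""
    return [f"{base}:{i}" for i in range(shards)]


def incr_sharded(client, base: str, shards: int, amount: int = 1) -> None:
    """Write to one randomly chosen sub-counter."""
    client.incrby(f"{base}:{random.randrange(shards)}", amount)


def read_sharded(client, base: str, shards: int) -> int:
    """Sum every sub-counter; missing shards count as zero."""
    values = (client.get(key) for key in shard_keys(base, shards))
    return sum(int(v) for v in values if v is not None)
```

The trade-off is on the read side: a read costs N lookups instead of one, and the sum is not a point-in-time snapshot while writes are in flight, which is usually acceptable for like or view counts and not for inventory.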
13. Advanced / Next-Gen Alternatives

Where to look when Redis hits a wall, and how mature each option is. Includes the 2024-2026 fork/competitor landscape you should know for any interview.

Maturity legend: Production-ready, Emerging, Early

Alternative	What It Improves	Maturity	Migration Cost	When To Consider
Valkey (Linux Foundation fork)	BSD license, no copyleft/commercial strings; AWS/Google/Oracle backed; wire-compatible with Redis 7.2	Production	Near-zero, drop-in for Redis 7.2-era features (no Vector Sets)	You want true open-source licensing and the hyperscaler-backed path; many managed services moved here
Dragonfly	Multi-threaded, shared-nothing core; far higher single-node throughput and memory efficiency; fork-free snapshots	Production	Drop-in (Redis/Memcached API compatible), but verify command coverage and cluster semantics	One node's single-core ceiling is your bottleneck and you'd otherwise run many shards on one big box just to use the cores
KeyDB	Multi-threaded fork of Redis with fine-grained locking; active-replica multi-master	Production	Drop-in; lock model differs, watch atomicity edge cases	You want multi-core on a Redis-compatible engine without changing your data model
Kafka / Redpanda	Durable, long-retention, high-throughput log; partitioned consumer groups at massive scale	Production	High: different model (offsets, partitions, brokers), new ops	Streams outgrows you: weeks of retention, multi-DC, exactly-once pipelines, very high throughput
RabbitMQ	Rich broker semantics: exchanges, complex routing, transactional messaging, AMQP guarantees	Production	High: full broker, different protocol and ops	You need real broker routing/guarantees that Pub/Sub and Streams don't provide
Dedicated vector store (Milvus, Qdrant, or pgvector on Postgres)	Billion-scale ANN, rich metadata filtering, mature tooling vs beta Vector Sets	Production	Medium: new system + embedding pipeline integration	Vector workload is core and large; Vector Sets' beta status or feature set is limiting
Redis Query Engine (in-core, Redis 8)	Full-text + vector + numeric + geo querying with secondary indexes, all inside Redis	Production	Low: same Redis, add index definitions	You need richer querying than raw structures and want to stay in Redis; a more mature option than the beta Vector Sets type
Redis-on-Flash (Enterprise)	Tiered RAM+SSD so datasets exceed RAM economically	Production	Medium: enterprise offering, latency trade-off	Working set is too big for pure RAM but you want Redis semantics
Active-Active CRDT (Enterprise)	True multi-region multi-master with conflict-free merge	Emerging	Medium-high: CRDT semantics change app assumptions	Multi-region writes with automatic conflict resolution are a hard requirement
Momento / serverless cache	Zero-ops managed caching, pay-per-use, no cluster management	Emerging	Low-medium: API differs from raw Redis	You want caching without operating Redis at all

Interview-relevant context: in March 2024 Redis left BSD for RSALv2/SSPLv1, triggering the Valkey fork (Linux Foundation, BSD, backed by AWS/Google/Oracle). Redis 8 (8.0 GA in May 2025) added AGPLv3 as a third option and merged the former Stack modules (Search, JSON, TimeSeries, probabilistic) into the open-source core, alongside the new Vector Sets type. Knowing this saga signals you track the ecosystem, not just the API.
