Redis Features — PE Trade-Offs Analysis

Every Redis capability you listed, analyzed at Principal Engineer depth: Pub/Sub, Streams, Task Queues, Atomic Counters, Sorted Sets, Transactions, Lua, Bloom Filters, Geospatial, Vector Sets, HyperLogLog, Caching, plus the four "role" uses (primary NoSQL DB, streaming engine, message broker, cache). The headline you asked for is the "When To Use Which Feature" decision matrix up top, followed by the eight mandatory deep tables.

In-Memory Datastore · Data Structures · L6/L7 Depth

As of 2026-05-21 · Redis 8.x (8.0 GA May 2025, modules merged into core, Vector Sets native/beta, tri-licensed AGPLv3 / RSALv2 / SSPLv1). Core behavior also applies to Valkey 8.x, except Vector Sets and the merged module data types.

PE Verdict — the one thing to internalize

Redis features are not interchangeable; each is a different data structure with a different cost curve, and choosing wrong shows up as either silent memory blowup or a blocked main thread. The recurring trap: people reach for Pub/Sub when they need durable delivery (use Streams), use Lists as a queue when they need consumer groups and acks (use Streams), and treat Redis as a primary database when its async replication and lossy failover make it a fast cache with persistence, not a system of record. Match the structure to the access pattern, and never forget every command runs on one shared thread.

Best default choices

01. Overview — What Redis Is and Why It Exists

Redis (REmote DIctionary Server) is an in-memory data structure server, not a cache, not a database, not a queue, though it is used as all three. Created by Salvatore Sanfilippo in 2009 to solve a specific pain (a real-time web analytics dashboard hitting MySQL too hard), it has become the default substrate for sub-millisecond data access across the industry. The defining design choice: the server holds the data structures, not just the bytes, so clients push operations to the data instead of round-tripping for every step.

What it is (technical)

  • In-memory key/value store where values are rich data structures: strings, hashes, lists, sets, sorted sets, streams, bitmaps, HyperLogLogs, geospatial indexes, and (Redis 8) vectors and JSON.
  • Single-threaded command execution on an event loop (epoll/kqueue), with I/O threads (Redis 6+) parallelizing socket reads/writes and protocol parsing.
  • RESP protocol over TCP: a simple, human-readable, binary-safe text protocol designed for fast parsing and pipelining.
  • Optional durability via RDB snapshots, AOF append-log, or hybrid; optional HA via replication + Sentinel; optional scale via Redis Cluster (16384 hash slots).
  • Tri-licensed as of Redis 8 (May 2025): AGPLv3, RSALv2, or SSPLv1. Valkey is the BSD-licensed Linux Foundation fork from Redis 7.2.

What it is for

  • Caching in front of slower stores (the original and still-dominant use).
  • Session and ephemeral state at web scale: presence, rate limits, feature flags.
  • Real-time computation: leaderboards, counters, dedup, top-K, cardinality, geospatial proximity.
  • Lightweight messaging: Pub/Sub for broadcast, Streams for durable consumer groups.
  • Coordination primitives: distributed locks, atomic counters, sequence generators.
  • Vector similarity + RAG: Vector Sets and Redis Query Engine for AI/ML workloads alongside operational data.

Why it is needed (the problem it solves)

Disk-based databases are too slow for the read path of modern apps. A Postgres query at ~5–50 ms is fine for transactional writes but unacceptable for the 50–200 reads per user request that a modern web app makes (session lookup, rate check, feature flags, personalization, recommendations). Caching solved the latency side, but a flat key-value cache forces every multi-step operation to round-trip: read the score, increment it, write it back, with a race window in the middle. Concurrency bugs followed.

Redis collapses that round trip by moving the data structures into the server itself. "Increment this counter and check the result" is one network call (INCR) instead of three. "Add this member to a leaderboard and return its rank" is two commands (ZADD, then ZRANK) that can be pipelined into one round trip, or wrapped in MULTI/EXEC or a script when the pair must be atomic. Because every command runs single-threaded against in-memory state, a single command has no race window, no lock to acquire, and no transaction to coordinate. Atomicity is the byproduct of how the engine is built, not a feature you opt into.

The second problem Redis solves is that not all data fits the row-and-column model. A leaderboard is a sorted set. A presence list is a set. An event log is a stream. A "have we seen this device ID" check is a Bloom filter. Forcing these into SQL tables either wastes memory or makes queries slow. Redis exposes each as a first-class server-side type with operations matched to its shape.

The defining choice: single thread + RAM

Every other Redis property descends from this. Single-threaded execution means atomicity without locks. RAM means commands finish in microseconds, so the single thread is rarely the bottleneck and does not starve clients. The two reinforce each other: a single-threaded disk database would be absurd; a single-threaded memory database is the simplest design that gives you both speed and correctness. Everything Redis "cannot do" (working set must fit RAM, one slow command blocks all clients, async replication can lose writes on failover) is a direct cost of this choice. Everything Redis is loved for (sub-ms p99, race-free counters, predictable behavior) is a direct benefit.

02. Core Concepts — The Mental Model

The vocabulary you need to speak Redis fluently in design review. Each concept here is something the engine actually does at runtime; understanding them is the difference between using Redis and reasoning about how it will behave under load.

Keyspace and databases

A Redis instance has a flat keyspace: every key is a string in a single global namespace. Logical separation comes from numbered databases (0-15 by default) via SELECT, but this is a legacy feature. Cluster mode supports only DB 0, and even in standalone mode the community guidance is to use key prefixes (user:123:cart) rather than separate DBs. The keyspace itself is a hash table with incremental rehashing: when load factor crosses a threshold, Redis allocates a new bigger table and moves keys gradually over many operations so no single command pays the full rehash cost.
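To make incremental rehashing concrete, here is a minimal Python sketch. It is a toy model, not the Redis implementation: the class name `IncrementalDict`, the one-bucket-per-operation step, and the load-factor trigger of 1 are all assumptions chosen for illustration.

```python
class IncrementalDict:
    """Toy hash table that migrates one bucket per operation during a resize."""

    def __init__(self, size=4):
        self.old = [[] for _ in range(size)]  # table being read (and drained mid-rehash)
        self.new = None                       # bigger table, present only mid-rehash
        self.cursor = 0                       # next old bucket to migrate
        self.count = 0

    def _rehash_step(self):
        if self.new is None:
            return
        # Move a single bucket so no one operation pays the full rehash cost.
        for key, value in self.old[self.cursor]:
            self.new[hash(key) % len(self.new)].append((key, value))
        self.old[self.cursor] = []
        self.cursor += 1
        if self.cursor == len(self.old):      # migration finished: swap tables
            self.old, self.new, self.cursor = self.new, None, 0

    def _tables(self):
        return [t for t in (self.old, self.new) if t is not None]

    def set(self, key, value):
        self._rehash_step()
        if self.new is None and self.count >= len(self.old):  # assumed trigger
            self.new = [[] for _ in range(len(self.old) * 2)]
        for table in self._tables():          # update in place if the key exists
            bucket = table[hash(key) % len(table)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)
                    return
        target = self.new if self.new is not None else self.old
        target[hash(key) % len(target)].append((key, value))  # new keys go to the new table
        self.count += 1

    def get(self, key):
        self._rehash_step()
        for table in self._tables():          # a key lives in exactly one table
            for k, v in table[hash(key) % len(table)]:
                if k == key:
                    return v
        return None
```

The point of the design is that lookups must consult both tables while a resize is in flight, which is the price paid for never stalling on a full-table copy.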

The RESP protocol (Redis Serialization Protocol)

The wire format is deliberately simple: requests are arrays of bulk strings, and RESP2 responses are one of five types (simple string, error, integer, bulk string, array). It is text-based but binary-safe (bulk strings carry an explicit length). Why it matters: parsing cost is tiny, which lets the I/O threads handle millions of small messages with little CPU. RESP3 (Redis 6+) added richer reply types (maps, sets, doubles, booleans), attribute metadata, and out-of-band push frames, which are what enable server-assisted client-side caching.
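The framing described above is small enough to sketch in a few lines of Python. This is an illustrative RESP2 encoder and decoder only; the function names `encode_command` and `decode_reply` are hypothetical, and a real client would also handle partial reads from the socket.

```python
def encode_command(*args) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # explicit length = binary-safe
    return b"".join(out)


def decode_reply(buf: bytes):
    """Decode one complete RESP2 reply; returns (value, remaining_bytes)."""
    line, _, rest = buf.partition(b"\r\n")
    kind, payload = line[:1], line[1:]
    if kind == b"+":                       # simple string
        return payload.decode(), rest
    if kind == b"-":                       # error
        raise RuntimeError(payload.decode())
    if kind == b":":                       # integer
        return int(payload), rest
    if kind == b"$":                       # bulk string (length-prefixed)
        n = int(payload)
        if n == -1:
            return None, rest
        return rest[:n], rest[n + 2:]
    if kind == b"*":                       # array of nested replies
        n = int(payload)
        if n == -1:
            return None, rest
        items = []
        for _ in range(n):
            item, rest = decode_reply(rest)
            items.append(item)
        return items, rest
    raise ValueError("unknown RESP type byte: %r" % kind)
```

For example, `encode_command("SET", "k", "v")` yields `*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n`; the length prefix is what lets a bulk string carry `\r\n` inside its payload.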

The event loop and single-threaded execution

One thread, one event loop using epoll (Linux) or kqueue (BSD). On each iteration: poll ready file descriptors, read commands from ready client sockets, execute them one at a time against the in-memory data structures, write responses back. No two commands ever interleave. This is why INCR is atomic without a lock, why SETNX is the canonical distributed-lock acquire, and why a single long-running command (KEYS *, a huge ZRANGE, an unbounded Lua script) stalls every other client for its entire duration.
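The head-of-line blocking this implies can be shown with simple arithmetic. The sketch below models a single executor that runs queued commands strictly in order; the function name `completion_times` and the service times are made-up illustrative values, not measurements.

```python
def completion_times(commands):
    """commands: list of (name, service_time_us), run in order on one thread.

    Returns the time in microseconds at which each command finishes.
    """
    clock = 0
    finished = {}
    for name, cost in commands:
        clock += cost            # nothing else can run while this command executes
        finished[name] = clock
    return finished


queue = [("GET a", 5), ("KEYS *", 200_000), ("GET b", 5)]
print(completion_times(queue))   # "GET b" waits behind the 200 ms scan
```

A 5 µs read queued behind a 200 ms scan completes at roughly 200 ms, which is why one slow command shows up as a latency spike for every client.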

Objects, types, and encodings

Every value is wrapped in a robj (Redis object) header that carries its logical type (string, hash, list, etc.) and its physical encoding. Small collections use compact encodings (listpack for small hashes/lists/zsets, intset for small integer-only sets) that pack data into a single contiguous buffer for cache efficiency. When a structure grows past size/element thresholds it transparently upgrades to a full hash table or skip list. Knowing this is how you understand why packing 1000 fields into small hashes is dramatically cheaper than storing the same fields as 1000 separate top-level keys.

Expiration: lazy + active

TTLs are stored in a separate expires dict keyed by the same key. Two reclamation mechanisms run together: passive (when a client accesses a key, check expiry then delete if past TTL) and active (a background cycle samples keys from the expires dict, deletes the expired ones, and if the sampled expired fraction is high, repeats). No per-key timers, no full keyspace scan. The cost: a key past its TTL but never read again sits in RAM until the active cycle reaches it; under heavy short-TTL workloads this lag can push you toward maxmemory.
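A minimal Python sketch of the two reclamation paths follows. It is a toy model under stated assumptions: the class `TTLStore`, the sample size of 20, and the 25% repeat threshold are illustrative choices, and the real active cycle is also bounded by a time budget that this sketch omits.

```python
import random


class TTLStore:
    def __init__(self):
        self.data = {}
        self.expires = {}        # key -> absolute deadline, kept apart from the data

    def set(self, key, value, ttl=None, now=0.0):
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = now + ttl

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key, now):
        # Passive expiry: the deadline is checked only when the key is touched.
        deadline = self.expires.get(key)
        if deadline is not None and deadline <= now:
            self._delete(key)
            return None
        return self.data.get(key)

    def active_cycle(self, now, sample_size=20, threshold=0.25, rng=random):
        # Active expiry: sample TTL-bearing keys, delete the expired ones,
        # and repeat while the expired fraction of the sample stays high.
        removed = 0
        while self.expires:
            sample = rng.sample(list(self.expires), min(sample_size, len(self.expires)))
            expired = [k for k in sample if self.expires[k] <= now]
            for k in expired:
                self._delete(k)
            removed += len(expired)
            if len(expired) / len(sample) <= threshold:
                break
        return removed
```

Sampling only from the expires dict is what keeps the cycle cheap: keys without a TTL are never examined, and the loop stops as soon as a sample looks mostly live.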

Persistence: RDB, AOF, hybrid

RDB is a point-in-time snapshot written by a forked child (copy-on-write), small and fast to load, but loses everything since the last save. AOF is an append-only log of write commands, replayed on restart, with appendfsync tunable from always (per-write fsync, slow) to everysec (~1s loss window, the practical default) to no (OS-flushed). Hybrid (modern default for AOF rewrites) writes an RDB preamble plus AOF tail: fast restart with bounded loss. None of this gives you the durability of a true ACID database, even with appendfsync always, because failover is async-replicated and lossy.

Replication: leader-follower with PSYNC

Async replication. On first connect, the master forks an RDB and ships it; subsequent writes stream over the replication link. PSYNC (partial resync) lets a briefly-disconnected replica catch up from a circular replication backlog buffer on the master, avoiding a full RDB transfer on every network blip. Replicas are read-only by default and serve reads for scale, but they are eventually consistent: a write to the master is visible on replicas after the link latency, and on failover the unshipped tail is lost.

Sentinel vs Cluster

Two failover models. Sentinel is a separate quorum of monitor processes that watch a master-replica set, agree it has failed (avoiding one bad observer triggering a false failover), promote a replica, and reconfigure the others. Redis Cluster partitions the keyspace into 16384 hash slots distributed across master nodes, with replicas attached to each master; failover is handled by the cluster itself via gossip and master-quorum voting. Sentinel is simpler when one node holds the dataset; Cluster is required when you need to exceed one node's RAM or CPU.

Hash slots and the same-slot rule

In Cluster mode, slot = CRC16(key) mod 16384. Each master owns a set of slots, usually assigned as contiguous ranges at cluster creation but free to fragment after resharding. Multi-key commands (MGET, MULTI, Lua touching multiple keys) only work if all keys land in the same slot. The escape hatch is hash tags: when a key contains a non-empty {...} section, only the substring between the first { and the next } is hashed, so user:{42}:cart and user:{42}:profile co-locate. Designing your key space for co-location is a one-time cost that pays back forever; retrofitting it after cluster migration is painful.
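The slot computation can be reproduced in plain Python, which is handy for checking co-location before keys ever reach a cluster. This is a sketch: the function names are hypothetical, and it assumes the CRC16 variant is CCITT/XMODEM (polynomial 0x1021, initial value 0), which is what the cluster specification describes.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Return the cluster slot for a key, honoring the hash-tag rule."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # ignore an empty "{}" tag
            data = data[start + 1:end]       # hash only the tag contents
    return crc16_xmodem(data) % 16384


# Keys sharing the tag {42} land in the same slot.
print(key_hash_slot("user:{42}:cart") == key_hash_slot("user:{42}:profile"))
```

Running this over a sample of production key names is a cheap way to spot multi-key operations that would fail with CROSSSLOT after a move to cluster mode.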

Memory ceiling and eviction

maxmemory caps total RAM. When the cap is hit, the configured maxmemory-policy decides: reject writes (noeviction, correct for a primary store), evict approximate LRU/LFU across all keys (allkeys-*, correct for a cache), or evict only TTL-bearing keys (volatile-*, correct for mixed workloads). Eviction is approximate: Redis samples N keys (default 5) and evicts the best candidate from the sample, trading a little accuracy for not maintaining a globally sorted structure on the hot path.
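The sampling idea can be sketched in Python as follows. It is a simplified model: the function `evict_one` and the `last_access` dict are hypothetical stand-ins, and the sketch omits refinements of the real implementation such as carrying good candidates over between sampling rounds.

```python
import random


def evict_one(last_access, now, samples=5, rng=random):
    """Approximate LRU: sample a few keys and evict the one idle the longest.

    last_access maps key -> timestamp of its most recent access.
    """
    candidates = rng.sample(list(last_access), min(samples, len(last_access)))
    victim = max(candidates, key=lambda k: now - last_access[k])
    del last_access[victim]
    return victim


clock = {"a": 1, "b": 5, "c": 3}
print(evict_one(clock, now=10))   # sample covers all three keys, so the oldest goes
```

With a sample of 5 the victim is usually, but not always, close to the true least-recently-used key; raising the sample count improves accuracy at the cost of more work per eviction.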

03. Architecture — How the Pieces Fit

A Redis instance is a single process. One thread executes commands against in-memory structures. I/O threads parallelize the kernel network stack. A forked child handles persistence. Replicas receive a write stream. In Cluster mode, every node also runs a gossip protocol on a separate bus. This is the picture an L7 candidate should be able to draw in under two minutes.

[Figure: Redis architecture diagram. Clients (RESP over TCP/TLS) feed the I/O Threads, which feed the Main Thread operating on the in-RAM Keyspace; alongside sit the Modules (merged in Redis 8), the Fork child writing to Disk, the Replicas, the Cluster Bus, and the failover control plane (Sentinel quorum or cluster gossip + master vote). The component legend follows.]
Main Thread (red). The sacred one. All command execution, all mutations, all atomicity guarantees. One slow command here stalls every client. The entire engine is designed around keeping this thread fast and never blocked.
I/O Threads (blue). Added in Redis 6, expanded in 8. Parallelize the expensive kernel work (socket I/O, RESP parsing) while leaving command execution serial. This is how you get multi-core network throughput without losing single-thread atomicity.
Keyspace (gray). The in-RAM hash table holding all keys, with a parallel expires dict for TTLs. Small structures use compact listpack/intset encodings; large ones promote to full hash tables and skip lists. Memory is the ceiling.
Modules (teal, dashed). Once-separate Stack modules (Search, JSON, TimeSeries, Bloom, Cuckoo, TopK, CMS, TDigest, Vector Sets) are merged into the Redis 8 core as native data types. No more loadmodule directives for these features.
Fork child (teal). Persistence runs in a forked subprocess that sees a copy-on-write view of memory. The fork itself stalls the main thread proportional to heap size (page table copy), and COW can spike RSS during the save.
Replicas (purple). Async PSYNC stream. Partial resync from a circular backlog buffer avoids a full RDB transfer on every brief disconnect. Replicas are read-only; failover loses the unshipped tail.
Cluster Bus (gray). A separate TCP port for node-to-node gossip in Cluster mode: PING/PONG carrying topology, slot ownership, and failure suspicions. Failure detection by majority vote, no central coordinator.
Failover layer (dashed). Either Sentinel (standalone topologies) or the cluster's own gossip + master-quorum vote (cluster mode). Detects master failure, promotes a replica, reconfigures clients.

The architectural principle to internalize: the main thread is sacred, and everything that can be moved off it has been. I/O off the main thread (I/O threads). Persistence serialization off the main thread (forked child). Big-key freeing off the main thread (lazy free, UNLINK). Cluster topology gossip off the client data path (separate bus port). Any time you propose a feature that would do work on the main thread, the Redis design instinct is to ask "can this be deferred, sampled, or moved to a background helper?" That instinct is what keeps p99 in microseconds.

04. Features — The Thirteen That Define Redis

Each feature explained as a PE would explain it on a whiteboard: what it is for (the access pattern it serves), why it is needed (the problem it solves that no plain key-value can), a real command example, and the gotcha that separates "I read the docs" from "I have run this in production."

01 In-Memory Storage (The foundation)
What it is for

Holding the entire working dataset in RAM, so every read and write completes in microseconds without touching disk. The dataset is structured by type (string, hash, list, set, zset, stream, etc.) and accessed through type-specific commands, not generic blob reads.

Why it is needed

Disk I/O is the latency wall. A spinning disk is milliseconds per random seek; even NVMe is tens of microseconds. RAM is hundreds of nanoseconds. For workloads that read the same data many times per request (sessions, rate limits, feature flags, leaderboards, top-K lookups), holding the working set in RAM is the only path to sub-ms p99. The trade is real: working set must fit in memory, and RAM is roughly 100× the cost of disk per byte, so you size aggressively and lean on eviction or sharding when the set outgrows one node.

# sub-millisecond point reads
SET user:42:session "<token>" EX 3600     # 1 hour TTL
GET user:42:session                       # ~50µs round-trip

# inspect memory layout
MEMORY USAGE user:42:session
INFO memory                               # used_memory, fragmentation

PE nuance

Per-key overhead is real: ~50–90 bytes of metadata (dict entry, robj header, SDS header, expire entry) before your value. A billion tiny keys is ~80 GB of overhead alone. The PE move is to model data as a few large structures (hashes, zsets, streams) instead of many tiny keys; small ones get the listpack/intset compact encoding and pay almost no overhead.

02 Flexible Persistence (RDB · AOF · Hybrid)
What it is for

Surviving a process restart or crash without losing the working set. Three modes you mix to taste: RDB (periodic point-in-time snapshot), AOF (append-only command log), and hybrid (RDB preamble + AOF tail, the modern default during AOF rewrites).

Why it is needed

RAM is volatile. Without persistence, a restart starts from an empty dataset, which means a cold cache stampede on your backing store and minutes-to-hours of pain for sessions, rate limits, and any "ephemeral" data your app secretly depends on. Persistence is what turns Redis from "a cache that forgets" into "a cache that recovers." Critically, this is disaster recovery, not durability: even with appendfsync always, async replication can lose unshipped writes on failover.

# RDB: snapshot if 1000+ keys changed in 60s, or 10+ keys changed in 300s
save 60 1000
save 300 10

# AOF: append every write, fsync once per second
appendonly yes
appendfsync everysec        # sane default; ~1s loss window

# Modern hybrid rewrite (RDB preamble + AOF tail)
aof-use-rdb-preamble yes

# Trigger ops
BGSAVE                          # non-blocking RDB
BGREWRITEAOF                    # compact AOF
LASTSAVE                        # unix timestamp of last save

PE nuance

Both RDB and AOF rewrite call fork(). Copy-on-write means the child sees a frozen view, but on a large heap under heavy writes COW can balloon RSS by tens of GB during the save, and the fork itself stalls the main thread for ms-to-seconds (page table copy). On a memory-pressured box, the fork can trigger the OOM killer during what was supposed to be a safety operation. Watch latest_fork_usec, leave RAM headroom, and prefer running saves on a replica.

03 Diverse Data Structures (The product)
What it is for

Server-side data types matched to the shape of the problem: strings for blobs and counters, hashes for objects, lists for queues and timelines, sets for unique membership, sorted sets for ranking, streams for event logs, plus bitmaps, HyperLogLogs, geospatial indexes, and (Redis 8) JSON, TimeSeries, and Vector Sets. Commands are matched to each type's natural operations.

Why it is needed

A plain key-value store forces every multi-step operation through the network. "Add to leaderboard and return rank" becomes read-update-write, with a race window. "Get top-10 nearby drivers" becomes scan-all-then-sort in the client. By moving the structure into the server, Redis turns these into single atomic commands. This is why Redis is called a data structure server, not a cache. The structures are the product; the in-memory storage is just what makes them fast.

# Hash: store a user object compactly
HSET user:42 name "Alice" tier "gold" joined 1717
HGETALL user:42

# Sorted Set: real-time leaderboard
ZADD leaderboard 9500 alice 8200 bob 9800 carol
ZREVRANGE leaderboard 0 9 WITHSCORES     # top 10
ZREVRANK leaderboard alice               # 0-based rank from the top (ZRANK counts from the bottom)

# Stream: durable event log with consumer groups
XGROUP CREATE events analytics $ MKSTREAM     # create the group first; $ = deliver only new entries
XADD events * type purchase user 42 amount 99
XREADGROUP GROUP analytics worker-1 COUNT 10 STREAMS events >

PE nuance

Each type has a small-size encoding (listpack, intset) that flips to a "real" structure past a threshold. Tune those thresholds (hash-max-listpack-entries, zset-max-listpack-entries) for memory efficiency. The other instinct to develop: composition. Real PE designs combine 2-3 structures (a Geo set for location, a Stream for the event log, a Sorted Set for ranking) rather than forcing one structure to do everything.

04 Atomic Operations (Free correctness)
What it is for

Every single Redis command runs atomically against the keyspace. INCR is a race-free read-modify-write. SETNX is an atomic acquire-if-absent. ZADD is an atomic insert-or-update. No client-side locking, no compare-and-swap loop, no transaction overhead for single-key ops.

Why it is needed

The hardest bugs in distributed systems are races on shared counters and shared flags: two clients reading the same value, both incrementing locally, both writing back, one increment lost. The fix in a normal database is a row lock or an explicit CAS retry loop, both of which add latency and can deadlock. Redis sidesteps all of it because the single-threaded engine serializes every command. Atomicity is the byproduct of the architecture, not a feature you opt into. This is why Redis is the default rate-limiter, sequence generator, and distributed-lock primitive across the industry.

# atomic counter: no lock, no race
INCR page:home:views                  # 1
INCR page:home:views                  # 2
INCRBY wallet:42 -150                 # atomic debit

# single-instance lock acquire (Redlock is the multi-instance extension of this)
SET lock:order:99 "<uuid>" NX EX 10       # OK if not held

# atomic conditional update via ZADD flags
ZADD leaderboard GT 9500 alice        # only update if higher

PE nuance

The single-thread serialization is per-instance, not global. A single hot key (a viral counter, a celebrity leaderboard) cannot be spread across cluster nodes; it concentrates load on one shard's single thread. Mitigation: shard hot counters into N sub-keys client-side (counter:0 through counter:15), pick one per write, and sum on read. This trades read cost for write spread.
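The sub-key mitigation can be sketched as below. This is a model only: the class `ShardedCounter` is hypothetical, and an in-memory dict stands in for Redis so the example runs without a server. Comments mark where the real commands would go.

```python
import random
from collections import defaultdict


class ShardedCounter:
    """Spread writes for one logical counter across N sub-keys."""

    def __init__(self, name, shards=16, store=None, rng=random):
        self.keys = [f"{name}:{i}" for i in range(shards)]
        # Stand-in for Redis; a real client would issue commands instead.
        self.store = store if store is not None else defaultdict(int)
        self.rng = rng

    def incr(self, amount=1):
        key = self.rng.choice(self.keys)   # real call: INCRBY key amount
        self.store[key] += amount

    def value(self):
        # Real call: a pipeline of GETs. In cluster mode the sub-keys live in
        # different slots, so a single MGET would be rejected with CROSSSLOT.
        return sum(self.store[k] for k in self.keys)


views = ShardedCounter("page:home:views")
for _ in range(1000):
    views.incr()
print(views.value())   # 1000
```

Because the sub-keys hash to different slots, writes fan out across shards; the read path pays for it with N lookups instead of one.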

05 High Availability (Replication · Sentinel · Cluster)
What it is for

Surviving a node failure with bounded outage and bounded data loss. Three layers: asynchronous replication from master to one or more replicas (read scaling + warm standby), Sentinel for automatic failover in non-cluster topologies (quorum of monitors agrees on failure, promotes a replica), and Redis Cluster's built-in failover via gossip and master-quorum voting in sharded deployments.

Why it is needed

A single Redis instance is one process on one box. Process crash, box reboot, network partition, AZ failure, any of these and the dataset is unavailable for the duration. Modern systems demand seconds-level RTO. Replication gives you a warm copy to promote; Sentinel or Cluster makes the promotion automatic so a human is not the bottleneck at 3am. The honest framing: this gets you availability, not durability. Async replication means a promoted replica may be behind the dead master, so acknowledged writes can be lost on failover.

# replica config
replicaof master.internal 6379
replica-read-only yes

# refuse writes if too few replicas connected (shrink loss window)
min-replicas-to-write 1
min-replicas-max-lag 10            # seconds

# synchronous-ish: block until N replicas ack
SET critical:key "value"
WAIT 2 100                         # wait up to 100ms for 2 replicas

# Sentinel monitors a master
sentinel monitor mymaster 10.0.0.5 6379 2    # quorum=2
sentinel down-after-milliseconds mymaster 5000

PE nuance

Size the replication backlog buffer for realistic disconnect durations. The default is small (1 MB), and if the master writes more than the backlog holds while a replica is disconnected or restarting, the replica cannot partially resync and triggers a full resync: the master forks a new RDB and ships the whole dataset. On a busy 50 GB instance, this is a multi-minute event that spikes load right when you can least afford it. Set repl-backlog-size to cover your worst-case disconnect, typically tens to hundreds of MB.
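The sizing rule reduces to one multiplication, sketched here in Python. The function name `backlog_bytes`, the safety factor of 2, and the example write rate are assumptions for illustration; measure your own replication traffic before picking a number.

```python
def backlog_bytes(write_bytes_per_sec, worst_disconnect_sec, safety_factor=2.0):
    """Bytes of backlog needed to cover a disconnect without a full resync."""
    return int(write_bytes_per_sec * worst_disconnect_sec * safety_factor)


# Assumed workload: 2 MiB/s of replicated writes, 60 s worst-case disconnect.
size = backlog_bytes(2 * 1024 * 1024, 60)
print(size // (1024 * 1024), "MiB")   # 240 MiB
```

Under those assumed inputs the result would translate to a setting along the lines of `repl-backlog-size 240mb`.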

06 Horizontal Scalability (Redis Cluster · 16384 slots)
What it is for

Scaling beyond one node's RAM and one core's command throughput. Redis Cluster partitions the keyspace into a fixed 16384 hash slots and distributes them across master nodes. Each slot is a single integer in the range 0 to 16383, owned by exactly one master, with replicas attached to that master. Clients are cluster-aware and route directly to the owning node.

Why it is needed

A single Redis instance caps at the largest box you can buy: tens to hundreds of GB of RAM and one core for command logic. Past that, you must shard. Cluster gives you online resharding (move slots between nodes without downtime), automatic failover per shard, and partition tolerance per shard. The fixed 16384-slot count is a deliberate compromise: small enough to gossip a slot ownership bitmap cheaply (~2 KB per heartbeat), large enough to balance finely across hundreds of nodes.

# slot = CRC16(key) mod 16384
CLUSTER KEYSLOT user:42                # → e.g. 12539
CLUSTER NODES                          # topology + ownership
CLUSTER COUNTKEYSINSLOT 12539

# Hash tags force co-location into the same slot
MSET user:{42}:name Alice user:{42}:tier gold
# both keys hash on the tag contents (42) → same slot → MGET/MULTI work

MGET user:{42}:name user:{42}:tier      # OK
MGET user:42:name user:42:tier          # may fail with CROSSSLOT

# Online slot migration (run by orchestrator)
CLUSTER SETSLOT 12539 IMPORTING <source-node>   # sent to the target node first
CLUSTER SETSLOT 12539 MIGRATING <target-node>   # then sent to the source node

PE nuance

Cluster gives you more capacity, not magic. A single hot key still pins one shard. Design the key space for co-location up front using hash tags for keys that need to be touched together; retrofitting after cluster migration means rewriting client code and migrating data. And know that some multi-key features (transactions, Lua across keys) require same-slot keys, so cluster mode imposes data-modeling discipline you do not pay on a single instance.

07 Pub/Sub Messaging (At-most-once broadcast)
What it is for

Real-time broadcast of messages to all currently-connected subscribers of a channel. Publishers send to channels; subscribers receive everything published while their connection is open. Patterns (PSUBSCRIBE) allow wildcard subscriptions. Sharded Pub/Sub (Redis 7+) spreads channels across cluster shards by hash slot for fan-out beyond one node.

Why it is needed

Two classic patterns demand instant fan-out with zero storage: cache invalidation (a write happens, broadcast "key X changed" so every app instance can drop its local copy) and live notifications ("user is typing," presence updates, real-time dashboards). A persistent queue is overkill here because the consumers either receive the message live or it was never relevant. Pub/Sub gives you microsecond fan-out with effectively zero memory cost since nothing is stored.

# subscriber holds a long-lived connection
SUBSCRIBE cache:invalidate

# publisher
PUBLISH cache:invalidate "user:42"     # returns # subscribers reached

# pattern subscription
PSUBSCRIBE news.*                       # news.sports, news.tech, ...

# Sharded Pub/Sub (cluster mode, scales fan-out across shards)
SSUBSCRIBE orders:{shard1}
SPUBLISH orders:{shard1} "<event>"

PE nuance

Pub/Sub is at-most-once. If a subscriber is disconnected when a message is published, it never sees that message; if it is slow and its client output buffer exceeds the configured limit, Redis drops the connection and the pending messages are lost. There is no durable per-subscriber queue, no replay, no offset. The trap teams fall into: using Pub/Sub for anything where missing a message is a correctness bug. The instant you say "but what if the consumer was down," switch to Streams.
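The difference in delivery semantics can be modeled in a few lines of Python. Both classes are toy stand-ins with hypothetical names (`PubSub`, `Stream`); they imitate the behavior described above and do not talk to Redis.

```python
class PubSub:
    """Fire-and-forget broadcast: only currently attached inboxes get a message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, inbox):
        self.subscribers.append(inbox)

    def unsubscribe(self, inbox):
        self.subscribers.remove(inbox)

    def publish(self, msg):
        for inbox in self.subscribers:
            inbox.append(msg)
        return len(self.subscribers)      # like PUBLISH: receivers reached


class Stream:
    """Append-only log: consumers track an offset and can catch up later."""

    def __init__(self):
        self.log = []

    def add(self, msg):
        self.log.append(msg)
        return len(self.log)              # offset of the entry just written

    def read_after(self, last_seen):
        return self.log[last_seen:]


bus, inbox = PubSub(), []
bus.subscribe(inbox)
bus.publish("a")
bus.unsubscribe(inbox)                    # consumer goes offline
bus.publish("b")                          # nobody is listening: message is gone
bus.subscribe(inbox)
bus.publish("c")
print(inbox)                              # ['a', 'c']
```

A stream consumer that went offline after reading offset 1 would instead call `read_after(1)` on reconnect and receive everything it missed.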

08 Lua Scripting & Functions (Server-side atomic compute)
What it is for

Running multi-step read-compute-write logic atomically on the server in one round trip. EVAL runs an inline Lua script; SCRIPT LOAD+EVALSHA caches it; Functions (Redis 7+) register named, versioned, persistent script libraries that survive restarts and replicate naturally.

Why it is needed

MULTI/EXEC is isolation but no logic: it cannot read a value and decide what to write next. Application-side "read then write" has a race window. Lua collapses both: the entire script runs atomically on the main thread, with full access to read intermediate values and branch on them. This is how you implement correct distributed locks (release only if the token matches), conditional rate limiters (reset on hour boundary, decrement counter, reject if zero), and atomic transfer-with-balance-check, all in one network call.

# Atomic decrement-if-positive (rate limit)
EVAL "local v = redis.call('GET', KEYS[1])
       if not v or tonumber(v) <= 0 then return 0 end
       return redis.call('DECR', KEYS[1])" 1 quota:user:42

# Cached: load once, invoke by sha
SCRIPT LOAD "<same script>"            # → sha1
EVALSHA <sha1> 1 quota:user:42

# Functions: registered, named, versioned
FUNCTION LOAD "#!lua name=mylib
       redis.register_function('decr_if_pos', function(keys, args)
         local v = redis.call('GET', keys[1])
         if not v or tonumber(v) <= 0 then return 0 end
         return redis.call('DECR', keys[1])
       end)"
FCALL decr_if_pos 1 quota:user:42

PE nuance

A Lua script blocks the entire server for its full duration. A script that loops over a growing collection, calls into a slow command, or has an accidental infinite loop will freeze every client. Keep scripts short and O(small). Treat determinism as a design rule: pass the current time and any random seed in as arguments rather than reading them inside the script. On older versions that replicated scripts verbatim, non-deterministic scripts made replicas and AOF replay diverge; Redis 7+ replicates script effects instead, but deterministic scripts remain far easier to test and reason about. Prefer Functions over inline EVAL for anything you maintain: named, versioned, easier to operate.

09 Geospatial Processing (GEO commands on ZSET)
What it is for

Storing 2D points (lon, lat) keyed by name, and answering radius and bounding-box queries fast: "find all drivers within 5 km of this rider," "show stores in this map viewport," "rank nearby restaurants by distance." Backed under the hood by a sorted set keyed on a 52-bit geohash, which is why a query costs roughly O(log N) to locate the candidate range plus the number of points scanned inside it.

Why it is needed

Doing geo proximity in a generic database means either a quadtree/R-tree extension (PostGIS), a brute-force distance calculation over every row (does not scale), or a custom spatial index you maintain yourself. Redis bakes a good-enough spatial index into a familiar primitive (the sorted set) and exposes it through simple commands. For the very common case of "points and circles," this beats running a separate GIS database and the network hop to reach it.

# add drivers' current locations
GEOADD drivers -122.4194 37.7749 driver:1
GEOADD drivers -122.4783 37.8199 driver:2

# find drivers within 3km of a rider (modern unified command)
GEOSEARCH drivers FROMLONLAT -122.42 37.78 BYRADIUS 3 km ASC COUNT 10

# bounding box (rectangle in a map viewport)
GEOSEARCH drivers FROMLONLAT -122.42 37.78 BYBOX 5 5 km ASC

# distance and position lookups
GEODIST drivers driver:1 driver:2 km    # km between two members
GEOPOS drivers driver:1                 # back to lon/lat

PE nuance

It is a sorted set with geohash scores; every ZSET property applies (per-member memory cost, O(log N) operations, no polygon support). For "point in circle" and "point in box" it is excellent. For polygons, route networks, full GIS, or millions of moving points with complex filters, you want PostGIS or a dedicated geo store. Updating frequently-moving points (ride-hail drivers every few seconds) is fine; the underlying ZSET handles repeated GEOADD as score updates.
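To see why a geohash score makes proximity a range problem, here is a Python sketch of a 52-bit interleaved encoding. It is an approximation for illustration: the latitude limits, the 26-bits-per-axis split, and the bit order (latitude on even bits, longitude on odd bits) are assumptions about how the score is laid out, and the function names are hypothetical.

```python
LAT_MIN, LAT_MAX = -85.05112878, 85.05112878   # assumed latitude limits
LON_MIN, LON_MAX = -180.0, 180.0
STEP = 26                                      # bits per axis, 52 in total


def _quantize(value, lo, hi):
    """Map a coordinate to a 26-bit integer cell index."""
    cell = int((value - lo) / (hi - lo) * (1 << STEP))
    return min(cell, (1 << STEP) - 1)


def geohash52(lon, lat):
    """Interleave quantized lat/lon bits into one 52-bit integer score."""
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        raise ValueError("coordinates out of range")
    lat_bits = _quantize(lat, LAT_MIN, LAT_MAX)
    lon_bits = _quantize(lon, LON_MIN, LON_MAX)
    score = 0
    for i in range(STEP):
        score |= ((lat_bits >> i) & 1) << (2 * i)       # latitude on even bits
        score |= ((lon_bits >> i) & 1) << (2 * i + 1)   # longitude on odd bits
    return score


def common_prefix_bits(a, b):
    """Number of leading bits (out of 52) that two scores share."""
    return 52 - (a ^ b).bit_length()


sf_1 = geohash52(-122.4194, 37.7749)
sf_2 = geohash52(-122.4783, 37.8199)
tokyo = geohash52(139.6917, 35.6895)
print(common_prefix_bits(sf_1, sf_2) > common_prefix_bits(sf_1, tokyo))   # True
```

Nearby points share a long score prefix, so they sit close together in the sorted set, and a radius query becomes a handful of score-range scans followed by an exact distance filter. A 52-bit score also fits exactly in the double-precision float that sorted sets use for scores.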

10 Pluggable Modules (Native types in Redis 8)
What it is for

Extending Redis with new data types and commands implemented as dynamically loaded shared libraries. Historically this was how Redis Stack delivered Search, JSON, TimeSeries, Bloom/Cuckoo/TopK/CMS/TDigest, and Graph as separate modules. In Redis 8 (8.0 GA May 2025), these were merged into the open-source core as native data structures, with the exception of Graph, which was discontinued rather than merged; no loadmodule directives are needed for the merged types.

Why it is needed

The core Redis API can model many problems, but not all. Full-text search needs an inverted index. Semantic search needs an HNSW vector index. Time-series needs compressed downsampled storage. Probabilistic types need their own backing layout. Modules let these live inside Redis with first-class command sets, instead of forcing a separate system + network hop + operational team. The Redis 8 consolidation effectively says: these capabilities are core enough that everyone should have them by default.

# JSON (native in Redis 8)
JSON.SET user:42 $ '{"name":"Alice","tags":["gold","vip"]}'
JSON.GET user:42 $.tags                   # ["gold","vip"]
JSON.ARRAPPEND user:42 $.tags '"founder"'

# Time-series (native in Redis 8)
TS.CREATE temp:room:1 RETENTION 86400000       # 24h, in milliseconds
TS.ADD temp:room:1 * 21.4
TS.RANGE temp:room:1 - + AGGREGATION avg 60000     # 1-min avg

# Search + Query Engine (native in Redis 8)
FT.CREATE idx:users ON HASH PREFIX 1 user: SCHEMA name TEXT tier TAG
FT.SEARCH idx:users "@tier:{gold}" LIMIT 0 10

# If running older Redis with modules
loadmodule /path/to/redisearch.so       # pre-Redis 8
PE nuance

Modules run inside the Redis process and share its single thread, so a buggy or slow module command stalls everyone. Vet third-party modules carefully and watch their SLOWLOG behavior. The Redis 8 merge is a net positive — vendor lock-in via "you need Redis Stack" is gone, the modules are now AGPLv3 — but verify your deployment surface (Valkey, for instance, does not ship these; you would use the standalone module repos or stay on Redis).

11 Smart Cache Eviction: 8 policies · approximate LRU/LFU
What it is for

Defining what happens when memory hits maxmemory. Eight policies: noeviction (reject writes), allkeys-lru / allkeys-lfu / allkeys-random (evict from any key), volatile-lru / volatile-lfu / volatile-random / volatile-ttl (evict only TTL-bearing keys). The volatile family lets you mix "evictable cache" keys (with TTL) and "protected persistent" keys (no TTL) in one instance.

Why it is needed

RAM is finite. Without a defined ceiling behavior, an unbounded write workload will eventually OOM the process, taking everything down. Eviction turns that catastrophe into a graceful behavior you chose: silently drop cold cache entries (LRU/LFU for caches), refuse new writes (noeviction for a primary store), or selectively expire (volatile-* for mixed). It is the difference between a cache that degrades gracefully under pressure and one that falls over.

# cap memory and pick a policy (redis.conf directives, or CONFIG SET at runtime)
maxmemory 10gb
maxmemory-policy allkeys-lru        # pure cache
maxmemory-samples 10                # better accuracy, more CPU

# alternative: LFU mode tracks access frequency over time
maxmemory-policy allkeys-lfu
lfu-log-factor 10                   # counter compression
lfu-decay-time 1                    # minutes between decays

# alternative for a mixed workload: evict only keys with TTL set
maxmemory-policy volatile-lru
SET session:abc "<data>" EX 3600       # evictable
SET config:limits "<data>"             # protected (no TTL)

# inspect what's happening
INFO stats                          # evicted_keys, keyspace_misses
CONFIG GET maxmemory-policy
PE nuance

LRU/LFU are approximate, not exact. Redis samples N keys (default 5) and evicts the best candidate from the sample, because true LRU would cost memory and main-thread time on every access. Raising maxmemory-samples to 10 gives near-true LRU at modest CPU cost; this is the right tuning for most caches. Also note that noeviction with a writing client means errors: once the cap is hit, write commands are rejected with an OOM error while reads keep working. Match the policy to the use case explicitly.
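
The sampling idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it picks the oldest key from one random sample per eviction and ignores the candidate pool that real Redis carries between rounds. The function names (evict_one, avg_victim_rank) are hypothetical.

```python
import random

def evict_one(last_access, samples, rng):
    """Sample `samples` keys and evict the least recently used among them."""
    candidates = rng.sample(list(last_access), min(samples, len(last_access)))
    victim = min(candidates, key=lambda k: last_access[k])
    del last_access[victim]
    return victim

def avg_victim_rank(samples, n_keys=1000, trials=200, seed=7):
    """Average age rank of the evicted key (0 would be the true LRU key)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # key:i was last touched at time i, so key:0 is the oldest
        last_access = {f"key:{i}": i for i in range(n_keys)}
        victim = evict_one(last_access, samples, rng)
        total += int(victim.split(":")[1])
    return total / trials

rank_5 = avg_victim_rank(samples=5)
rank_10 = avg_victim_rank(samples=10)
```

In this model the average victim with 10 samples is noticeably older (closer to the true LRU key) than with 5, which is the accuracy-for-CPU trade that maxmemory-samples exposes.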

12 Transactions (MULTI/EXEC/WATCH): Isolation, not ACID
What it is for

Grouping multiple commands so they execute as one atomic batch with no other client's commands interleaved. MULTI opens the batch, queued commands accumulate, EXEC runs them all in order. WATCH adds optimistic concurrency: if any watched key changes between WATCH and EXEC, the EXEC aborts with nil, letting the client retry (check-and-set).

Why it is needed

Even with single-threaded execution, a sequence of commands sent by one client can be interleaved with commands from other clients. If you read a balance, compute a new value, and write it back as three separate commands, another client can write between your read and your write. MULTI/EXEC closes that gap by deferring execution until all the commands have arrived and running them as one indivisible block. WATCH adds the conditional layer for true read-then-write atomicity.

# Optimistic transfer: check balance, then debit + credit atomically
WATCH account:42
GET account:42                         # client checks balance >= 150; if not, UNWATCH and stop
MULTI
DECRBY account:42 150
INCRBY account:99 150
EXEC                                  # nil if account:42 changed since WATCH; retry from WATCH

# Cancel a pending transaction
MULTI
SET foo bar
DISCARD                               # clears the queue
PE nuance

This is the single most misunderstood Redis feature. MULTI/EXEC has no rollback. If command 3 of 4 fails at runtime (e.g., a type error: INCR on a list), commands 1, 2, and 4 still execute. A command rejected at queue time (such as a syntax error) is different: EXEC then discards the whole batch. "Atomic" here means "no interleaving," not "all-or-nothing." For true conditional atomic logic, use Lua or Functions: the script can read, branch, and write with no race window, and it can bail out before writing anything. Use WATCH+MULTI only for optimistic CAS, not for rollback semantics you do not have.
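
The no-rollback behavior is easy to see in a toy model. The Python sketch below is not a Redis client; it is a hypothetical in-memory stand-in (run_exec and the tuple command format are made up for illustration) that applies a queued batch the way EXEC does: in order, recording runtime errors without undoing earlier writes.

```python
def run_exec(store, queued):
    """Apply queued (op, key, arg) commands in order; return per-command results."""
    results = []
    for op, key, arg in queued:
        try:
            if op == "SET":
                store[key] = arg
                results.append("OK")
            elif op == "INCRBY":
                current = store.get(key, 0)
                if not isinstance(current, int):
                    raise TypeError("wrong type: value is not an integer")
                store[key] = current + arg
                results.append(store[key])
            else:
                raise ValueError(f"unknown command {op}")
        except (TypeError, ValueError) as err:
            # The error is reported in the reply; earlier writes are NOT undone
            results.append(err)
    return results

store = {"jobs": ["a", "b"]}       # "jobs" holds a list, not an integer
results = run_exec(store, [
    ("SET", "a", 1),
    ("INCRBY", "a", 10),
    ("INCRBY", "jobs", 1),         # command 3 fails at runtime
    ("SET", "b", 2),               # command 4 still runs
])
```

After the batch, a is 11 and b is 2 even though the third command failed, which is exactly the partial state the paragraph above warns about.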

13 Bitmaps and HyperLogLogs: Massive memory savings
What it is for

Two specialized space-saving structures. Bitmaps are strings treated as bit arrays for exact-membership flags ("did user N visit on day D?"). HyperLogLog (HLL) is a probabilistic cardinality counter: approximate "how many distinct items have I seen" in fixed ~12 KB regardless of true cardinality, with ~0.81% standard error and mergeable across keys.

Why it is needed

The naive way to track "which of 100M users visited today" is a Set of user IDs, which costs gigabytes per day. A Bitmap keyed by user ID stores one bit per user — 12.5 MB for 100M users, regardless of how many actually visited — and gives you exact answers via BITCOUNT and BITOP AND/OR/XOR across days. The naive way to count "how many unique users searched this term across a billion events" is a Set of IDs, which is gigabytes. HLL gives you the count in 12 KB with sub-1% error, and you can PFMERGE across time windows or shards trivially. The right tool when you need answers, not members.

# Bitmap: per-user daily-active flags
SETBIT dau:2026-06-07 42 1              # user 42 active today
GETBIT dau:2026-06-07 42
BITCOUNT dau:2026-06-07                # exact active-user count

# 7-day retention: bitwise AND across day bitmaps
BITOP AND dau:7day dau:2026-06-01 dau:2026-06-02 ... dau:2026-06-07
BITCOUNT dau:7day                      # users active every day

# HyperLogLog: approximate distinct count, ~12 KB fixed
PFADD uniq:visitors:2026-06-07 user-42 user-99 user-7
PFCOUNT uniq:visitors:2026-06-07       # approx cardinality

# Merge across days (or shards) without losing accuracy
PFMERGE uniq:visitors:weekly uniq:visitors:2026-06-01 uniq:visitors:2026-06-02 ...
PFCOUNT uniq:visitors:weekly
PE nuance

Bitmaps are exact but require a dense ID space (user IDs as small integers). If your IDs are UUIDs you must maintain a separate "UUID → small int" map. HLL is approximate and gives you a count, never the members. The trap is using HLL then later being asked "which users were unique" — that data is gone by design. Decide up front whether you will ever need enumeration; if maybe, keep an exact structure alongside, which defeats the memory savings.
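
The memory figures quoted in this section follow from simple arithmetic. The Python sketch below reproduces them, assuming the dense HyperLogLog layout of 2^14 registers at 6 bits each and the standard 1.04/sqrt(m) error estimate; the variable names are only for illustration, and per-key header overhead is ignored.

```python
import math

# Bitmap: one bit per user ID, IDs 0..99,999,999
users = 100_000_000
bitmap_bytes = users // 8                  # 12,500,000 bytes
bitmap_mb = bitmap_bytes / 1_000_000       # 12.5 MB

# HyperLogLog, dense encoding (assumed: 16384 registers x 6 bits)
hll_registers = 2 ** 14
hll_bits_per_register = 6
hll_bytes = hll_registers * hll_bits_per_register // 8    # 12288 bytes, about 12 KB
hll_std_error = 1.04 / math.sqrt(hll_registers)           # about 0.0081, i.e. 0.81%
```

The bitmap cost depends on the highest ID that is set, not on how many users were active, while the HyperLogLog cost stays flat no matter how many distinct items are added.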

05. When To Use Which Redis Feature (the headline)

The decision table you actually came for. For each feature: the access pattern it is built for, what to pick instead when it does not fit, and the PE verdict on where the line is. Sorted by how often teams misuse it.

Feature | Reach For It When | Do NOT Use It When (use instead) | Cost / Complexity (Big-O + memory) | PE Verdict
Pub/Sub | Fire-and-forget fan-out to currently-connected subscribers: live notifications, cache-invalidation broadcast, presence | You need durability, replay, or guaranteed delivery → Streams. Offline subscribers must get missed messages → Streams or Kafka | O(N) per publish over N subscribers; near-zero memory (nothing stored) | It is a broadcast bus with at-most-once delivery. If a subscriber is down or slow, the message is gone. Default to Streams unless loss is genuinely acceptable.
Lists as Task Queue | Simple producer/consumer work queue, one worker pool, loss-on-crash tolerable: LPUSH + BRPOP | You need acks, retries, consumer groups, dead-letter, or replay → Streams. You need delayed/scheduled jobs → Sorted Set by timestamp | O(1) push/pop; memory = queue depth × item size | Lists are a great simple queue. The moment you say "but what if the worker crashes mid-job," you have outgrown them. Streams add the reliability layer; do not bolt acks onto Lists by hand.
Streams | Durable, replayable event log with consumer groups, per-message acks, and at-least-once delivery: event sourcing, reliable job queues, activity feeds | You need petabyte retention, long-term log storage, or massive multi-DC throughput → Kafka. You only need ephemeral broadcast → Pub/Sub | O(1) append, O(log N) range; memory grows with retention, must cap via MAXLEN/MINID | The most under-used Redis feature and the right answer to most "I'm using Lists/Pub-Sub for messaging" problems. It is Kafka-lite: same mental model (offsets, groups, acks) at far smaller scale and far lower ops cost.
Atomic Counters | High-throughput counts with no race conditions: rate limiting, view counts, inventory decrement, ID generation (INCR/INCRBY) | You need distinct counts across billions of items cheaply → HyperLogLog (approximate cardinality). You need per-key counts that must survive any loss → durable DB | O(1); 8 bytes per counter value (plus key overhead) | The canonical "Redis is atomic for free" win. One thread means INCR needs no lock. This is why Redis is the default rate-limiter and sequence generator.
Sorted Sets (ZSET) | Anything ranked or range-queried by a score: leaderboards, priority queues, time-series windows, rate limiting (sliding window via score=timestamp), secondary indexes | You just need membership → Set. You need full-text or multi-field query → Redis Query Engine | O(log N) add/rank via skip list; ~64+ bytes per member overhead | The Swiss-army structure. If a problem has "top N," "by rank," "within score range," or "expire oldest," it is a ZSET. Also the substrate under Geospatial and the inspiration for Vector Sets.
Transactions (MULTI/EXEC) | Grouping commands to run with no other command interleaved, with optimistic concurrency via WATCH (check-and-set) | You need rollback on logic error → there is none, use Lua for conditional logic. You need cross-shard atomicity → not supported in cluster | O(sum of queued commands); buffers commands until EXEC | Misnamed. No rollback: if command 3 fails, 1, 2, 4 still execute. It is "isolation + batching," not ACID transactions. For real conditional atomic logic, reach for Lua or Functions.
Lua Scripting / Functions | Multi-step read-compute-write that must be atomic and avoid round trips: conditional updates, complex rate limiters, atomic check-and-act | The script is long-running or loops over huge data → blocks the whole server. You need maintainable logic → keep it tiny | Runs atomically on the main thread; O(script). Blocks all clients for its duration | Powerful and dangerous: a Lua script is the easiest way to stall the entire server. Keep scripts short, deterministic, and O(small). Prefer Functions (named, versioned) over inline EVAL for anything you maintain.
Caching | Reducing load on a slower backing store, sub-ms reads on hot data, with TTL and an eviction policy (allkeys-lru) | The data is the only copy → that is a primary store, set noeviction and accept Redis's durability limits, or use a real DB | O(1) get/set; memory bounded by maxmemory + eviction | The original and still dominant use. The trap is forgetting the failure modes: stampede on expiry (add jitter), eviction storms near the ceiling, and treating a cache miss as an outage when it should be a fallback.
Bloom Filter | Probabilistic "have I seen this?" at massive scale with tiny memory and tolerable false positives: dedup, "already shown," cache-penetration guard | You cannot tolerate false positives → use a Set (costs real memory). You need to remove items → Bloom can't delete (use Cuckoo filter) | O(k) per op; bits ≈ -n·ln(p)/(ln2)², e.g. ~1.2 KB for 1000 items at 1% FPR | The right tool when "definitely not seen" is cheap and valuable and an occasional false "seen" is harmless. A Set storing 100M IDs is gigabytes; a Bloom filter is megabytes. Know it cannot delete and false-positive rate rises as it fills.
Geospatial | Radius/box queries over points: "nearby drivers," "stores within 5km," proximity ranking (GEOADD/GEOSEARCH) | You need polygons, routing, or true GIS → PostGIS. Millions of points with complex filters → dedicated geo DB | O(log N) add, O(N+log M) radius; built on a ZSET with geohash scores | It is a ZSET with geohash-encoded scores, which is why it is fast and why it inherits ZSET memory cost. Great for point-in-radius at moderate scale; not a GIS replacement.
Vector Sets (beta) | Native vector similarity (KNN) inside Redis for recommendations and semantic search, when you already run Redis and want one fewer system (VADD/VSIM) | You need a battle-tested, billion-vector ANN store with rich filtering → dedicated vector DB or Redis Query Engine. API stability matters → it is beta | HNSW index, O(log N)-ish search; memory heavy (vectors + graph), quantization (int8) helps | New in Redis 8 (antirez), inspired by sorted sets. Compelling for "I already have Redis, add similarity search." For serious vector workloads at scale, the Redis Query Engine or a purpose-built vector DB is the safer call until Vector Sets exit beta.
HyperLogLog | Approximate distinct-count of huge sets in fixed ~12 KB: unique visitors, distinct search terms, cardinality across billions (PFADD/PFCOUNT) | You need exact counts → Set or counter. You need the actual members → it stores none | O(1) add/count; fixed ~12 KB regardless of cardinality; ~0.81% standard error | Magical memory profile: count a billion uniques in 12 KB with sub-1% error. The catch people miss: it gives you a number, never the members, and merges (PFMERGE) are how you do unions. Wrong tool the instant someone asks "which users."
Geo + Streams + ZSET combo | (pattern) Real-time location feed with replay: GEO for position, Stream for the event log, ZSET for ranking | N/A: composition pattern, not a single feature | Sum of the parts | The PE move is composing primitives, not forcing one structure to do everything. Most sophisticated Redis designs are 2-3 structures wired together.
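
The Bloom filter sizing formula in the table can be checked with a short Python sketch. It uses the textbook formulas for an optimally sized filter (bits = -n·ln(p)/(ln 2)², hashes = (bits/n)·ln 2); the helper name bloom_size is hypothetical, and the memory an actual Redis Bloom filter allocates may differ from these ideal numbers.

```python
import math

def bloom_size(n_items, fpr):
    """Ideal bit count and hash-function count for n_items at target FPR."""
    bits = math.ceil(-n_items * math.log(fpr) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes

bits_1k, k_1k = bloom_size(1_000, 0.01)
kb_1k = bits_1k / 8 / 1000                  # about 1.2 KB

bits_100m, k_100m = bloom_size(100_000_000, 0.01)
mb_100m = bits_100m / 8 / 1_000_000         # about 120 MB
```

At a 1% target rate the cost works out to roughly 9.6 bits per item with 7 hash functions, independent of how large each item is, which is where the "megabytes instead of gigabytes" gap against a Set comes from.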

Role-level: which "Redis-as-X" claim actually holds up

Redis as… | Holds up when | Breaks when | PE Verdict
Cache | Always. This is the home-turf use case | Rarely; only via misconfig (no eviction policy, no TTL jitter) | Unambiguous yes. Everything else is "Redis can also…"
Primary NoSQL DB | Working set fits in RAM, you accept async-replication loss windows, persistence is DR not a guarantee, and the data is not money | You need durability guarantees, the dataset exceeds RAM economically, or a dropped write is a correctness bug | Conditional. Viable for session stores, ephemeral state, feature flags. Dangerous for ledgers/orders. Say the durability caveat out loud.
Streaming engine | Moderate throughput, short-to-medium retention, you want consumer groups without running Kafka, latency matters more than retention | You need long retention, replay over weeks, multi-DC at high volume, or exactly-once across a big pipeline | Yes for "Kafka-lite." Redis Streams covers a huge middle ground at a fraction of Kafka's operational weight. It is not Kafka at Kafka scale.
Message broker | Low-latency in-process messaging, simple routing, you accept Redis delivery semantics (Pub/Sub at-most-once, Streams at-least-once) | You need complex routing, transactions across queues, or AMQP guarantees → RabbitMQ. Massive durable log → Kafka | Fine for lightweight messaging. For rich broker semantics (exchanges, complex routing, transactional messaging), a real broker wins.

The honest hierarchy: Redis is a cache that grew the ability to also be a fast ephemeral database, a lightweight stream engine, and a simple broker. Each "also" comes with a caveat a PE states up front, not after the incident.

06. Trade-Offs (what you give up to get what)

A trade-off is a deliberate exchange: gain X, surrender Y, and it bites at a specific moment. Grouped by the feature families you listed. These are the ones that matter on call and in design review.

Feature | Trade-Off | What You Gain | What You Give Up | When It Bites | PE Nuance
Pub/Sub | No persistence for delivery speed | Lowest-latency fan-out, zero storage cost | At-most-once: any disconnected/slow subscriber loses messages forever | A subscriber restarts during a deploy and silently misses every message in that window | Redis keeps no durable per-subscriber backlog. A slow consumer's client output buffer fills and the server disconnects it once the configured limit is exceeded. Sharded Pub/Sub (cluster) helps fan-out but not durability.
Streams | Durability + replay for bounded memory growth | At-least-once, consumer groups, acks, replay from any offset | Memory grows until you cap it; consumers can double-process on redelivery | You forget MAXLEN and the stream silently eats RAM until eviction or OOM | At-least-once means consumers must be idempotent. Pending Entries List (PEL) tracks unacked; you must XCLAIM abandoned messages or they sit forever.
Lists (queue) | Simplicity for reliability | Trivial O(1) queue with blocking pop | No ack/retry: a crash between pop and process loses the job | Worker OOMs after BRPOP but before finishing; the job is gone with no trace | BLMOVE into a processing list gives crash-safety by hand, but at that point Streams give it for free with groups and acks.
Atomic Counter | Lock-free correctness for single-node bound | Race-free increment with no client coordination | Throughput on one key is capped by the single thread; cross-shard counters need merging | One ultra-hot counter (viral post) becomes a single-key hot spot that cluster can't spread | Shard a hot counter into N sub-keys (counter:{0..N}), sum on read. Trades read cost for write spread.
Sorted Set | Ordering for memory and op cost | O(log N) ranked queries, range-by-score, rank lookup | High per-member memory (skip list + dict), O(log N) not O(1) | A leaderboard of 50M members balloons RAM; ZRANGE over huge ranges blocks the loop | Always bound the range you fetch. Use ZADD GT/LT for conditional score updates instead of read-then-write.
Transactions (MULTI) | Isolation without locks, but no rollback | Commands run with nothing interleaved; WATCH gives optimistic CAS | No atomic rollback: a command failing mid-batch does not undo prior ones | Command 2 of 4 errors; commands 1, 3, 4 still applied, leaving partial state | This is the single most misunderstood Redis feature. Use Lua for true conditional atomicity; use WATCH for optimistic concurrency, not for rollback semantics.
Lua / Functions | Atomic compute for blocking risk | Multi-step logic runs atomically, no round trips, no race window | Runs on the main thread; a slow/looping script freezes every client | A script iterates a growing collection and p99 for the whole instance spikes to seconds | Under the old verbatim script replication (removed in Redis 7), scripts had to be deterministic (no random/time without seeding) for replication safety; effects replication lifts that constraint, but deterministic scripts remain easier to test. Keep them O(small). Effective lock-and-do primitive.
Caching | Speed for staleness and a new failure surface | Sub-ms reads, massive backing-store offload | Stale data, cache-coherence complexity, a new dependency that can fail | Mass-expiry stampede hammers the database; or eviction near maxmemory craters hit rate | TTL jitter, request coalescing (single-flight), and treating misses as graceful fallback (not errors) separate robust caches from fragile ones.
Bloom Filter | Tiny memory for false positives + no delete | Membership test for 100M items in MB, not GB | Configurable false-positive rate; cannot remove items; cannot enumerate | The filter fills past its sizing and FPR climbs, silently flagging new items as already seen | Size for peak cardinality up front; the target FPR is fixed at creation. Cuckoo filter if you need deletes. Never use when a false "yes" causes a correctness bug.
Geospatial | Fast radius for GIS limitations | O(log N) point-in-radius using geohash on a ZSET | No polygons, no routing, inherits ZSET memory cost | Product asks for "inside this delivery polygon" and the geohash model can't express it | It is a ZSET underneath; you can mix geo and score ops. For anything beyond circles/boxes, you need real GIS.
Vector Sets | Co-located similarity for maturity risk | Native KNN in Redis, one fewer system, quantization for memory | Beta API (may break), memory-heavy index, fewer features than dedicated vector DBs | You build on the beta API and a Redis 8.x point release changes it | HNSW recall/latency tunable via build params. Great for "already on Redis." Redis Query Engine is the more enterprise-grade in-Redis vector path.
HyperLogLog | Fixed tiny memory for approximation + no members | ~12 KB counts a billion uniques at ~0.81% error | Approximate only; stores no members; cannot answer "which" | Someone needs the list of unique users, not the count, and the data simply isn't there | Mergeable across shards/time windows via PFMERGE, which is its superpower for distributed cardinality. Wrong tool whenever exactness or membership is required.
Redis (all) | In-memory speed for durability + RAM cost | Microsecond ops, rich structures | Working set must fit RAM; async replication is lossy on failover | Dataset outgrows economical RAM, or a failover drops acknowledged writes for a payment | The meta trade-off behind every feature. Speed comes from RAM and single-threading; durability and unbounded scale are what you surrender.
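
The hot-counter workaround from the Atomic Counter row (shard into N sub-keys, sum on read) can be sketched as follows. This is a minimal Python model in which a plain dict stands in for Redis; the class name ShardedCounter and the key naming are hypothetical.

```python
import random

class ShardedCounter:
    """Spread increments over N sub-keys; sum the sub-keys on read."""

    def __init__(self, name, shards, rng=None):
        self.keys = [f"{name}:{i}" for i in range(shards)]
        self.store = {}                      # stand-in for Redis: key -> int
        self.rng = rng or random.Random()

    def incr(self, amount=1):
        key = self.rng.choice(self.keys)     # pick a random sub-key per write
        self.store[key] = self.store.get(key, 0) + amount   # models INCRBY key amount

    def total(self):
        # models one GET per sub-key, summed client-side
        return sum(self.store.get(key, 0) for key in self.keys)

views = ShardedCounter("views:post:42", shards=8, rng=random.Random(1))
for _ in range(10_000):
    views.incr()
```

Writes now land on 8 different keys, which a cluster can place in different slots, while a read costs 8 lookups instead of 1. In cluster mode the sub-keys generally hash to different slots, so the read side would pipeline individual GETs rather than issue a single multi-key command.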

07. Use Cases (who runs this in production and why)

Real patterns with the specific property driving the choice. The "Why Not Alternative" column is where PE judgment lives.

Use Case | Feature | Driving Property | Scale Dimension | Why Not the Alternative
API rate limiting | Atomic counter / ZSET sliding window | Race-free increment + TTL in one atomic op, sub-ms decision | Millions of checks/sec across keys | A SQL row lock per request would add ms and contend; Redis decides in microseconds with no lock
Gaming leaderboard | Sorted Set | O(log N) rank + range queries, "top 100" and "my rank" instantly | 10M+ players, real-time updates | SQL ORDER BY ... LIMIT with a rank window is expensive and slow at write rate; ZSET keeps it sorted incrementally
Reliable job queue | Streams + consumer groups | At-least-once delivery, acks, redelivery of crashed-worker jobs | 100K+ jobs/sec, multiple worker pools | Lists lose in-flight jobs on crash; Kafka is heavier ops for this scale and latency target
Live notification fan-out | Pub/Sub (sharded in cluster) | Lowest-latency broadcast to connected clients, no storage | Fan-out to thousands of subscribers | Streams add storage you don't need for ephemeral "user is typing"; Pub/Sub is leaner when loss is fine
Session / cache store | Strings/Hashes + TTL + LRU | Sub-ms reads, automatic expiry, backing-store offload | 100K+ req/sec, sub-ms p99 | Hitting Postgres for every session read would not survive the QPS; this is the canonical cache
Unique visitor counting | HyperLogLog | Billion-cardinality distinct count in fixed 12 KB | Billions of events, thousands of dimensions | A Set of all visitor IDs is gigabytes per dimension; HLL is 12 KB at <1% error
Dedup / "already processed" | Bloom filter | Membership test for huge ID space in MB with tolerable FPR | 100M+ distinct IDs | A Set costs GB; a DB lookup per item adds latency; Bloom answers in microseconds in MB
Nearby search (ride-hail, delivery) | Geospatial | Point-in-radius ranking in O(log N) via geohash | Millions of moving points, frequent updates | PostGIS is overkill for circle queries and slower to update at this churn; geo-ZSET updates are O(log N)
Recommendations / semantic search | Vector Sets / Query Engine | KNN over embeddings co-located with app data | Millions of vectors, low-latency similarity | A separate vector DB adds a system and a network hop; if you already run Redis, co-location wins for moderate scale
Distributed lock | SET NX PX / Lua (Redlock pattern) | Atomic acquire-with-expiry, single-threaded guarantees | Coordination across many app nodes | ZooKeeper/etcd give stronger guarantees but heavier ops; Redis locks are fast and "good enough" with care
Event sourcing / audit log | Streams | Append-only, ordered, replayable with consumer offsets | Medium-retention event history | Kafka for long retention/multi-DC; Streams when retention is bounded and you want it in your existing Redis
Atomic multi-step business logic | Lua / Functions | Read-compute-write with zero race window, no round trips | High-contention single-key logic | Application-side transactions race; MULTI can't do conditional logic; Lua runs it all atomically server-side
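
The "ZSET sliding window" rate limiter from the first row can be modeled in a few lines of Python. This is a sketch under stated assumptions: a sorted list of timestamps stands in for a sorted set with score = timestamp, the class and method names are hypothetical, and against a real server the three steps would have to run inside one Lua script or MULTI block to stay race-free.

```python
import bisect

class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window_s` seconds."""

    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self.hits = {}   # client -> sorted timestamps (stand-in for a ZSET)

    def allow(self, client, now):
        stamps = self.hits.setdefault(client, [])
        cutoff = now - self.window_s
        # Drop entries at or before the cutoff (models ZREMRANGEBYSCORE)
        del stamps[:bisect.bisect_right(stamps, cutoff)]
        if len(stamps) >= self.limit:        # models ZCARD
            return False
        bisect.insort(stamps, now)           # models ZADD with score = now
        return True

limiter = SlidingWindowLimiter(limit=3, window_s=10)
decisions = [limiter.allow("ip:1", t) for t in (0, 1, 2, 3, 11.5)]
```

The first three requests pass, the fourth (at t=3) is rejected because three hits are still inside the window, and the request at t=11.5 passes again once the hits at t=0 and t=1 have aged out.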

08. Limitations (severity, workaround, and what the workaround costs)

Honest constraints, ranked by how badly they hurt and what you pay to dodge them.

Severity levels: Critical, High, Medium
Limitation | Affects | Severity | Workaround | Workaround Cost
Single slow command blocks all clients | Lua, big-collection ops, KEYS | Critical | SCAN instead of KEYS, bound ranges, UNLINK for big deletes, keep Lua O(small) | Discipline + code review; SCAN is cursor-based and more complex than KEYS
Async replication loses writes on failover | Replication, durability | Critical | WAIT, min-replicas-to-write, accept Redis isn't a system of record for money | WAIT adds replica RTT to writes; never fully closes the loss window
Working set must fit in RAM | All features | High | Cluster to add nodes, Redis-on-Flash (enterprise), aggressive TTLs/eviction | Cluster ops complexity; flash trades latency; eviction means data loss by design
Pub/Sub at-most-once, no buffering | Pub/Sub | High | Switch to Streams for any delivery guarantee | Streams cost memory and require ack/PEL management
MULTI/EXEC has no rollback | Transactions | High | Use Lua for conditional atomic logic; validate before queueing | Lua is harder to maintain and can block; validation adds round trips
Fork (RDB/AOF rewrite) COW spike + main-thread stall | Persistence | High | Headroom for COW, replica-side saves, smaller instances, schedule off-peak | Wastes RAM headroom; replica-side saves complicate topology
Cross-shard multi-key ops unsupported in cluster | Cluster, MULTI, Lua, MGET | High | Hash tags to co-locate related keys in one slot | Forces key-space design up front; hot-slot risk if over-co-located
Eviction (LRU/LFU) is approximate | Caching, memory | Medium | Raise maxmemory-samples for better accuracy | More samples = more CPU on the hot path per eviction
Bloom filter can't delete; FPR rises as it fills | Bloom filter | Medium | Size for peak up front; Cuckoo filter for deletes; scalable Bloom variants | Over-sizing wastes memory; Cuckoo has its own quirks
Vector Sets are beta (API may change) | Vector Sets | Medium | Pin version, use Redis Query Engine for stable vector search | Query Engine is heavier; pinning delays upgrades
Hot key/slot cannot be spread by cluster | Cluster, counters, ZSET | Medium | Sub-key sharding, client-side caching, read replicas | Read-time aggregation cost; cache coherence complexity
HyperLogLog/Bloom give no members, only answers | Probabilistic types | Medium | Keep a separate exact structure if you ever need enumeration | Defeats the memory saving that motivated the probabilistic type

09. Fault Tolerance

How Redis behaves when things break, by deployment topology. Single instance vs replicated (Sentinel) vs Cluster.

Dimension | Single Instance | Replicated (Sentinel) | Cluster
Replication model | None | Async leader-follower | Async per-shard leader-follower
Failure detection | External monitoring only | Sentinel quorum agreement | Gossip PFAIL→FAIL by node majority
Failover mechanism | Manual restart | Sentinel promotes replica, reconfigures | Replica wins master-quorum vote, takes slots
RTO (typical) | Minutes (manual) | Seconds to tens of seconds | Bounded by node-timeout + vote, seconds
RPO (typical) | Last persistence point (minutes) | > 0, unshipped writes lost | > 0, localized to failed shard
Split-brain behavior | N/A (one node) | Isolated old master serves stale writes briefly; discarded on rejoin | Minority side stops serving its slots after losing quorum
Blast radius, single node | Total outage | Write outage during failover; reads survive on replicas | Only that shard's slots; rest of cluster serves
Cross-region failover | None | Async cross-region replica, lossy manual promotion | Enterprise Active-Active (CRDT) or async; vanilla is single-region
Data loss scenarios | Crash before persist, disk loss | Failover window, fork-OOM, split-brain writes | Same + slot-migration edges + hot-slot overload

10. Sharding

How Redis partitions data across nodes, and the constraints that surprise teams migrating from a single instance.

Dimension | Single / Replicated | Redis Cluster
Sharding model | None, single keyspace | Hash slots: CRC16(key) mod 16384, fixed slot count
Shard key constraints | N/A | Multi-key ops require same slot; hash tags {tag} co-locate
Rebalancing mechanism | N/A | Reassign slots between nodes; only moved slots' keys migrate
Rebalancing cost/impact | N/A | Real data movement + ASK redirection latency during migration
Hot-shard behavior | Single thread caps one key | Cluster spreads keys, but a single hot key/slot still bottlenecks one node
Maximum shards (practical) | 1 master | Hundreds of masters; gossip overhead grows with node count
Resharding without downtime? | N/A | Yes, online; clients follow MOVED/ASK during migration
Cross-shard query support | Full (one node) | Single-key always; multi-key only within a slot, else CROSSSLOT error

Why 16384 slots: the slot bitmap is gossiped in every heartbeat (2 KB), cheap enough to exchange constantly across a large cluster while still fine-grained for balancing. Decoupling key→slot→node is what lets you add a node by moving slots instead of rehashing the world.
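
The key→slot mapping, including the hash-tag rule, fits in a few lines of Python. This is a sketch rather than the server's code: it assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), which is what the standard library's binascii.crc_hqx computes when seeded with 0, and the function name hash_slot is hypothetical.

```python
import binascii

SLOTS = 16384

def hash_slot(key: str) -> int:
    """Map a key to a cluster hash slot, honoring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:                  # a closing brace exists and the tag is non-empty
            data = data[start + 1:end]       # hash only the tag contents
    return binascii.crc_hqx(data, 0) % SLOTS

slot_bitmap_bytes = SLOTS // 8               # 2048 bytes, the 2 KB gossiped per heartbeat

same_slot = hash_slot("{user1000}.following") == hash_slot("{user1000}.followers")
```

Because only the text inside the first non-empty {...} is hashed, both user1000 keys land in the same slot and can be used together in multi-key commands, at the price of concentrating that user's data on one node.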

11. Replication

The data-copy story, including the consistency knobs people forget exist.

Dimension | Redis Replication
Topology | Leader-follower; replicas can have sub-replicas (replication tree) to cap master egress
Sync vs async | Asynchronous by default; writes ack before replicas confirm
Replication factor | Default 0 (standalone); typical 1-2 replicas per master; no hard max but egress grows
Consistency level options | WAIT numreplicas timeout blocks until that many replicas ack or the timeout (in ms) expires; min-replicas-to-write refuses writes below a threshold; never linearizable
Replication lag (typical) | Sub-ms to low-ms intra-region under normal load; spikes on big-key writes and full resync
Conflict resolution | None needed for single-leader; Enterprise Active-Active uses CRDTs for multi-master merge
Cross-region replication | Async replica or Active-Active (Enterprise, CRDT-based); cross-region lag is real and lossy on failover
Replication during partition | Master keeps serving (AP-leaning); isolated writes on a minority master are discarded after failover unless min-replicas blocks them
Resync efficiency | PSYNC partial resync from the replication backlog buffer avoids full RDB transfer on brief disconnects

12. Better Usage Patterns (where PE depth shows)

The patterns most teams get wrong, the better way, and why it compounds at scale. This is the table to read twice.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Listing keys | Run KEYS pattern* in production | Use SCAN with a cursor, or maintain an index Set | KEYS is O(N) over the whole keyspace and blocks every client; one KEYS on a big DB is an instant p99 incident
Messaging | Reach for Pub/Sub by default | Default to Streams; use Pub/Sub only when loss is truly fine | Pub/Sub silently drops on disconnect; teams discover the gap only when a consumer was down during an important event
Queues | Lists + manual ack hacks | Streams with consumer groups, PEL, and XCLAIM for abandoned work | Hand-rolled ack/retry on Lists reimplements Streams badly and loses jobs on crash
Cache expiry | Identical TTL on a batch of keys | Add random jitter to TTLs (e.g. base ± 10%) | Synchronized expiry causes a thundering herd on the backing store at the TTL boundary
Cache miss handling | Every miss hits the DB independently | Single-flight / request coalescing so one miss populates for all waiters | A hot key expiring under load sends thousands of concurrent DB queries (stampede)
Big deletes | DEL bigkey on a 10M-element collection | UNLINK for lazy background free | DEL frees memory synchronously on the main thread, blocking everyone for the duration
Conditional updates | GET, decide in app, SET (race window) | Lua/Functions or WATCH+MULTI for atomic check-and-set | The read-modify-write gap lets concurrent clients clobber each other; classic lost-update bug
Memory modeling | One key per data point (1M keys) | Group into hashes; small ones use compact listpack encoding | Per-key overhead (~80 bytes) dominates for tiny values; hashes can cut memory by an order of magnitude
Hot counters | Single INCR key for a viral object | Shard into N sub-counters, sum on read | One hot key can't be spread by cluster; sub-key sharding distributes the write load across slots
Persistence as durability | Assume RDB/AOF means "we never lose writes" | Treat persistence as DR/fast-restart; use replication + WAIT for availability, and a real DB for true durability | Conflating the two leads to using Redis as a system of record for money, then losing acknowledged writes on failover
Lua scripts | Long inline EVAL with loops over data | Tiny deterministic Functions, named and versioned; offload heavy work | A slow script freezes the whole instance; non-determinism broke verbatim script replication in older versions (Redis 7+ replicates script effects) and still makes behavior hard to test and reason about
Probabilistic types | Use HLL/Bloom then later need the actual members | Decide up front whether you'll ever need enumeration; if maybe, keep an exact structure too | HLL/Bloom discard members by design; retrofitting exactness means you lost the data you summarized
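
The cache expiry and cache miss rows can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a library API: `jittered_ttl` and `SingleFlight` are hypothetical names, and the coalescing here is in-process only (across processes you would need a shared lock or lease, which this sketch does not cover).

```python
import random
import threading
from typing import Callable, Dict, Optional


def jittered_ttl(base_seconds: int, spread: float = 0.10,
                 rng: Optional[random.Random] = None) -> int:
    """Scale base_seconds by a random factor in [1 - spread, 1 + spread]."""
    source = rng if rng is not None else random
    factor = 1.0 + source.uniform(-spread, spread)
    return max(1, round(base_seconds * factor))


class _Call:
    """State shared between the one loader (leader) and its waiters."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: object = None
        self.error: Optional[BaseException] = None


class SingleFlight:
    """Coalesce concurrent loads of the same key into one loader call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def do(self, key: str, loader: Callable[[], object]) -> object:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.result = loader()  # the only trip to the backing store
            except BaseException as exc:
                call.error = exc  # waiters see the same failure
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()
        else:
            call.done.wait()  # piggyback on the leader's result
        if call.error is not None:
            raise call.error
        return call.result
```

In use, the loader would query the database and write the value back with something like `SET key value EX jittered_ttl(300)`, so keys populated in the same batch do not all expire in the same second.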

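The hot counters row can be sketched as follows. The helper names are hypothetical, and `client` is assumed to be a redis-py-style object exposing `incrby` and `get`. The sub-keys deliberately carry no hash tag, so in a cluster they hash to different slots.

```python
import random
from typing import Any, List


def shard_keys(base_key: str, shards: int) -> List[str]:
    """Names of the sub-counters, e.g. 'likes:42:0' .. 'likes:42:15'."""
    return [f"{base_key}:{i}" for i in range(shards)]


def incr_sharded(client: Any, base_key: str, shards: int = 16,
                 amount: int = 1) -> None:
    """Increment one randomly chosen sub-counter."""
    key = f"{base_key}:{random.randrange(shards)}"
    client.incrby(key, amount)


def read_sharded(client: Any, base_key: str, shards: int = 16) -> int:
    """Sum all sub-counters. Not an atomic snapshot: writes may land mid-read."""
    total = 0
    # One GET per shard keeps this valid when the keys live in different
    # cluster slots; a pipeline could batch the round trips.
    for key in shard_keys(base_key, shards):
        raw = client.get(key)
        total += int(raw) if raw is not None else 0
    return total
```

The trade-off is that reads cost N lookups and are only approximately consistent, which is usually acceptable for like or view counts and not for balances.
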
13. Advanced / Next-Gen Alternatives

Where to look when Redis hits a wall, and how mature each option is. Includes the 2024-2026 fork/competitor landscape you should know for any interview.

Maturity legend: Production-ready · Emerging · Early
Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Valkey (Linux Foundation fork) | BSD license, no copyleft/commercial strings; AWS/Google/Oracle backed; wire-compatible with Redis 7.2 | Production | Near-zero, drop-in for Redis 7.2-era features (no Vector Sets) | You want true open-source licensing and the hyperscaler-backed path; many managed services moved here
Dragonfly | Multi-threaded, shared-nothing core; far higher single-node throughput and memory efficiency; fork-free snapshots | Production | Drop-in (Redis/Memcached API compatible), but verify command coverage and cluster semantics | One node's single-core ceiling is your bottleneck and you'd otherwise run many shards on one big box just to use the cores
KeyDB | Multi-threaded fork of Redis with fine-grained locking; active-replica multi-master | Production | Drop-in; lock model differs, watch atomicity edge cases | You want multi-core on a Redis-compatible engine without changing your data model
Kafka / Redpanda | Durable, long-retention, high-throughput log; partitioned consumer groups at massive scale | Production | High: different model (offsets, partitions, brokers), new ops | You outgrow Streams: weeks of retention, multi-DC, exactly-once pipelines, very high throughput
RabbitMQ | Rich broker semantics: exchanges, complex routing, transactional messaging, AMQP guarantees | Production | High: full broker, different protocol and ops | You need real broker routing/guarantees that Pub/Sub and Streams don't provide
Dedicated vector DB (Milvus, Qdrant, pgvector) | Billion-scale ANN, rich metadata filtering, mature tooling vs beta Vector Sets | Production | Medium: new system + embedding pipeline integration | Vector workload is core and large; Vector Sets' beta status or feature set is limiting
Redis Query Engine (in-core, Redis 8) | Full-text + vector + numeric + geo querying with secondary indexes, all inside Redis | Production | Low: same Redis, add index definitions | You need richer querying than raw structures and want to stay in Redis; more mature than beta Vector Sets
Redis-on-Flash (Enterprise) | Tiered RAM+SSD so datasets exceed RAM economically | Production | Medium: enterprise offering, latency trade-off | Working set is too big for pure RAM but you want Redis semantics
Active-Active CRDT (Enterprise) | True multi-region multi-master with conflict-free merge | Emerging | Medium-high: CRDT semantics change app assumptions | Multi-region writes with automatic conflict resolution are a hard requirement
Momento / serverless cache | Zero-ops managed caching, pay-per-use, no cluster management | Emerging | Low-medium: API differs from raw Redis | You want caching without operating Redis at all

Interview-relevant context: in March 2024 Redis left BSD for RSALv2/SSPLv1, triggering the Valkey fork (Linux Foundation, BSD, backed by AWS/Google/Oracle). Redis 8 (8.0 GA May 2025) added AGPLv3 as a third option, merged the former Stack modules (Search, JSON, TimeSeries, probabilistic) into the open-source core, and introduced Vector Sets as a new beta data type. Knowing this saga signals you track the ecosystem, not just the API.