DynamoDB vs Cassandra vs PostgreSQL

Three databases that solve different problems and get mistakenly compared because they all store rows. Deep research edition.

Managed NoSQL KV Self-Managed Wide-Column Relational OLTP

As of 2026-05-16. Framing: "When would you pick each." Grounded in 2024-2025 post-mortems and recent benchmarks.

PE Verdict

Pick Postgres unless you have a specific reason not to. Pick DynamoDB when you want zero ops and your access patterns are known and frozen, accepting AWS regional dependency risk. Pick Cassandra only when you genuinely need active-active multi-region writes and have the ops headcount to operate it (most teams that pick Cassandra in 2026 should have picked ScyllaDB or DynamoDB instead, as Discord's 177-to-72 node migration demonstrated).

Headline Benchmark Numbers (2024-2025 Independent Tests)

DynamoDB: 4K-40K ops/sec range (highly variable by access pattern), single-digit ms p99 in steady state, ~$1.25 per million requests at typical sizing
Cassandra: 80K-106K ops/sec on mixed workloads (3-node baseline), p99 reads 40-125ms historical (Discord), p99 writes 5-70ms historical
PostgreSQL: ~16K ops/sec write-heavy OLTP single-node, ~1.8x MySQL on writes; modern hardware pushes 100K+ TPS on well-tuned r6i.32xlarge

Best default choices

PostgreSQLDefault OLTP choice, SQL flexibility, ACID, extensions DynamoDBAWS-native key-value scale with frozen access patterns CassandraActive-active multi-region writes with serious ops ownership

1. Trade-Offs

DynamoDB

Use when access patterns are known, low-latency key-value lookups matter, AWS-native operations are preferred, and zero database ops is worth the modeling constraints.

#	Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
01	Partition key drives all access patterns	Predictable sub-10ms p99 reads at any scale, no query planner surprises	Cannot do ad-hoc queries, every new access pattern requires a new GSI or table	PM asks for a new query shape post-launch and the answer is "we need a GSI or rebuild the table"	Single-table design with overloaded composite keys is the escape, but most teams discover it after building 5 separate tables and now need a migration.
02	3000 RCU / 1000 WCU per-partition ceiling	Forces good key design, prevents one workload from starving the table	Hot key kills you even if table-level capacity is unused	Viral content concentrates all traffic on one partition key, you throttle at 5% of provisioned capacityAWS docs: adaptive capacity and split-for-heat take 5-10min to kick in; dropped writes during the lag window	The 1000 WCU per-partition limit applies even in on-demand mode and is not raisable. Pre-shard via write-suffix pattern before viral moments, not during.
03	Eventual consistency by default, strong reads cost 2x	Cheap reads at scale, ~5ms p99 with eventual	Read-after-write surprises if you forget the ConsistentRead flag, plus 2x cost when you need it	User updates profile, refreshes page, sees stale data, files a bug. You add ConsistentRead everywhere and the bill doubles.	GSIs are always eventually consistent, no flag will fix that. If you need read-after-write on a secondary access pattern, you need to query the base table.
04	On-demand pricing vs provisioned	Zero capacity planning, scales to any spike	Roughly 7x the cost of saturated provisioned capacity	Workload becomes steady-state and you're paying 7x what you could be. Or stays spiky and provisioned would have throttled.	The right answer is usually provisioned + auto-scaling once you know your traffic shape. On-demand is the right starting default and the wrong long-term choice.
05	400KB item size hard limit	Forces externalization of blobs, keeps per-partition storage manageable	Cannot store medium-sized documents, must split or offload to S3	Product launches a feature that stores a 1MB JSON blob and you're now juggling two-system consistency between DDB and S3	Most teams paper over this with "store S3 key in DDB" but never solve the consistency problem (S3 write succeeds, DDB write fails, orphan blob). Build the cleanup job before you ship.
06	No native joins, no cross-table transactions across regions	Predictable latency, no query planner to explain when things go slow	Joins become application-level multi-get, transactions across Global Tables require app coordination	Compliance asks for a report across users + orders + refunds, and now you're scanning three tables and merging in Lambda	TransactWriteItems works within a single region and table, up to 100 items. Cross-region needs sagas. This is the silent gotcha that derails most "let's go multi-region" projects.
07	Vendor lock-in plus implicit regional dependency on us-east-1	Tight integration with Lambda, IAM, CloudWatch, no ops burden	Cannot move without rewriting data access layer; AWS control plane dependencies cascade through DDB	Oct 19-20, 2025: DNS race condition in DDB us-east-1 took down DDB regionally, then cascaded to EC2 (DWFM lease failures), Lambda, NLB. 15 hours of degraded service.Source: AWS post-mortem, Oct 2025. Race between DNS Planner and DNS Enactor wrote empty DNS record for dynamodb.us-east-1.amazonaws.com	"Fully managed" doesn't mean "no failure modes you need to plan for." Most teams treat DDB as a black box and have zero runbook for regional DDB unavailability. Build the multi-region failover plan even if you don't need it today.
08	Streams provide CDC, but with 24h retention	Native change capture into Lambda, Kinesis, EventBridge	If your consumer falls behind 24 hours, data is gone	Lambda consumer fails silently overnight, you discover Monday morning and the changes are unrecoverable	Use Kinesis Data Streams sink (longer retention) for anything important. The 24h limit is fine for triggers but never for critical pipelines.

Cassandra

Choose only when you need active-active multi-region writes, massive write scale, tunable consistency, and you have the operational depth to own compaction, repair, and clock-skew risk.

#	Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
01	Tunable consistency per operation (ONE / QUORUM / ALL / LOCAL_*)	Latency and consistency become per-query knobs, not architectural commitments	Every developer must understand R+W>RF formula, every team gets it wrong at least once	Reads default to ONE, writes default to ONE, team thinks they have consistency, race conditions appear in production	QUORUM/QUORUM is the right default for most workloads. LOCAL_QUORUM for multi-DC. Anyone who codes against CL=ONE for writes is shipping a bug.
02	Peer-to-peer with no leader, every node accepts writes	No failover step, no leader election lag, true active-active multi-DC	Last-writer-wins via timestamps, silent data loss on clock skew	NTP drifts on one node, writes from that node lose every conflict, days pass before someone notices	This is the single biggest operational risk in Cassandra and it's not in any vendor pitch. Monitor clock skew like your job depends on it, because it does.
03	LSM tree storage, write-optimized via commit log + memtable	Sub-ms writes, massive write throughput per node	Read amplification (multiple SSTables), compaction is a permanent ops concern	Compaction falls behind during ingest spikes, reads slow down, eventually disks fill up	Compaction strategy choice (STCS vs LCS vs TWCS) is one of the highest-leverage decisions and most teams pick wrong. Time-series should be TWCS, no exceptions.
04	JVM-based runtime with garbage collection	Mature JVM ecosystem, broad tooling, well-understood operational model	GC pauses create p99 tail latency variance; "super long consecutive GC pauses" require manual node reboot	Discord 2022: p99 read latency 40-125ms, p99 writes 5-70ms, frequent on-call paging for GC-induced cluster instabilityDiscord migrated 177-node Cassandra cluster to 72-node ScyllaDB cluster (C++, no GC); p99 reads improved to 15ms, p99 writes to 5ms	If you're picking Cassandra in 2026 over ScyllaDB, you should have a specific reason (ecosystem maturity, existing skills, multi-cloud) because the GC-pause tail latency tax is real and operationally expensive.
05	Tombstones for deletes (gravestone records)	Distributed deletes work eventually-consistently without coordination	Reads must scan tombstones until compaction reaps them (gc_grace_seconds, default 10 days)	Queue-like workload (insert, read, delete) tombstones reads to 10x slower as old tombstones pile up	Cassandra is famously the wrong choice for queue patterns. If your workload has a high delete:insert ratio, you're going to have a bad time, regardless of tuning. Discord's migration was held up by tombstone-heavy token ranges in the last 0.0001%.
06	CQL looks like SQL but isn't	Familiar surface for relational developers, fast onboarding	Familiarity is a trap, developers write Cassandra anti-patterns thinking they're writing SQL	JOIN-less, GROUP BY-less, the first cross-table query someone tries fails and they redesign the schema	L7-level red flag in interviews: a candidate who treats CQL as SQL with limitations. Real signal: candidate who treats it as a key-value API with a SQL-shaped wrapper.
07	Operational complexity is high (repair, compaction, GC tuning)	Total control over performance, no managed-service ceiling	Needs 2-3 dedicated SREs minimum at production scale	You hit a JVM GC pause issue at 50TB and the team has nobody who knows the JVM well enoughDiscord described their Cassandra ops as "high-toil" with "unpredictable latency and frequent on-call incidents" before migrating	DataStax Astra or ScyllaDB Cloud removes most of this. If you're picking Cassandra in 2026 self-hosted, you're picking a 3-engineer operational commitment.
08	Lightweight transactions (LWT) via Paxos	Compare-and-set semantics on top of an eventually-consistent store	4 round trips, ~10x slower than regular writes, scoped to single partition	Developer sprinkles LWT for "safety", throughput drops 80%, cluster CPU pegs at 100%	LWT is a tactical escape valve, not a strategy. Cassandra 5.x ships Accord (paxos-derived strict-serializable transactions); watch this space if you've been blocked by transactions.

PostgreSQL

Default to Postgres for most product and OLTP systems where relational modeling, ACID transactions, SQL flexibility, extensions, and operational familiarity matter.

#	Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
01	Single-leader replication, vertical scaling primary	Strong consistency, ACID transactions, no clock-skew failure modes	Write throughput capped by single primary, eventual cliff requires sharding	Write QPS approaches 30K and you're researching Citus on a deadline	Modern hardware pushes the single-node ceiling much higher than most teams realize (100K+ TPS on a properly tuned r6i.32xlarge with NVMe). Don't shard until you've actually tuned.
02	MVCC via row versions, no read locks	Readers never block writers, writers never block readers	VACUUM is permanent ops concern; XID wraparound is a documented production-killing failure mode	Mailchimp/Mandrill 2016: autovacuum fell behind on a busy shard, XID wraparound protection halted writes, ~40 hours of outage. Sentry 2024: same root cause, emergency manual vacuuming required.Postgres uses a 32-bit XID counter; database refuses writes at ~2.1B transactions to prevent corruption. Alert at 1.5B XID age.	XID wraparound is the most under-appreciated failure mode in Postgres. It develops over weeks with no visible symptoms then takes the database offline completely. Set up the alerts before you need them.
03	Streaming replication (physical) + logical replication	Read replicas for scale, logical for selective replication and major version upgrades	Physical replicas read-only and version-locked, logical has known caveats (DDL, sequences, large transactions)	Logical replication breaks on a schema change and you're debugging replication slot growth at 3am	For major version upgrades, logical replication is the only zero-downtime path and it's still terrifying. Practice the cutover on a copy first.
04	Connection-per-process model	Strong isolation, easy debugging per session	Each connection costs ~10MB RAM, max_connections becomes a hard ceiling at ~500-1000	Microservices proliferate, each maintains its own pool, you hit the connection wall	PgBouncer in transaction mode is mandatory for any non-trivial deployment. Without it you're capping your architecture at a few hundred concurrent backend connections.
05	Native partitioning, native sharding via Citus extension	Tiered scaling story: partition first, shard later when needed	Sharding is a retrofit, query planner has no native distributed-query concept	Cross-shard JOIN performance degrades to scatter-gather, application has to be sharding-aware	Citus works when data partitions naturally on a tenant_id-like key and 95% of queries are tenant-scoped. It breaks on global aggregations. Plan the data model for sharding from day one if you suspect you'll need it.
06	Extension ecosystem (PostGIS, pgvector, TimescaleDB, pg_partman)	One database for OLTP, GIS, vector search, time-series, queues	Each extension is a maintenance commitment, version-coupling across extensions	Postgres major upgrade requires waiting for 4 extensions to support the new version	Extensions are Postgres's superpower and its hidden cost. The "use one tool" gain is real until you have 6 extensions, then upgrade cadence becomes the bottleneck.
07	Synchronous replication available but optional	Trade durability for latency on a per-transaction basis	Default async means a failover can lose committed transactions	Primary AZ fails, replica promoted, 30 seconds of writes are gone, audit team is unhappy	`synchronous_commit=remote_apply` is the strongest setting and costs 2-5ms latency per write. For financial workloads it's non-negotiable. Most teams don't even know it exists.
08	Index proliferation is cheap, except when it isn't	Add indexes liberally to speed up queries	Each index slows down writes, increases bloat, fights for buffer cache	Write performance silently degrades as the indexes-per-table count creeps to 12+	Use `pg_stat_user_indexes` to find unused indexes quarterly. Most production systems carry 30%+ dead-weight indexes that nobody dares to drop because "they might be used."

2. Use Cases

DynamoDB

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Shopping cart and session store	Amazon retail (origin use case)	Sub-10ms p99 with zero ops overhead across regions	Trillions of items, peak ~89M req/sec during Prime Day	Cassandra would need a dedicated SRE team; Postgres would not scale writes past single primary
Ad-tech bid storage and counters	Mid-size DSP, 200K QPS sustained	Predictable p99 under 15ms, no GC pauses	50TB hot data, 3M writes/sec spike during traffic peaks	Cassandra GC tail latency at 99.9% would breach the 100ms bid-window SLO; Postgres caps below 1M writes/sec
Mobile app backend (user profiles, prefs, sync state)	Snapchat-style, Tinder-style apps	Per-user lookup latency, global tables for region affinity	100M+ MAU, ~5K QPS per region	Postgres single-leader latency cross-region would force read replicas plus app-layer routing
IoT device state and telemetry ingest	Fleet management, ~1M devices reporting every 30s	High write throughput with auto-scaling, TTL for retention	~33K writes/sec sustained, time-series data with 90-day TTL	Postgres TimescaleDB cheaper at low scale but doesn't auto-scale through traffic spikes
Serverless event-sourced workload	Lambda + DDB Streams pattern	Native Lambda trigger, IAM integration, zero connection-pool concerns	~10K events/sec with downstream fanout	Postgres would require RDS Proxy and connection limits would still cap concurrency
Gaming leaderboards and player state	Mobile games, competitive titles	Single-digit ms read on per-player profile lookup	50M players, 200K concurrent at peak	Redis loses persistence guarantees; Cassandra adds operational tax for no latency gain

Cassandra

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Time-series metrics and observability storage	Netflix Atlas (1 trillion requests/day on Cassandra)	Massive write throughput with linear horizontal scaling	Billions of writes/day, multi-PB datasets	Postgres TimescaleDB caps below 1M inserts/sec; DDB cost prohibitive at this scale (~$1.25/M requests vs ~$0.0001/M on Cassandra hardware)
Multi-region active-active write workload	Global IoT platforms, sync-heavy mobile apps	Local-DC writes with no failover, async cross-DC convergence	10+ regions, sub-50ms write latency in each	DDB Global Tables exist but lose to Cassandra on per-key conflict resolution flexibility; Postgres has no native active-active story
User activity / event log storage (historical context)	Discord (177 nodes at peak before ScyllaDB migration)	Append-heavy workload, channel-partitioned by Snowflake ID	Trillions of messages, ~177 Cassandra nodes pre-migration	Postgres write ceiling; DDB cost; though Discord ultimately concluded ScyllaDB was the better trade-off
Recommendation system feature store	Spotify-style music streaming, news ranking	Wide-row reads (all features for a user) with low-latency lookup	500M users, 1000+ features per user, 10K QPS	Postgres row width and per-query latency wouldn't sustain it; DDB item-size limit (400KB) kills the wide-row pattern
Fraud detection feature aggregation	Payment processors, large e-commerce	Real-time aggregation across time windows with eventual consistency tolerance	Tens of TB, sub-100ms feature retrieval	Redis loses persistence at scale; DDB throughput limits push cost too high

PostgreSQL

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Transactional system of record	Most B2B SaaS (Stripe hybrid, Notion, Linear)	ACID transactions, mature ecosystem, schema flexibility	10s of TB, sub-10K TPS sustained (well within single-node ceiling)	DDB lacks cross-table transactions across regions; Cassandra has no real ACID story
Multi-tenant SaaS with tenant isolation	Notion, Linear, Heroku, Citus customers	Per-tenant query patterns, schema-per-tenant or RLS isolation	10K-100K tenants, varied query shapes per tenant	NoSQL forces denormalization that breaks tenant-isolated reporting; Citus extends Postgres for sharding when needed
Geospatial workloads	Uber (early), delivery platforms, ride-sharing	PostGIS extension provides production-grade spatial indexing	Millions of geo-queries per minute	DDB lacks spatial primitives; Cassandra has no geo-index that's production-credible
Vector search for AI applications	RAG pipelines, embedding storage, Pinecone alternatives	pgvector provides HNSW indexes, mature SQL surrounds it	10M-100M embeddings with metadata filtering	Dedicated vector DBs are faster but force a second store and a sync problem
Analytics-adjacent OLTP	Internal admin tools, reporting backends	Window functions, CTEs, JSONB, full SQL for ad-hoc analysis	1-10TB, complex queries	Cassandra and DDB cannot do ad-hoc aggregation; warehouse is overkill for this scale
Job queue / task scheduling	Hatchet, Trigger.dev, Sidekiq alternatives	`SELECT ... FOR UPDATE SKIP LOCKED` makes Postgres a credible queue, single source of truth	10K-100K jobs/sec with retries and dead-letter	Dedicated brokers (Redis, SQS) lose the durability and queryability of Postgres-backed queues

3. Limitations

Limitation	DynamoDB	Cassandra	PostgreSQL
Max item / row size	400KB hard limit High	2GB partition recommended, ~100MB row practical Med	1.6TB row, 1GB per field practical Med
Per-key throughput ceiling	3000 RCU / 1000 WCU per partition (not raisable, even on-demand) Critical	Per-node ceiling depends on hardware, no logical per-key cap Med	Single writer per row, no logical cap (single-node ceiling caps writes) Med
Query flexibility	Only by partition key + sort key range Critical	Only by partition key + clustering key range, no joins High	Full SQL, only limited by query planner Med
Cross-key transactions	TransactWriteItems up to 100 items, single region only High	No multi-partition transactions; LWT single-partition; Accord coming in 5.x Critical	Full ACID across the entire database Med
Schema evolution	Schema-on-read, but GSI changes are slow Med	ALTER TABLE works but can be expensive at scale High	Online DDL works, some operations still lock Med
Backup and restore	PITR up to 35 days, full restore is slow Med	Snapshot-based, ops-heavy, no PITR native High	WAL-based PITR, mature tooling Med
Multi-region write story	Global Tables, LWW conflict resolution Med	Native active-active multi-DC Med	No native active-active, requires BDR or external tools Critical
Cost predictability at scale	Spike-driven cost cliffs on on-demand (~$1.25/M requests) High	Hardware + ops headcount fixed cost (~$0.0001/M on equivalent hw) Med	Compute + storage fixed cost Med
Connection scaling	HTTP-based, no connection pool needed Med	Native driver pool, ~10K connections per node Med	Process-per-connection caps at ~500-1000 without PgBouncer High
Documented catastrophic failure mode	Regional DNS race condition (Oct 2025 us-east-1, ~15h outage) Critical	Clock skew silent data loss; GC pauses requiring manual node reboot High	XID wraparound (Sentry 2024, Mailchimp 2016 ~40h outage) Critical

4. Fault Tolerance

Post-Mortem Grounding Cells reflect documented failure modes from AWS Oct 2025 post-mortem (DDB DNS race), Discord 2022 migration (Cassandra GC + tail latency), and Sentry 2024 / Mailchimp 2016 (Postgres XID wraparound). These are the failure paths PE-level interviews actually probe.

Dimension	DynamoDB	Cassandra	PostgreSQL
Replication model	3x sync across 3 AZs, leader-based, automatic	Tunable RF (typically 3), peer-to-peer, no leader	Single primary, async streaming replicas (sync optional)
Failure detection	AWS control plane, sub-30s (when control plane is healthy)	Gossip protocol, ~10-30s detection	External (Patroni, repmgr, RDS), 30-60s typical
Failover mechanism	Automatic, transparent to client	No failover needed (no leader), client retries to next replica	External orchestrator promotes a replica, DNS/proxy update
RTO (typical)	Sub-second to client, ~60s control-plane recovery (but see Oct 2025: ~3h DDB recovery, ~15h full ecosystem)	Sub-second (just retry to another coordinator)	30-120s for managed (RDS), 5-30s for tuned Patroni setups
RPO (typical)	0 for AZ failure (sync replication)	0 with CL=ALL writes, otherwise potential loss on coordinator failure pre-replication	0 with synchronous_commit, seconds with async (default)
Split-brain behavior	Prevented by AWS control plane quorum	Possible during partition; LWW resolves on heal, can lose data via clock skew	Possible if failover is misconfigured; STONITH or fencing required
Blast radius of single-node failure	Single partition unavailable ~30-60s, no data loss	~RF replicas absorb load, no client-visible impact at QUORUM	If primary: full write unavailability until promotion; if replica: degraded read capacity
Regional / control-plane failure	Documented: Oct 2025 DNS race condition took down DDB us-east-1 for ~3h, cascaded to EC2/Lambda/NLB for ~15h total	Multi-DC native; LOCAL_QUORUM keeps regional impact local	No native multi-region story; depends on tooling and orchestrator design
Cross-region failover story	Global Tables, active-active, no failover needed	Multi-DC native, LOCAL_QUORUM keeps reads/writes local	No native active-active; requires logical replication or BDR
Silent data loss vector	Operator error on Global Tables (LWW resolution)	Clock skew + LWW: drifting NTP loses every write conflict, undetected for days	XID wraparound: dev-week-scale silent buildup, then total write outage (Mailchimp ~40h)

5. Sharding

Dimension	DynamoDB	Cassandra	PostgreSQL
Sharding model	Hash on partition key, automatic, hidden from user	Consistent hash with vnodes (~256 per node), automatic	Native: none. Citus: hash on distribution column. Native partitioning: range/list/hash on partition key (single-node)
Shard key constraints	Set at write, immutable, max 2KB	Set at write, immutable, designed into table DDL	Citus: chosen at table-create, immutable; partition key: any column
Rebalancing mechanism	Automatic split-for-heat and split-for-size, transparent	Manual or via repair tools, node add/remove triggers rebalance	Citus: shard_rebalancer extension, uses logical replication; native partitioning: manual
Rebalancing cost / impact	Transparent, no client-visible impact (when control plane works)	Network-heavy, can take days for large clusters, throttle-able	Citus rebalance can run online via logical replication; native partition move is manual
Hot-shard behavior	Split-for-heat after 5-10 min sustained load (proven lag in flash spikes); pre-shard via write-suffix	Hot partition saturates one replica set (Discord's exact problem: channel-partitioned messages with skew); manual key redesign required	Hot shard saturates a worker node; queries can be redistributed via Citus rebalancer or app-level
Maximum shards (practical)	Effectively unlimited, partitions scale with table size and throughput	1000+ vnodes per cluster easily; clusters up to thousands of nodes documented (Netflix runs 1T req/day)	Citus: hundreds of shards comfortable; tens of thousands possible but operationally heavy
Resharding without downtime?	Automatic, always online	Yes via repair + decommission, but operationally complex	Citus: yes via logical replication. Native: no, requires app-level migration
Cross-shard query support	Scan (expensive, full-table) or GSI (with its own sharding)	Coordinator scatter-gather, but anti-pattern; design tables to avoid	Citus: parallel query planner does scatter-gather for SELECT; complex JOINs need co-location

6. Replication

Dimension	DynamoDB	Cassandra	PostgreSQL
Replication topology	Leader-follower per partition, single leader	Leaderless (Dynamo paper lineage), peer-to-peer	Single primary, cascading replicas supported
Sync vs async	Sync within region (3 AZs), async cross-region (Global Tables)	Depends on consistency level: sync at QUORUM/ALL, async at ONE	Async by default; sync, remote_apply, remote_write configurable per transaction
Replication factor	3, not configurable	Configurable, typically 3 per DC; per-keyspace setting	Configurable number of replicas, no theoretical cap
Consistency level options	Eventually consistent (default), strongly consistent (2x cost, in-region only, not on GSIs)	ONE / QUORUM / LOCAL_QUORUM / ALL / SERIAL (LWT) — tunable per query	Read-your-writes via primary; replicas have configurable max staleness
Replication lag (typical)	Single-digit ms within region, ~1s cross-region (Global Tables)	Sub-second within DC, 100ms-seconds cross-DC depending on network	10-100ms in-region async; less than 1ms sync remote_write
Conflict resolution	Last-writer-wins by timestamp (Global Tables)	Last-writer-wins by timestamp; vulnerable to clock skew silent loss	No multi-master, so no conflict resolution; BDR extension adds LWW with column-level resolution
Cross-region replication	Global Tables: active-active, eventually consistent	Native multi-DC with NetworkTopologyStrategy	Logical replication across regions; no native active-active
Replication during partition	Minority AZ writes fail, majority continues	Both sides accept writes at CL=ONE; conflicts resolved on heal (LWW)	Async: writes continue on primary, replica falls behind. Sync: writes block
Replication failure modes	Control-plane DNS dependency (Oct 2025 incident)	Hinted handoff overflow under sustained replica down; tombstone tsunami on repair	Replication slot growth (logical) fills disk; physical replica fall-behind

7. Better Usage Patterns

DynamoDB

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Single-table design	Create one table per entity (Users, Orders, Products) like a relational schema	One table with overloaded PK/SK and entity-type prefix (USER#123, ORDER#456)	Co-located access patterns become single queries; new access patterns become key-design decisions, not migrations
Write sharding for hot keys	Use the natural key and hope adaptive capacity catches up (it won't, in time)	Suffix the write key with a random 1-N shard (user_id#shard_0..N), aggregate on read	Pre-shards distribute load before adaptive capacity has a chance to throttle; eliminates flash-spike risk (the 5-10min adaptive capacity lag is a documented kill window)
GSI projection design	Project ALL on every GSI "just in case"	Project only attributes the query uses, plus the keys	GSIs store and bill independently; ALL projections double or triple your storage and write cost
Provisioned + auto-scaling vs on-demand	Stay on on-demand forever because "it just works"	Move to provisioned with auto-scaling once traffic is predictable (after ~3 months)	Provisioned saturated is 6-7x cheaper than on-demand for the same throughput; auto-scaling handles the variance
Conditional writes for idempotency	Use UpdateItem without conditions, rely on retries being idempotent at the app layer	ConditionExpression on every retryable write (attribute_not_exists, version-based)	Prevents the double-write bug that appears once a year in production and ruins your week
Multi-region failure planning	Treat managed = invulnerable, no runbook for regional DDB outage	Build read-failover to a second region via Global Tables; explicit chaos test against regional control-plane failure	Oct 2025 us-east-1 outage was a wake-up call. Most DDB customers had no plan and were down for hours waiting for AWS

Cassandra

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Query-first table design	Design one normalized schema, then try to query it different ways	One table per query pattern, denormalize ruthlessly, write to all tables on update	Cassandra has no query optimizer; the table layout IS the query plan. Wrong table design means linear scans or impossible queries
Compaction strategy selection	Use default STCS for everything	STCS for general, LCS for read-heavy, TWCS for time-series with TTL	Wrong strategy causes 10x read amplification or storage bloat; TWCS for time-series is the difference between working and not
Clock synchronization	Trust the OS NTP defaults	Run chrony or dedicated NTP with sub-ms accuracy, monitor drift as an SLO	LWW conflict resolution silently loses writes when clocks drift; this is the #1 silent data loss vector in Cassandra
Consistency level selection	Use ONE/ONE for performance "because Cassandra is eventually consistent"	QUORUM/QUORUM as default, LOCAL_QUORUM in multi-DC, drop to ONE only with explicit reason	R+W>RF is the correctness threshold; teams that don't enforce it ship race conditions to production
Tombstone management	Use Cassandra as a queue (insert, read, delete pattern)	Use TTL for transient data, design tables append-only where possible, monitor tombstone count	Tombstone scans degrade reads exponentially; Discord's migration was held up at 99.9999% by exactly this issue
Request coalescing for hot partitions	Let multiple users requesting the same data hit the database independently	Add a request coalescing layer (Discord built theirs in Rust) — many users for the same key share one DB read	Hot partition reads multiplied by user concurrency is what killed Discord's Cassandra cluster; coalescing reduces effective fanout

PostgreSQL

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Connection pooling	Each app/service manages its own pool, hit max_connections wall	PgBouncer in transaction mode in front of every Postgres deployment	Postgres caps at ~500-1000 connections without it; PgBouncer multiplexes thousands of clients onto tens of backend connections
VACUUM and autovacuum tuning	Leave autovacuum at defaults, blame Postgres when things slow down	Tune autovacuum_vacuum_scale_factor per-table for high-churn tables; monitor pg_stat_user_tables.n_dead_tup and XID age aggressively	Default autovacuum settings are conservative; high-churn tables accumulate dead tuples faster than vacuum runs. Mailchimp's ~40h outage and Sentry's emergency vacuuming both traced to this pattern
XID wraparound monitoring	Trust autovacuum to handle it, alert on slow queries instead	Alert on `SELECT max(age(datfrozenxid))` exceeding 1.5B (well before the 2B emergency threshold)	XID wraparound develops over days/weeks with zero symptoms then halts all writes. This single alert prevents the worst Postgres failure mode
Index strategy	Add indexes for every slow query, never remove any	Quarterly `pg_stat_user_indexes` audit, drop indexes with zero scans, use partial indexes for selective conditions	Each unused index slows down every write and bloats the buffer cache; typical prod systems carry 30%+ dead-weight indexes
Long-running transactions	Wrap big batch jobs in a single transaction "for atomicity"	Chunk into smaller transactions with checkpoint table; use advisory locks for coordination	Long transactions block VACUUM from cleaning dead tuples, cause bloat to spiral, accelerate XID age. Duffel's 2021 outage traced to exactly this pattern (DDL statements without timeouts)
Replica usage for reads	Send all reads to primary "for consistency"	Route reads to replicas with explicit staleness tolerance; primary handles writes and strongly-consistent reads	Read-replica routing typically removes 60-80% of primary load; teams that don't do this scale vertically until they can't

8. Advanced / Next-Gen Alternatives

DynamoDB

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
ScyllaDB / ScyllaDB Cloud (Alternator)	Wire-compatible DDB API at lower cost, no per-partition WCU ceiling, multi-cloud	Production at scale	Low for basic API, high if using Streams or Transactions	Cost at scale becomes prohibitive ($50K+/mo), regional dependency risk became unacceptable post-Oct-2025, or multi-cloud mandate
Aurora DSQL	Strongly consistent multi-region writes with SQL surface and ACID	GA but young	High — full schema rewrite, SQL access patterns	You need multi-region writes AND ACID transactions, and can wait out the maturity curve
FoundationDB	Strict serializability with multi-key transactions across the entire dataset	Production (Apple iCloud, Snowflake)	Very high — different data model, requires layer design	Correctness ceiling matters more than ops cost (financial, ledger, regulated workloads)
TigerBeetle (for financial)	Strict serializable double-entry accounting at 1M+ TPS	Early production	Very high — purpose-built, not a general database	You're building a ledger / payment system and DDB's eventual consistency on indexes becomes a correctness problem

Cassandra

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
ScyllaDB	Same data model and CQL, 3-10x throughput per node, no JVM GC pauses (Discord: 177→72 nodes, p99 reads 40-125ms→15ms)	Production at scale (Discord, Numberly)	Low — wire-compatible with Cassandra driver	Tail latency or operational cost of Cassandra has become unacceptable, you want to keep the data model. This is the default modern answer for self-hosted wide-column.
Cassandra 5.x with Accord (transactions)	Native multi-partition strict-serializable transactions via Accord protocol	Emerging (5.x GA, Accord ongoing)	None if you're already on Cassandra	You want Cassandra's scale but have been blocked by lack of multi-partition transactions
YugabyteDB / CockroachDB	Strongly consistent distributed SQL with global transactions	Production	High — different query model (SQL vs CQL), strong vs eventual consistency reshapes app logic	You need horizontal scale AND ACID AND SQL — Cassandra was the wrong original choice
DynamoDB (managed)	Removes the entire ops burden (compaction, repair, GC tuning)	Production	High — different API, vendor lock-in, regional dependency risk	You picked Cassandra for scale but the 3-SRE ops cost exceeds the value; you're already AWS-native and accept the trade-off

PostgreSQL

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
Citus (extension)	Horizontal sharding for Postgres, keeps SQL surface	Production at scale	Medium — requires distribution-key design, some query rewrites	Single-primary write ceiling reached, data partitions naturally by tenant or workspace
Neon / Aurora serverless	Separates compute from storage, instant branching, auto-scale	Production	Low — wire-compatible Postgres	Variable workload, dev/prod parity matters, want to scale compute independent of storage
CockroachDB / YugabyteDB	Horizontal scale with strong consistency, Postgres-wire-compatible	Production	Medium — wire compat but different operational model and some SQL features differ	Need active-active multi-region with strong consistency, Postgres can't provide it natively
OrioleDB / Postgres 17+ undo-log storage	Eliminates VACUUM via undo-log MVCC, reduces bloat, eliminates XID wraparound risk	Emerging	Low at maturity — Postgres-compatible	Watch this space; if it lands stable, the entire VACUUM tuning burden becomes optional, not mandatory. This is the most exciting Postgres direction in a decade.

Best default choices

Search and compare

1. Trade-Offs

DynamoDB

Cassandra

PostgreSQL

2. Use Cases

DynamoDB

Cassandra

PostgreSQL

3. Limitations

4. Fault Tolerance

5. Sharding

6. Replication

7. Better Usage Patterns

DynamoDB

Cassandra

PostgreSQL

8. Advanced / Next-Gen Alternatives

DynamoDB

Cassandra

PostgreSQL