Collaborative State Sync — PE Trade-Off Analysis

Five approaches to convergence under concurrent edits: CRDTs, OT, Differential Sync, Event Sourcing, and Three-Way Merge.

Distributed State Convergence

As of 2026-06-01

PE Verdict

There is no winner across the board. CRDTs win local-first and offline-first because they don't need a server to converge; OT wins server-mediated rich text because it produces tighter intent preservation than CRDTs at the cost of central authority; LWW (a degenerate CRDT) wins property-level schemas like Figma's; Event Sourcing wins audit/replay/time-travel because state is derived; Three-Way Merge wins async branching because it surfaces conflicts to humans instead of resolving them silently. The PE skill is naming which property of the workload selects the algorithm before you write a line of code, because retrofitting is a 6-month rewrite, not a refactor.

Best default choices

CRDT
Conflict-free Replicated Data Type. State-based (CvRDT) merges via idempotent join; op-based (CmRDT) propagates commutative operations. Strong eventual consistency without coordination.
OT
Operational Transformation. Concurrent operations transformed against each other so they produce the same final state regardless of arrival order. Requires central server in practice (TP2 is hard).
Differential Sync
Neil Fraser's algorithm: each peer keeps a shadow copy and diffs local edits against it; the resulting patches are applied on the other side with fuzzy matching. Powered the original Google Docs; Google Wave, by contrast, was built on OT from the start.
Event Sourcing
Append-only log of immutable events; current state is a fold over the log. Conflict resolution is a projection/application concern, not a storage concern.
Three-Way Merge
Git-style merge using common ancestor: changes from base→A and base→B are combined; structural overlap surfaces as conflict for human resolution.
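
To make the "idempotent join" in the CRDT definition concrete, here is a minimal sketch of a state-based grow-only counter (G-Counter). The class and replica names are illustrative assumptions, not taken from any particular library.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one monotonically increasing slot per replica."""
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, replica: str, amount: int = 1) -> None:
        # A replica only ever bumps its own slot.
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Join = per-replica maximum: commutative, associative, idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys})


a, b = GCounter(), GCounter()
a.increment("a", 2)   # replica "a" counts two events while offline
b.increment("b", 3)   # replica "b" counts three

ab = a.merge(b)
ba = b.merge(a)
```

Merging in either order, or merging the same state twice, gives the same result, which is why no coordinator is needed for this data type.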

1. Trade-Offs

One table per technology. Each row is a real "give up X to get Y" decision.

CRDTs

Use when offline edits and multi-master convergence matter more than strict global invariants or minimal metadata.

Math-first: convergence is a property of the data type, not the network.

Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance
Commutative ops for no central authority | Peers converge without a coordinator. Works offline, P2P, multi-master across regions. | Every op carries causal metadata (vector clock, Lamport ID, or position identifier). Payload bloats. | A 50KB doc with 100K edits balloons to MB of CRDT state. Yjs is the lightest, still pays this. | The metadata-to-content ratio is the dirty secret. Yjs compresses it brilliantly via varint encoding; Automerge 1.x didn't and got destroyed in benchmarks. Always benchmark with realistic edit history, not initial state.
Tombstones for delete safety | A concurrent insert "before" a deleted element resolves cleanly. Convergence guaranteed. | Deleted content stays in the data structure forever (or until GC with quiescence guarantees). | User pastes and deletes a 10MB block; the doc never shrinks back. Worst with paste-heavy workloads. | GC requires global quiescence ("all peers have seen the delete"), which is impossible to prove in true P2P. Server-mediated CRDTs cheat by treating server as oracle. Yjs uses "snapshots" + structural sharing to mitigate, not eliminate.
Position identifiers over indices for text | No transformation needed when inserting concurrently. Insert at position(A,B) survives any reorder. | Variable-length identifiers grow with edit history: LSEQ aims for sub-linear growth, while Logoot can grow linearly in the worst case. RGA keeps fixed-size IDs but pays in tombstones instead. | Long-lived documents accumulate identifier depth, slowing local operations from O(log n) toward O(n). | YATA (Yjs) and Fugue (Loro) are the modern answer to this. RGA-style algorithms had a "stuck at the same position" interleaving bug that Fugue formally fixes. If you cite Yjs in an interview, know YATA.
Strong eventual consistency (SEC) | If two replicas have seen the same set of ops, their state is identical. No coordination required. | No linearizability, no transactions across keys, no "read your write" without local-only reads. | Counter-style invariants ("balance >= 0") cannot be enforced. You can converge to negative balance. | SEC is weaker than people think. Real systems pair CRDTs with a coordinator for invariant-critical ops, defeating the original purpose. Honest answer: CRDTs solve UI state, not money.
Per-data-type algorithm | Set, counter, map, register, sequence, tree each have specialized provably-correct algorithms. | Mixing types is hard. A "map of lists" is two CRDTs composed, and composition often loses guarantees. | You add a "move list item" feature and discover none of the existing tree CRDTs handle it without cycles. | The "moveable tree" problem was unsolved until Kleppmann's 2021 paper. Loro implements it; Automerge does too. If your app has hierarchical drag-and-drop, this is decisive.
Op-based delivery requires reliable causal broadcast | Smaller payloads than state-based. Diff is just the new op. | Must guarantee at-least-once + causal order. Lost op = divergence forever. | WebRTC datachannel drops a packet; without app-level resend, replicas silently diverge. | In practice, "op-based" CRDTs in production (Yjs, Automerge) are actually delta-state hybrids: they recover from missing ops by exchanging state vectors. Pure op-based is a textbook fiction.
No central server | P2P sync via WebRTC, BLE, sneakernet. Offline-first becomes natural. | Access control, persistence, presence, undo all become application-layer problems. | "Undo my last edit" is local-only by default; collaborative undo is a research problem. | Yjs has y-protocols for awareness/presence as separate concerns. Automerge punts entirely. Local-first manifesto is honest about this: convergence is solved; everything else around it is still hard.
Memory-resident operation log | Snappy local edits, time-travel, undo all become free or near-free. | Memory grows with edit count. Lazy loading is hard without breaking causal references. | A 5-year-old Notion doc, opened on mobile, OOMs the WebView. | Yjs sub-documents are the production answer. Treat the doc as a forest of small CRDTs and lazy-load. Anyone building a multi-MB CRDT and not using sub-docs has not hit prod yet.
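
The SEC row's claim that "balance >= 0" cannot be enforced is easy to reproduce. The sketch below assumes a hand-rolled PN-Counter and a hypothetical `try_withdraw` helper; the replica names and amounts are made up for illustration.

```python
from typing import Dict


class PNCounter:
    """Two grow-only maps: total added and total removed, per replica."""

    def __init__(self) -> None:
        self.added: Dict[str, int] = {}
        self.removed: Dict[str, int] = {}

    def add(self, replica: str, amount: int) -> None:
        self.added[replica] = self.added.get(replica, 0) + amount

    def subtract(self, replica: str, amount: int) -> None:
        self.removed[replica] = self.removed.get(replica, 0) + amount

    def value(self) -> int:
        return sum(self.added.values()) - sum(self.removed.values())

    def merge(self, other: "PNCounter") -> "PNCounter":
        out = PNCounter()
        pairs = ((self.added, other.added, out.added),
                 (self.removed, other.removed, out.removed))
        for mine, theirs, dest in pairs:
            for key in mine.keys() | theirs.keys():
                dest[key] = max(mine.get(key, 0), theirs.get(key, 0))
        return out


def try_withdraw(account: PNCounter, replica: str, amount: int) -> bool:
    # The invariant check only sees this replica's local view.
    if account.value() < amount:
        return False
    account.subtract(replica, amount)
    return True


seed = PNCounter()
seed.add("bank", 100)

# Two devices start from the same synced state, then go offline.
phone = seed.merge(PNCounter())
laptop = seed.merge(PNCounter())

ok_phone = try_withdraw(phone, "phone", 80)     # passes locally
ok_laptop = try_withdraw(laptop, "laptop", 80)  # also passes locally

merged = phone.merge(laptop)
```

Both local checks succeed, the replicas converge, and the converged balance is negative. Convergence held; the invariant did not, which is the reason invariant-critical operations end up behind a coordinator.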

Operational Transformation (OT)

Best when a central server can serialize document operations and preserving editor intent beats offline-first autonomy.

Server-mediated transformation: ops change shape as they cross other ops.

Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance
Transformation over commutativity | Tight intent preservation. Concurrent "insert at 5" and "insert at 5" produce 2 inserts at distinct positions, not one collapsed. | Each new operation type needs transform functions against every existing op type, so the number of functions grows quadratically with the op set. | Adding a "table" op to a text editor adds a full row and column to the transform matrix and forces re-verification of existing pairs. | The TP2 property (transforming an op against two concurrent ops gives the same result in either order) is famously violated by many published transform functions. Server-ordered systems such as Google Wave sidestep it by needing only TP1. CRDTs avoid this entirely.
Central server for serialization | Linear op log, easy persistence, easy access control, easy presence, easy invariants. | Single point of failure for the doc. Offline is hard. Multi-region writes require careful design. | Server goes down during a meeting; nobody can collaborate even if they're on the same LAN. | "OT requires a central server" is an oversimplification: decentralized algorithms such as adOPTed and GOTO exist, but they depend on TP2 and the complexity is brutal. Jupiter and SOCT4 avoid TP2 by relying on a central server or sequencer. Production OT is always star-topology. Google Docs proved this.
Linear history for free | Every op has a server-assigned sequence number. Audit, replay, undo/redo all trivial. | Latency of every op = round trip to server. Optimistic local apply + reconciliation papers over this. | High RTT clients (300ms+) see jitter as the server rewrites their local state on every reconcile. | Google Docs hides this with an aggressive optimistic UI and a reconciliation queue. The hidden cost: client-side state machine for "in-flight ops" that has to be perfect, or characters duplicate / vanish.
Smaller wire payloads than CRDT state | An "insert 'a' at position 5" is ~10 bytes. No vector clock, no position ID. | Server-side history grows linearly with ops. Compaction requires care. | A 10-year-old Google Doc with millions of ops needs server-side snapshots to load fast. | OT wire format is the lightest of all approaches by far. This is why Google Docs scales to massive docs without client-side OOM. CRDTs can't match this without giving up offline.
Operation-typed schema | Each op has explicit semantics (insert, delete, retain, format). Schema migrations are op-version changes. | Evolving the op set is invasive. Old clients can't apply new ops; new clients can't transform against old ops without history. | Releasing a new "comment" op type and finding clients pinned to old versions in mobile app stores. | Forward-compatibility is the silent killer of OT systems. CRDT systems treat unknown ops as opaque blobs and forward; OT must transform, so unknown ops are fatal.
Inversion is straightforward | Each op has an inverse (insert ↔ delete with same position). Undo is just apply the inverse. | Collaborative undo requires transforming the inverse through subsequent ops, which can change meaning. | User edits, peer edits in the same region, user undoes; undo now deletes the peer's content. | Selective undo (undo my own op but not others) is the actual hard problem. Google Docs solved this; most OT implementations don't, and users notice immediately.
Proven for text and rich text | 35+ years of research (OT dates to 1989), multiple production systems, well-understood for sequence types. | Extending to JSON, trees, sets, counters is mostly DIY and rarely as elegant as CRDT versions. | You start with text OT and add a "comments thread"; now you're inventing OT primitives for tree-of-text. | ShareDB's json0 is the canonical OT-for-JSON, and even its maintainers acknowledge it's complex. If your doc is anything but rich text, OT pays a heavy tax.
Tightly server-coupled persistence | Persist the op log; replay rebuilds any state. Snapshots are an optimization, not a requirement. | Storage scales with edit volume, not document size. Hot docs in viral threads become storage hot spots. | A doc with 1M edits costs more to load than a 100MB blob even though the rendered text is 50KB. | Google Docs runs periodic snapshot + truncate. This is operational, not algorithmic. Production OT is always "OT + snapshot store", a fact most OT papers omit.
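
The "insert at 5 and insert at 5" case from the first row can be sketched as a single transform function. This is a minimal illustration covering insert-versus-insert only, with a site-ID tie-break chosen here as an assumption; a real editor also needs delete, retain, and format pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document the author saw
    text: str
    site: str   # author ID, used only to break ties


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    lands_first = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_first:
        # `against` inserted text at or before our position: shift right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


doc = "abcdefgh"
a = Insert(5, "X", "alice")
b = Insert(5, "Y", "bob")

# Each site applies its own op first, then the transformed remote op.
via_a_first = apply(apply(doc, a), transform(b, a))
via_b_first = apply(apply(doc, b), transform(a, b))
```

Both application orders end with the same document and both inserts survive at distinct positions. That equality for a single pair of concurrent ops is TP1; TP2 concerns three or more concurrent ops and is where decentralized OT gets hard.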

Differential Sync

Choose when sync intervals and fuzzy patches are acceptable, and repairing drift is simpler than modeling every operation.

Diff/patch over fuzzy matches: collapses divergence, doesn't prevent it.

Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance
Stateless diff over op streams | No causal metadata. Each sync round is a plain text diff + patch. | No intent preservation. Concurrent inserts at the same offset merge into garbage. | Two users typing different sentences at the cursor position simultaneously; result is character-level interleaved nonsense. | This is why Google Docs moved to OT (Google Wave was built on OT from the start). Diff-sync is "best effort convergence", which is fine for prose drafts and a disaster for code.
Shadow copy per peer | Diffs are computed locally against a known baseline. No global op log needed. | 2x memory per peer (live + shadow). Hub model means O(N) shadows on server for N peers. | A 100-user collab session needs 100 shadows on the hub; large docs OOM the server. | Fraser's original paper acknowledges this; production deployments use shadow eviction with re-sync on reconnect. Nobody runs raw diff-sync at scale anymore.
Fuzzy patch application | Patches apply even when context has shifted. Robust to mild divergence. | "Best effort" matching can place edits in semantically wrong locations. Silent corruption. | You meant to fix a typo on line 5; your patch matches similar text on line 47. | diff-match-patch's Bitap algorithm is genuinely brilliant for the use case (CodeMirror still uses it), but it's not a sync primitive. Reframe it as a fallback merge tool, not a sync engine.
Symmetric protocol | Same algorithm on client and server. Easy to implement, easy to reason about. | No notion of authoritative state. Conflicts resolve by "whoever syncs last wins per region". | Mobile client offline for 2 days, syncs back, sees 50 conflict edits and silently picks one branch. | The simplicity is seductive but deceptive. Symmetric protocols hide which side is authoritative, which is fine until you need to debug a divergence in prod and have no causal record.
Patch as transport-friendly format | Diffs compress well. Easy to send over any transport, including non-realtime (email, polling). | Larger than op-based for fine-grained edits. A single-char insert ships ~50 bytes of patch context. | High-frequency keystroke sync becomes bandwidth-heavy compared to OT or Yjs ops. | Diff-sync was designed for periodic sync (every 1-2s), not per-keystroke. Forcing it into realtime is a category error people make trying to extend it.
Best-effort convergence, not guaranteed | In the common case (low concurrency, similar baselines), converges fast with minimal state. | No formal guarantee. Pathological histories diverge silently or produce nonsense. | Your QA scripts can't reproduce a divergence bug because it depends on packet timing. | CRDTs and OT have formal proofs of convergence. Diff-sync has none. For docs where corruption is annoying (notes), fine. For docs where corruption costs money (contracts), no.
Domain-agnostic via text representation | Works on anything serializable to text. JSON, code, prose all share the same engine. | Structural changes (move JSON key, reorder list) become character-level diffs and merge poorly. | Two users reorder list items differently; merged result has duplicated and reordered items. | This is why structural CRDTs (Automerge, Yjs Map) exist. Treating structured data as text is the path to corrupted JSON in your DB. Use diff-sync only on opaque text.
Convergence requires bidirectional flow | Both sides exchange diffs each cycle. Symmetric and simple. | Long offline periods break the shadow invariant; reconnection requires expensive full-doc reconciliation. | Field worker offline for a week, server has evicted shadow, reconnect downloads full doc + 3-way merge. | This is the unsolvable failure mode. CRDTs handle long offline natively because they don't depend on a shared baseline. Diff-sync is fundamentally an online-leaning algorithm.
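
The "patch lands in the wrong place" failure from the fuzzy-patch row can be reproduced with a deliberately naive matcher. This sketch is an assumption-laden stand-in, far simpler than diff-match-patch's Bitap matching: it locates a patch by a few characters of leading context and picks the hit closest to the offset the sender expected. `Patch`, `make_patch`, and `apply_patch` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    offset: int     # where the context started in the sender's baseline
    context: str    # a few characters preceding the edit
    removed: str
    inserted: str


def make_patch(old: str, new: str, context_len: int = 5) -> Patch:
    # Single-edit diff: strip the common prefix and suffix.
    p = 0
    while p < min(len(old), len(new)) and old[p] == new[p]:
        p += 1
    s = 0
    while s < min(len(old), len(new)) - p and old[len(old) - 1 - s] == new[len(new) - 1 - s]:
        s += 1
    start = max(0, p - context_len)
    return Patch(start, old[start:p], old[p:len(old) - s], new[p:len(new) - s])


def apply_patch(text: str, patch: Patch) -> str:
    needle = patch.context + patch.removed
    hits = []
    i = text.find(needle)
    while i != -1:
        hits.append(i)
        i = text.find(needle, i + 1)
    if not hits:
        raise ValueError("patch context not found")
    # "Fuzzy" placement: take the match nearest the expected offset.
    best = min(hits, key=lambda h: abs(h - patch.offset))
    cut = best + len(patch.context)
    return text[:cut] + patch.inserted + text[cut + len(patch.removed):]


baseline = "intro\nline with teh typo\n"
fixed = "intro\nline with the typo\n"
patch = make_patch(baseline, fixed)

# Meanwhile the other peer prepended text that happens to contain the same words.
remote = "quote: 'with teh' stays verbatim\n" + "padding line\n" * 5 + baseline
result = apply_patch(remote, patch)
```

Against the unchanged baseline the patch applies correctly. Against the shifted remote text it "succeeds" too, but edits the quoted line near the top and leaves the real typo untouched, with no error raised.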

Event Sourcing

A strong fit for audit/replay/time-travel domains where writes are immutable facts and projections own conflict behavior.

State is a fold. Conflict resolution lives in the projection.

Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance
Append-only log over mutable state | Full audit trail, time travel, replayable projections, natural fit for Kafka/Kinesis/EventStore. | Read path requires fold or materialized projection. Naive reads are O(events). | A 5-year-old aggregate with 200K events takes 30s to rehydrate on first read. | Snapshots + rolling projections are mandatory in prod, but they're an operational concern (when to snapshot, how to invalidate). Every event-sourced system reinvents this and most do it wrong the first time.
Schema evolution via event versioning | Old events stay immutable; upcasters bring them forward. Strong forward compatibility. | Upcasters accumulate. A 5-year system has chains of v1→v2→v3→v4 transforms for every event. | You change a field name in v7, and v1 events have to upcast through six versions to render. | Greg Young's rule: "events are immutable, the schema is forever". Most teams underestimate. Define a deprecation policy and a "weaver" pattern before you have 50 event types, not after.
Per-aggregate consistency boundary | Strong consistency within an aggregate via single-writer pattern. Easy to reason about. | Cross-aggregate transactions require sagas, process managers, or eventually consistent projections. | You discover "user moves item from cart to wishlist" spans two aggregates and needs saga, mid-sprint. | Aggregate boundaries are the most important design decision in event sourcing and the one that gets revisited 3 years in when invariants don't match. DDD-driven design is non-negotiable.
Conflict resolution at projection time | Different projections can resolve the same events differently. "Last-write-wins view" vs "history view". | No single source of truth for "current state". Projections drift, debugging requires log replay. | Two services have different projections of the same stream and start producing contradictory outputs to downstream. | This is genuinely powerful and genuinely dangerous. Production discipline: one canonical projection per consumer, with idempotency and exactly-once consumption (Kafka EOS or de-dup keys).
Idempotent event handlers | Replays are safe. Catch-up consumers reach the same state as live ones. | Handlers must be carefully designed. Side effects (emails, payments) require outbox or de-dup keys. | "Replay this projection from event 0" sends 500K welcome emails because the email handler isn't idempotent. | The outbox pattern is the standard answer for "side effects in event handlers". If you're event-sourcing without outbox + de-dup, you're one bad day away from a customer-facing incident.
Concurrency via optimistic versioning | Expected version on append rejects stale writes. Mathematically clean. | High-contention aggregates retry constantly. UX must handle "your version is stale, retry" without confusing users. | Black Friday: 1000 concurrent writes to a single inventory aggregate, 95% retry rate, latency spikes. | The mitigation is splitting the aggregate (per-SKU, per-warehouse) or moving to event-driven CQRS where commands are queued. Both are real architectural changes, not config.
Event log as integration contract | External consumers tail the log. Decoupled, scalable, ordering preserved. | Event schemas leak internal model to consumers. Refactoring is breaking the API. | You rename an internal aggregate and discover three external teams have consumers reading the old event names. | Public events ≠ internal events. Use the "event translator" pattern (Eric Evans, anticorruption layer) and never let internal events out to consumers. CDC tools (Debezium) make this mistake easy.
Storage cost over compute cost | Cheap to write (sequential append), parallel-readable. Object storage friendly. | Storage grows monotonically. Compaction is non-trivial because events are the source of truth. | 3 years in, you have 10TB of events and reads through the projection are slow on cold storage. | Kafka's tiered storage and EventStoreDB's scavenging address this, but each has gotchas. The PE move is to architect retention from day one: hot/warm/cold, what's compactable, what's regulatory-locked.
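
Two rows above, "state is a fold" and "expected version on append rejects stale writes", fit in one short sketch. The in-memory `EventStore`, the `ConcurrencyError` type, and the inventory event names are assumptions for illustration; they do not mirror any specific product's API.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (event_type, quantity)


class ConcurrencyError(Exception):
    """Raised when the caller's expected version is stale."""


class EventStore:
    def __init__(self) -> None:
        self._streams: Dict[str, List[Event]] = {}

    def append(self, stream_id: str, expected_version: int, events: List[Event]) -> int:
        stream = self._streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {len(stream)}"
            )
        stream.extend(events)
        return len(stream)  # new version

    def read(self, stream_id: str) -> List[Event]:
        return list(self._streams.get(stream_id, []))


def stock_on_hand(events: List[Event]) -> int:
    # Current state is a left fold over the log.
    level = 0
    for kind, qty in events:
        if kind == "Stocked":
            level += qty
        elif kind == "Reserved":
            level -= qty
    return level


store = EventStore()
version = store.append("sku-1", 0, [("Stocked", 10)])
version = store.append("sku-1", version, [("Reserved", 3)])

# A second writer that read the stream at version 1 now tries to append.
try:
    store.append("sku-1", 1, [("Reserved", 5)])
    stale_rejected = False
except ConcurrencyError:
    stale_rejected = True
```

The stale writer has to re-read, re-check its business rule, and retry. Under heavy contention on a single aggregate that retry loop is the cost the "Black Friday" cell describes.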

Three-Way Merge

Default for async branching when surfacing conflicts to humans is safer than silently inventing an automatic merge.

Common ancestor anchor: surface conflicts to humans, don't auto-resolve.

Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance
Common-ancestor anchoring | Distinguishes "A changed, B didn't" from "both changed". No false conflicts. | Requires storing or computing the merge base. Histories must be navigable (DAG). | Diverging branches with no common ancestor (rebased out, cherry-picked) lose merge fidelity. | This is the conceptual leap over diff-sync: knowing the base reveals intent. Git's recursive merge handles multiple bases. Without a base, you're back to diff-sync's guessing.
Human conflict resolution over auto-merge | Semantic conflicts surface for human judgment. No silent corruption. | Real-time UX is impossible. Workflow requires explicit merge step. | Anything that needs sub-second collab (Figma, Docs) is the wrong fit. | 3-way merge is async by design. Forcing it sync (continuous merge bots) defeats the whole point. Match the algorithm to the workflow: code review = async = 3-way merge.
Line-based or hunk-based diffing | Fast, robust, language-agnostic. Decades of tooling. | Whitespace, formatting, and structural changes generate spurious conflicts. | One side reformats with prettier, the other adds a function; 100% conflict on every line touched. | Structural (AST-aware) merge tools (mergiraf, semantic-merge) solve this but break on minor syntax variation. Most teams default to text-merge + lint-on-merge, which is a sane compromise.
DAG-based history model | Branches and merges are first-class. Bisect, blame, revert all work. | Storage and reasoning complexity. Beginners struggle with rebase vs merge. | Onboarding cost. New engineers cause incidents with bad force-pushes. | Git's UI is famously bad for the DAG underneath. Tools like Graphite, Jujutsu, Sapling are attempts to fix this without changing the model. The model itself is correct.
Two-sided commutativity, not three-sided | A merged with B = B merged with A in most cases. Predictable. | When more than two branches converge (octopus merge), behavior is undefined for conflicts. | Octopus merge with conflicts: Git refuses, you serialize the merges, history becomes unintuitive. | 3-way merge is fundamentally a pairwise op. Real "N-way merge" is a research topic (Darcs patch theory, Pijul). For most teams, sequential pairwise merges work fine and are easier to debug.
Reversibility via revert commits | A change can be undone safely as a new event in history. Full audit. | Revert ≠ rollback. History grows; revert of revert is fragile. | You revert a problematic deploy, then realize part of it was correct; cherry-picking back is messy. | git revert is fine for forward-rolling fixes. For "undo everything since X", you want force-push (lossy) or a new branch (creates parallel history). Choose explicitly, not by accident.
Content-addressable storage | Deduplication via SHA. Tree object reuse across commits. Efficient at scale. | Storage growth is bound by content, but metadata (commits, refs) still grows with edit count. | Monorepos with 10M files and 1M commits stress even Git's CAS; LFS / partial clone are mandatory. | Git scales surprisingly well, but only with Scalar/partial-clone/sparse-checkout. Anyone with a monorepo above ~100GB is operating Git as a database, not a VCS. Meta and Google moved to Mercurial/custom for this reason.
Explicit branching workflow | Reviewable units of change. PR culture. Forces social negotiation of conflicts. | Friction. Branch lifetime > 1 week = pain. Continuous integration is a discipline, not a tool. | Long-lived feature branches → 3000-line merge with 200 conflicts → review skipped, bugs land. | Trunk-based development is the answer. 3-way merge works best when used least. The PE move is to design the workflow so 3-way merge rarely needs to resolve real conflict.
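
The common-ancestor rule from the first row ("A changed, B didn't" versus "both changed") can be shown on key-value data instead of text lines. This is a simplified sketch over flat dictionaries, assuming string keys; Git's actual merge works on trees and line hunks.

```python
from typing import Any, Dict, List, Tuple

_MISSING = object()  # sentinel for "key absent on this side"


def three_way_merge(
    base: Dict[str, Any], ours: Dict[str, Any], theirs: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    merged: Dict[str, Any] = {}
    conflicts: List[str] = []
    for key in sorted(base.keys() | ours.keys() | theirs.keys()):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:
            result = o          # both sides agree (including both deleted)
        elif o == b:
            result = t          # only theirs changed
        elif t == b:
            result = o          # only ours changed
        else:
            conflicts.append(key)  # both changed differently: ask a human
            continue
        if result is not _MISSING:
            merged[key] = result
    return merged, conflicts


base = {"a": 1, "b": 2, "c": 3}
ours = {"a": 1, "b": 20, "c": 3, "d": 4}   # changed b, added d
theirs = {"a": 10, "b": 21}                # changed a and b, deleted c

merged, conflicts = three_way_merge(base, ours, theirs)
```

Without `base`, the deletion of `c` and the addition of `d` would be indistinguishable from each other; with it, only `b` needs human attention.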

2. Use Cases

Real workloads where each technology was chosen, why it won, and what ruled out the obvious alternative.

CRDTs


Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative
Local-first note-taking with full offline sync | Anytype, Obsidian Sync, Linear (CRDT-inspired sync engine) | Zero-coordination offline edits that converge on reconnect, with multi-day offline tolerance | 100K+ users, multi-device per user, days of offline edits | OT requires a server in the merge path; you can't make progress offline without one
Real-time collaborative rich text in editors | Yjs in Notion AI Blocks, Discourse, JupyterLab, ProseMirror integrations | Sub-100ms keystroke convergence across >50 concurrent editors per doc with no central authority required | Yjs handles docs with millions of ops, 100s of concurrent peers via y-websocket | OT's central server becomes the bottleneck at 100+ peers per doc; CRDT scales via gossip
Design tool object-level state sync | Figma (LWW-Register style, property-level) | Atomic property updates where conflicts on different fields of the same object don't fight | Millions of objects per file, 100+ concurrent collaborators on flagship files | Character-level CRDTs are overkill for "set fill color"; text CRDTs would balloon doc size 10x
P2P sync without a backend | Local-first apps using Automerge over WebRTC, Holepunch, BLE mesh apps | True P2P convergence with no infrastructure: no server, no relay, no cost-per-user backend | Small teams <20 peers, mesh latency <500ms, doc sizes <10MB | OT and diff-sync both require trusted central state; CRDT is the only mathematically-clean option
Multi-region active-active datastore conflict resolution | Riak (LWW + Riak DT), Redis CRDTs (Active-Active in Redis Enterprise), Cosmos DB multi-master | Cross-region writes that converge without coordination round-trips at the storage layer | Geo-distributed writes at 1M+ ops/sec across 3+ regions | Paxos/Raft cross-region adds 100ms+ latency; CRDTs converge async with no consensus call
Offline-tolerant field operations tools | Survey apps, field service apps, healthcare EMR with intermittent connectivity | Per-device durability with merge on reconnect, weeks of disconnection tolerated | 10K-100K field devices, 100+ MB local state, weekly sync windows | Diff-sync's shadow eviction breaks at multi-day offline; CRDT's metadata is the durable price paid for this
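
The property-level last-writer-wins approach in the design-tool row can be sketched as a map of LWW registers keyed by (object, property). This is an illustrative model only, not Figma's implementation; the integer timestamps and the client-ID tie-break are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

Key = Tuple[str, str]  # (object_id, property_name)


@dataclass(frozen=True)
class Write:
    value: Any
    timestamp: int
    client: str


class LWWMap:
    def __init__(self) -> None:
        self.entries: Dict[Key, Write] = {}

    def set(self, obj: str, prop: str, value: Any, timestamp: int, client: str) -> None:
        self._offer((obj, prop), Write(value, timestamp, client))

    def get(self, obj: str, prop: str) -> Any:
        write = self.entries.get((obj, prop))
        return None if write is None else write.value

    def merge(self, other: "LWWMap") -> None:
        for key, write in other.entries.items():
            self._offer(key, write)

    def _offer(self, key: Key, write: Write) -> None:
        current = self.entries.get(key)
        # Highest (timestamp, client) wins; client ID breaks timestamp ties.
        if current is None or (write.timestamp, write.client) > (current.timestamp, current.client):
            self.entries[key] = write


alice, bob = LWWMap(), LWWMap()
alice.set("rect1", "fill", "red", timestamp=1, client="alice")
bob.set("rect1", "x", 40, timestamp=1, client="bob")           # different property: no conflict
bob.set("rect1", "fill", "blue", timestamp=2, client="bob")    # same property: later write wins

alice.merge(bob)
bob.merge(alice)
```

Edits to different properties of the same object never collide, and the only metadata kept is one timestamp and client ID per property, which is why this is so much lighter than a character-level CRDT.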

Operational Transformation (OT)


Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative
Cloud-first collaborative text editing | Google Docs, Etherpad, ShareDB-based apps | Tight intent preservation for character-level concurrent edits with minimal wire payload | Docs with 10M+ ops; Google Docs caps simultaneous editing at 100 users per doc | CRDT metadata per character would bloat docs that already strain storage; OT op log is leaner
Centralized collaborative code editing | Replit Multiplayer, CodeMirror collab via ShareDB, Codeshare | Server-authoritative state for integration with run/build pipelines that need a canonical version | 1K+ rooms, 10s of editors per room, sub-200ms keystroke latency | CRDTs work, but compile/run integration is easier when a server-side authoritative state is already in place
Spreadsheet collaboration with formula recalc | Google Sheets (OT-based for cell content) | Server-authoritative recalc on dependent cells; deterministic computation order | 10K cells per sheet, dozens of concurrent editors, formula chains of 1K+ deps | CRDT for cell values is fine, but formula evaluation order needs serialization; OT gives you that for free
Real-time presentation / whiteboard editing (text-heavy) | Google Slides, Etherpad whiteboards with text-first model | Intent preservation on text within shapes, server-mediated sync for <100 collaborators | 100s of slides per deck, 50 concurrent editors, mixed text/object operations | For pure text fields, OT's tighter intent preservation beats LWW + character CRDT hybrid
Enterprise wiki / structured doc collab | Atlassian Confluence (Synchrony, OT-based) | Server-authoritative version with audit trail and compliance hooks; central locking when needed | Enterprise scale: 100K+ docs, regulatory audit requirements | CRDTs don't naturally support enterprise lock-down (force checkout, supervised edits); OT's centralization fits

Differential Sync


Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative
Early Google Docs | Early Google Docs, Fraser's MobWrite | Simple stateless sync over high-latency connections; minimal client implementation | Used briefly at Google scale before migration to OT for correctness reasons | OT existed but felt heavyweight in 2006; diff-sync was a deliberate "ship now, fix later" choice
Periodic sync for collaborative draft documents | Lightweight CMSs, draft markdown sync in static site generators | 1-5 second periodic reconciliation tolerable; per-keystroke not needed | 10-100 users per doc, sync intervals 1-10s | OT is overkill for periodic sync; CRDT metadata is wasteful when no offline needed
Fuzzy patch application in code review tools | diff-match-patch in CodeMirror, Phabricator, suggested-edit features | Patches must apply against shifted context (lines moved since suggestion was made) | 10s-100s of suggestions per review, source files of 1K-10K lines | This is diff-sync repurposed correctly: as a one-shot merge tool, not a sync engine
Browser-extension form sync across devices | Form-filler extensions, password manager autofill state sync | Low-frequency, last-edit-wins semantics with stale-baseline tolerance | 10s of devices per user, sync intervals minutes, doc sizes <1MB | CRDTs are overkill for "sync this form occasionally"; diff-sync's simplicity wins
Markdown blog post drafts across web + mobile | Indie blogging tools, draft sync in note apps without true real-time collab | Single-user multi-device, "last device wins" with conflict surfacing on collision | 1 user, 2-5 devices, sub-100KB documents | Single-user case doesn't need OT or CRDT; diff-sync ships in 200 lines of code

Event Sourcing


Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative
Financial ledgers with regulatory audit | Banks, fintech (Plaid-style transaction logs), trading systems | Immutable event log is the regulatory artifact; state is derived | Billions of events, multi-year retention, ms-grade replay for audits | CRUD destroys audit trail; CRDT's tombstones violate "no edits to past" regulatory requirements
Inventory and order systems with eventual consistency | Amazon retail, Shopify, large e-commerce platforms | Decoupled aggregates (cart, inventory, payment) with async projections to fulfillment, search, recommendations | 1M+ orders/day, 100s of downstream projections, multi-region replication | Synchronous transactional CRUD across these domains creates a distributed monolith; event-driven decouples them
IoT / telemetry pipelines with replayable analytics | Connected vehicle data, industrial sensor fleets, observability backends | High-write throughput append-only, with downstream consumers tailing different projections | 10M+ events/sec, multi-day retention, 100s of consumer groups | State-only stores lose time-series fidelity; event sourcing IS the time series
Domain-driven aggregates with complex invariants | Insurance policy systems, loan origination, multi-step approval workflows | Each event carries business intent (PolicyAmended, ClaimDenied) for downstream rule engines | Aggregates with 100s of event types, 10-year policy lifetimes | State-only models lose the "why" of each transition; event sourcing preserves it as first-class data
CQRS read-side materialization for varied query shapes | Microservices teams with one write model + many read models | Each consumer builds its own projection optimized for its query patterns | 50+ downstream projections per event stream, varied DBs (Elasticsearch, Postgres, ClickHouse) | Single shared DB schema becomes the choke point; event sourcing + CQRS lets each team own its read shape
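
Read models like these get rebuilt by replaying the stream, so any handler with a side effect has to tolerate seeing the same event twice. Here is a minimal sketch of de-duplication by event ID; the handler class, event shape, and in-memory `_processed` set are assumptions (a real system would persist the processed IDs, typically alongside an outbox).

```python
from typing import Callable, Dict, List, Set


class WelcomeEmailHandler:
    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send
        # Assumption: in production this set lives in durable storage,
        # so a restarted or replayed consumer still remembers it.
        self._processed: Set[str] = set()

    def handle(self, event: Dict[str, str]) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        if event["id"] in self._processed:
            return False
        if event["type"] == "UserRegistered":
            self._send(event["email"])
        self._processed.add(event["id"])
        return True


log = [
    {"id": "evt-1", "type": "UserRegistered", "email": "a@example.com"},
    {"id": "evt-2", "type": "UserRegistered", "email": "b@example.com"},
]

sent: List[str] = []
handler = WelcomeEmailHandler(sent.append)

first_pass = [handler.handle(e) for e in log]
replay = [handler.handle(e) for e in log]   # "replay from event 0"
```

The replay is a no-op for the side effect, while a pure projection (one that only writes its own read model) can simply be rebuilt from scratch without this bookkeeping.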

Three-Way Merge


Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative
Source code version control | Git, Mercurial, Jujutsu, used industry-wide | Branched async workflow with explicit conflict surfacing to humans for semantic review | Linux kernel: 1M+ commits, tens of thousands of contributors, 30M+ lines of code | Auto-merge would silently corrupt code; conflict surfacing IS the safety property
Configuration as code with PR-based reviews | Terraform, Kubernetes manifests, GitOps tooling (ArgoCD, Flux) | Reviewable, revertable infra changes with branch-based environments | 10K+ resources per repo, 100s of engineers, multiple deploy targets | Real-time config editing risks production outage; async merge with review is the safety net
Schema migrations across long-lived branches | Database migration tools (Flyway, Liquibase), Prisma migrations in feature branches | Migrations are ordered changes; merge surfaces ordering conflicts to humans for resolution | 100s of migrations per repo over years, multiple parallel feature branches | Auto-resolve could apply migrations in unsafe orders; human review is mandatory
Wiki / docs with editorial workflow | MediaWiki edit conflicts, ReadTheDocs PR-based docs, Astro Starlight | Editorial review with conflict surfacing; "last save wins" is unacceptable | Wikipedia: millions of articles; peak edit-collision rates are manageable via 3-way merge | Diff-sync silently merges; OT/CRDT are overkill for slow editorial cadence
Forked / re-distributable content pipelines | Open source library forks, dataset versioning (DVC, lakeFS), spec forks | Long-lived divergent branches must rejoin upstream; ancestor-based merge is the workflow | 1000s of forks, multi-year divergence, periodic upstream merges | No real-time component; 3-way merge is the right semantic match for the workflow
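
The ancestor-based rule behind the rows above can be sketched for a flat key-value config. This is a minimal illustration, not Git's line-based algorithm: `three_way_merge` and the sample keys are hypothetical names, and values are compared by plain equality. A key changed on only one side is taken from that side; a key changed differently on both sides is reported instead of being resolved.

```python
from typing import Any, Dict, List, Tuple

_MISSING = object()  # sentinel meaning "key is absent on this side"


def three_way_merge(
    base: Dict[str, Any], ours: Dict[str, Any], theirs: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Merge two descendants of `base`; return (merged, conflicting_keys)."""
    merged: Dict[str, Any] = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:        # both sides agree (same edit, or both deleted)
            result = o
        elif o == b:      # only "theirs" changed this key
            result = t
        elif t == b:      # only "ours" changed this key
            result = o
        else:             # both changed it differently: surface, do not guess
            conflicts.append(key)
            continue
        if result is not _MISSING:
            merged[key] = result
    return merged, conflicts


base = {"replicas": 3, "image": "v1", "debug": False}
ours = {"replicas": 5, "image": "v1", "debug": False}   # bumped replicas
theirs = {"replicas": 3, "image": "v2"}                  # new image, removed debug
print(three_way_merge(base, ours, theirs))
# ({'image': 'v2', 'replicas': 5}, [])
```

Conflicting keys are deliberately left out of the merged result, so a caller cannot ship a silently chosen value.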

3. Limitations

Matrix view: rows are limitation categories, columns are technologies.

Limitation Category | CRDTs | OT | Diff Sync | Event Sourcing | 3-Way Merge
Metadata overhead | HIGH: Position IDs, vector clocks, tombstones. Yjs ~1.5x, Automerge historically 5-10x. | MEDIUM: Op log grows with edits; snapshots required for prod. | MEDIUM: 2x memory per peer for shadows; ephemeral diffs. | HIGH: Storage grows linearly with events; snapshots mandatory. | MEDIUM: History DAG metadata grows; content-addressed storage (CAS) dedupes content.
Cross-key transactions | CRITICAL: No global atomic ops; strong eventual consistency (SEC) only. | MEDIUM: Server can serialize cross-key ops if modeled in op set. | CRITICAL: No transactional concept at all. | HIGH: Within aggregate yes; across aggregates needs sagas. | MEDIUM: Atomic per commit; cross-commit needs application logic.
Schema evolution | MEDIUM: Unknown ops can be forwarded as opaque; type changes are hard. | HIGH: Op set is part of the contract; new ops break old clients. | MEDIUM: Format-agnostic for text; structured changes break merge. | HIGH: Upcasters required forever; chains accumulate. | MEDIUM: Content-only; schema lives outside, must coordinate separately.
Invariant enforcement | CRITICAL: Counters can go negative. No global constraint check. | MEDIUM: Server is single writer per doc; can enforce invariants. | CRITICAL: Last-sync wins; no global invariant. | MEDIUM: Per-aggregate optimistic concurrency enforces invariants. | MEDIUM: Pre-receive hooks / CI gates enforce invariants async.
Real-time presence / awareness | MEDIUM: Separate awareness protocol needed (y-protocols). Not part of CRDT. | MEDIUM: Server can broadcast presence cheaply; common in production. | MEDIUM: Often layered separately via hub messages. | HIGH: Not the right tool; presence is ephemeral state, event sourcing is durable. | CRITICAL: Async by design; presence is meaningless.
Access control granularity | HIGH: P2P CRDTs can't enforce ACLs cryptographically without extra layer. | MEDIUM: Server enforces ACLs per op; standard pattern. | MEDIUM: Hub-mediated, can apply ACLs at the hub. | MEDIUM: Per-command auth in command handler; well-established. | MEDIUM: Repo/branch-level ACLs; standard in Git hosting.
Undo/redo complexity | HIGH: Selective collaborative undo is a research problem. | MEDIUM: Inverse-op transforms work; Google Docs proves it scales. | HIGH: No op model; undo = re-sync to prior shadow. | MEDIUM: Compensating events (CartItemRemoved); natural fit. | MEDIUM: git revert / reset; well-understood.
Server cost at scale | LOW: Stateless relay possible; cheap at scale. | HIGH: Server in critical path for every op; scales as document count × concurrency. | HIGH: O(N) shadows per doc on hub; memory pressure at scale. | MEDIUM: Append throughput is cheap; projection compute scales with read patterns. | MEDIUM: Storage scales with content + history; CAS amortizes.
Garbage collection | CRITICAL: Tombstones accumulate; safe GC needs proof that every replica has seen the deletion (global quiescence). | MEDIUM: Snapshot + truncate; well-understood. | LOW: Stateless on the wire; minimal GC concern. | HIGH: Event compaction breaks audit; regulatory considerations. | MEDIUM: Reflog expiry; periodic gc; standard tooling.
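
The CRDT cell under invariant enforcement ("Counters can go negative") can be reproduced with a textbook state-based PN-counter. This is a minimal sketch; `PNCounter` and `try_decrement` are hypothetical names rather than any library's API. Each replica checks the invariant locally, both checks pass, and the merged value still breaks it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PNCounter:
    replica_id: str
    incs: Dict[str, int] = field(default_factory=dict)  # per-replica increments
    decs: Dict[str, int] = field(default_factory=dict)  # per-replica decrements

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def increment(self, n: int = 1) -> None:
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + n

    def try_decrement(self, n: int = 1) -> bool:
        # Local-only guard: it cannot see concurrent decrements elsewhere.
        if self.value() < n:
            return False
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + n
        return True

    def merge(self, other: "PNCounter") -> None:
        # Join = element-wise max; idempotent, commutative, associative.
        for k, v in other.incs.items():
            self.incs[k] = max(self.incs.get(k, 0), v)
        for k, v in other.decs.items():
            self.decs[k] = max(self.decs.get(k, 0), v)


a, b = PNCounter("a"), PNCounter("b")
a.increment(1)          # one unit of stock
b.merge(a)              # both replicas now see value 1
ok_a = a.try_decrement()  # passes the local check
ok_b = b.try_decrement()  # also passes, concurrently
a.merge(b)
b.merge(a)
print(ok_a, ok_b, a.value(), b.value())  # True True -1 -1
```

Both replicas converge, which is all the CRDT promises; keeping the value non-negative needs coordination (or a bounded-counter design) outside the merge function.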

4. Fault Tolerance

How each technology behaves when networks partition, nodes crash, or replicas drift. Rows are dimensions, columns are technologies.

Dimension | CRDTs | OT | Diff Sync | Event Sourcing | 3-Way Merge
Replication model | Multi-master, leaderless. Every replica equal. | Star topology with single authoritative server; clients are caches. | Hub-and-spoke. Clients sync against hub or each other pairwise. | Log-replicated (Kafka/EventStore). Single-writer per partition; many readers. | Distributed VCS; full replicas (Git) or central with checkouts (SVN).
Failure detection | App-layer; CRDT itself has no notion. Awareness protocols use heartbeats. | Server health checks + client reconnect. Server is the SPOF that gets monitored. | Hub heartbeat; client detects via missed sync intervals. | Log infrastructure handles it (Kafka ISR, EventStore cluster heartbeats). | N/A on the algorithm; remote unavailability detected by transport.
Failover mechanism | No failover needed; any peer can serve. Reconcile state on rejoin. | Hot-standby server takes over op log; clients reconnect to new endpoint. | Hub failover; clients drop shadow and re-sync against new hub. | Log replicas elect new leader (Raft / KRaft); reads continue from followers. | Repo mirroring; switch remote URL. No automatic failover concept.
RTO (typical) | Effectively 0; offline-capable. Reconciliation is async. | 10-60 s for server cutover; longer if op log recovery needed. | 10-60 s for hub recovery; longer for re-shadow on all clients. | Sub-30 s for log infra; minutes-to-hours to rebuild projections from log. | Manual; depends on mirror setup. Often minutes-to-hours.
RPO (typical) | Zero for any persisted replica; ops in transit may be lost without app-level guarantees. | Zero if op log is synchronously persisted; seconds if async. | Seconds to minutes; depends on sync interval. | Zero for committed events (sync replication); typically configurable. | Zero for pushed commits; unpushed work is local-only loss.
Split-brain behavior | Both sides accept writes; converge on merge. This is by design. | Single-server model prevents split-brain; can lose availability instead. | Both sides drift; on rejoin, "last patch wins per region" risks corruption. | Quorum prevents split-brain in log; minority partition rejects writes. | Both branches accept commits; merge surfaces conflicts to humans.
Blast radius of single-node failure | Minimal; one peer offline doesn't affect others. New peer rebuilds from any peer. | Entire document unavailable if no failover; single doc per server typical. | All clients of that hub lose sync until failover. | Single broker fails: partition leadership moves to an in-sync replica (ISR); brief read/write degradation. | A remote going down blocks push/pull but local work continues.
Cross-region failover story | Native; multi-region multi-master is the use case. No coordination cost. | Hard; requires global op ordering or per-region docs. Google Docs is largely single-region per doc. | Per-region hubs with periodic cross-region sync; convergence is best-effort. | MirrorMaker / Kafka federation; lag-tolerant DR; canonical pattern. | Mirror clones; manual promotion. Operationally simple.
Data loss scenarios | Ops lost in transit without app-level resend = silent divergence. State vectors detect on rejoin. | Server crash before op log fsync = lost ops; client retry mitigates. | Patches lost = silent divergence. Resync on next cycle, but conflicts may slip through. | Events committed = durable. Producer crash between commit phases = duplicate (idempotency handles). | Unpushed commits on dead laptop = total loss. A force-push overwrite is recoverable from any reflog that still holds the old commits (unreachable entries expire after 30 days by default).
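
The event-sourcing cell above relies on idempotency to absorb producer retries. A minimal in-memory sketch of that idea follows, assuming an append API that takes an expected stream version and a caller-supplied idempotency key. `InMemoryEventStore` and `ConcurrencyError` are hypothetical names, not the API of Kafka or EventStore.

```python
from typing import Dict, List, Set, Tuple


class ConcurrencyError(Exception):
    """Raised when the writer's view of the stream is stale."""


class InMemoryEventStore:
    def __init__(self) -> None:
        self._streams: Dict[str, List[dict]] = {}
        self._seen: Set[Tuple[str, str]] = set()  # (stream_id, idempotency_key)

    def append(
        self, stream_id: str, event: dict, expected_version: int, idempotency_key: str
    ) -> int:
        """Append one event; return the stream version after the call."""
        events = self._streams.setdefault(stream_id, [])
        if (stream_id, idempotency_key) in self._seen:
            return len(events)  # retry of an already-committed write: no-op
        if len(events) != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected v{expected_version}, found v{len(events)}"
            )
        events.append(event)
        self._seen.add((stream_id, idempotency_key))
        return len(events)

    def read(self, stream_id: str) -> List[dict]:
        return list(self._streams.get(stream_id, []))


store = InMemoryEventStore()
store.append("order-1", {"type": "OrderPlaced"}, expected_version=0, idempotency_key="k1")
# The producer crashed before seeing the ack and retries with the same key:
store.append("order-1", {"type": "OrderPlaced"}, expected_version=0, idempotency_key="k1")
print(len(store.read("order-1")))  # 1
```

A different writer that still believes the stream is at version 0 gets a `ConcurrencyError` and must reload and retry, which is the optimistic-concurrency behavior the table assumes per aggregate.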

5. Sharding (Scaling the Collaborative State)

These algorithms aren't databases, so "sharding" here means how the collaborative state is partitioned across servers, peers, or storage to scale beyond a single node.

Dimension | CRDTs | OT | Diff Sync | Event Sourcing | 3-Way Merge
Sharding model | Document-level partitioning. Yjs sub-documents split one doc across CRDT instances. | Document-per-server-instance; consistent hash on doc ID across server fleet. | Document-per-hub; route by doc ID. | Partition by aggregate ID (consistent hash); strict per-partition ordering. | Repository as the shard; monorepo vs polyrepo is the partitioning decision.
Shard key constraints | Sub-doc boundaries must align with merge boundaries; cross-subdoc ops are app-level. | All ops to one doc must hit the same server; sticky sessions required. | All peers of a doc must hit same hub; sticky routing. | Aggregate ID drives partition; cross-aggregate ordering not preserved. | Boundary = repo. Cross-repo dependency tracking is a separate problem (Bazel, build systems).
Rebalancing mechanism | Sub-doc migration is data-plane; clients re-attach to new endpoint. | Move doc to new server: drain ops, transfer state, re-route clients. Seconds of unavailability. | Drain hub, transfer shadows or force re-shadow on clients. | Kafka partition reassignment; transparent to producers, brief lag for consumers. | Repo splits via git filter-repo (successor to filter-branch) / repo surgery. Major undertaking.
Rebalancing cost / impact | Low if subdocs are independent; high if cross-subdoc refs exist. | Brief read-only window; client reconnect storm on cutover. | Shadows must be re-established; bandwidth spike on transition. | Catch-up read I/O on new owner; producer lag during reassignment. | High; repo splits change history, can break downstream automation.
Hot-shard behavior | A viral doc still bottlenecks on the server hosting its websocket relay; subdivide via sub-docs. | Hot doc maxes out a single server; "shard the doc" via OT sub-docs (rare, complex). | Hot doc maxes out hub; shadow count × doc size = OOM risk. | Hot aggregate becomes hot partition; split aggregate (per-SKU, per-warehouse). | Hot repo (monorepo) hits Git's scaling limits; partial clone / sparse checkout required.
Maximum shards (practical) | Tens of thousands of sub-docs per session, limited by client memory. | Server fleet size; sticky routing limits concurrency. | Hub count; small-medium scale (10s of hubs). | Kafka: roughly 200K partitions per cluster as the ZooKeeper-era practical ceiling; KRaft raises it. | Effectively unlimited repos; monorepo size capped by tooling (LFS, Scalar).
Resharding without downtime? | Partial; sub-doc split is online but cross-references break briefly. | No; doc migration requires brief op suspension. | No; shadow re-establishment requires sync pause. | Yes; Kafka reassignment is online. KRaft makes it smoother. | Repo split is offline-style; large monorepos take hours.
Cross-shard query support | N/A; no query layer. App-layer joins across sub-docs. | N/A; OT is per-doc only. | N/A; diff sync is per-doc only. | Cross-aggregate via projections; no native cross-partition transactions. | Cross-repo deps via submodules (painful) or build systems (Bazel, Buck).
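
The "consistent hash on doc ID" routing in the sharding-model row can be sketched as a hash ring with virtual nodes. This is an illustrative sketch: `HashRing`, the server names, and the choice of 64 virtual nodes per server are assumptions, not taken from any of the systems above.

```python
import bisect
import hashlib
from typing import List


class HashRing:
    """Maps doc IDs to servers; every op for a doc lands on the same server."""

    def __init__(self, servers: List[str], vnodes: int = 64) -> None:
        # Each server owns `vnodes` points on the ring to smooth the distribution.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

    def server_for(self, doc_id: str) -> str:
        # First ring point clockwise from the doc's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, self._hash(doc_id)) % len(self._points)
        return self._ring[idx][1]


before = HashRing(["s1", "s2", "s3"])
after = HashRing(["s1", "s2", "s3", "s4"])
docs = [f"doc-{i}" for i in range(1000)]
moved = [d for d in docs if before.server_for(d) != after.server_for(d)]
# Every doc that changed owner moved to the new server; the rest stayed put.
print(all(after.server_for(d) == "s4" for d in moved))  # True
```

That last property is why adding a server only triggers drain-and-transfer for the docs the new server takes over, rather than a fleet-wide reshuffle.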

6. Replication

How replicas synchronize state. This is the core algorithmic question for these technologies.

Dimension | CRDTs | OT | Diff Sync | Event Sourcing | 3-Way Merge
Replication topology | Leaderless multi-master. Any peer can write. | Single-leader (server). Followers (clients) propose, leader serializes. | Hub-and-spoke or symmetric pair. No global leader. | Single-writer per partition; many readers (CQRS). | Distributed; every clone is a replica. Push/pull driven.
Sync vs async | Async by design; ops propagate via gossip / pub-sub. | Optimistic local apply (sync UX); server reconciliation is async. | Async, periodic. Sync intervals 0.5-5 s typical. | Configurable: sync replication for durability, async for latency. | Manual; user-driven push/pull.
Replication factor (default / max) | N peers, no fixed count; any subset of peers carry state. | Usually 1 (server) + clients; HA pairs the server. | 1 hub + N peers; HA via hub pairs. | 3-5 replicas typical for Kafka; configurable per topic. | Unbounded; every clone is a replica.
Consistency level options | Strong eventual consistency. No linearizability. | Linearizable per doc via server serialization. | Best-effort eventual; not guaranteed convergence under adversarial timing. | Strong per-partition; eventual across partitions. | Per-commit linearizable on a branch; cross-branch is human-resolved.
Replication lag (typical) | Sub-100 ms LAN, 100-500 ms WAN. Bounded by network only. | RTT + server processing; 50-200 ms typical. | One sync interval; 0.5-5 s. | Single-digit ms within cluster; seconds cross-region. | Human-driven; minutes to days between pushes.
Conflict resolution | Built into the data type (LWW-register, OR-set, RGA, Fugue, YATA). Deterministic. | Transform functions per op-pair; produces single linear history. | Fuzzy patch application; "best effort". No formal guarantee. | Optimistic concurrency on expected version; retry on conflict. | Surface to human via conflict markers; explicit resolution required.
Cross-region replication | Native; multi-region multi-master is the design point. | Hard; either per-region docs or global op log (latency expensive). | Per-region hubs with cross-region peer sync; converges over rounds. | MirrorMaker 2 / Cluster Linking; canonical pattern for Kafka. | Mirror remotes; lazy sync. Operationally trivial.
Replication during partition | Both sides accept; converge on heal. AP in CAP terms. | Clients cut off from the server cannot commit ops (they can only queue them locally); CP-leaning. | Both sides accept locally; conflicts on rejoin. | Minority side rejects (quorum); majority continues. | Both sides commit locally; merge on rejoin.
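
"Built into the data type" in the conflict-resolution row can be made concrete with the simplest case, a last-writer-wins register. This is a minimal sketch; `LWWRegister` and its fields are hypothetical names, and it assumes the (timestamp, replica_id) pair is unique per write.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LWWRegister:
    value: Any
    timestamp: int    # logical or wall-clock time of the write
    replica_id: str   # deterministic tie-breaker for equal timestamps

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # The write with the greater (timestamp, replica_id) wins on every replica.
        return max(self, other, key=lambda r: (r.timestamp, r.replica_id))


a = LWWRegister("red", timestamp=10, replica_id="a")
b = LWWRegister("blue", timestamp=10, replica_id="b")
c = LWWRegister("green", timestamp=7, replica_id="c")

print(a.merge(b) == b.merge(a))                    # True: commutative
print(a.merge(a) == a)                             # True: idempotent
print(a.merge(b).merge(c) == a.merge(b.merge(c)))  # True: associative
print(a.merge(b).value)                            # blue
```

Those three properties are what let replicas merge in any order and any number of times. The cost is visible in the last line: the concurrent "red" write is discarded without anyone being told.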

7. Better Usage Patterns

Patterns most teams miss. Anti-patterns surface in code review. Optimizations compound at scale.

CRDTs

Use when offline edits and multi-master convergence matter more than strict global invariants or minimal metadata.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Sub-documents for large state | Build one giant CRDT doc that holds the entire app state; OOM on mobile after 6 months. | Decompose into Yjs sub-docs (per page, per pane, per panel) and lazy-load on demand. | Doc lifetime is years. Without sub-docs, every user pays the full history cost on every device.
Awareness as a separate protocol | Stuff presence, cursors, selections into the CRDT itself; explode the op log with ephemeral churn. | Use the y-protocols awareness module or an equivalent ephemeral channel; CRDT is for durable state only. | Cursor movements can fire at 60 Hz; storing them as CRDT ops bloats the doc by orders of magnitude and breaks sync perf.
State vector exchange before op replay | Replay every op since the beginning of time on every reconnect. | Exchange state vectors first; transmit only the delta. | This is the difference between "reconnect in 100ms" and "reconnect in 30s" on long-lived docs.
Treat undo/redo as user-scoped | Build collaborative undo with shared undo stack; users undo each other's edits. | Per-user undo manager (Yjs UndoManager scoped via trackedOrigins) that only undoes ops attributed to that user. | The single most-complained-about UX defect in homegrown CRDT editors; Yjs handles it correctly out of the box.
Schema for the CRDT, not for the app | Use CRDT primitives directly in UI code; refactoring the schema breaks every component. | Build a thin domain layer over the CRDT; UI reads from observable views, not raw Y.Map. | Decoupling lets you migrate from Yjs to Loro (or vice versa) without rewriting the UI.
Persist op log, not just snapshots | Snapshot the CRDT every N seconds and discard the op log; lose offline-edit reconciliation. | Persist both: op log for reconciliation, snapshots for fast load. | A new client joining after a week needs the ops, not the snapshot; snapshot is a load optimization, not a sync mechanism.
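
The "exchange state vectors first; transmit only the delta" row can be sketched as follows. This is an illustration of the idea, not the Yjs wire protocol: `Op`, `state_vector`, and `missing_ops` are hypothetical names, and it assumes each replica numbers its own ops contiguously from 1.

```python
from typing import Dict, List, NamedTuple


class Op(NamedTuple):
    replica: str   # which replica created the op
    seq: int       # that replica's own 1-based counter
    payload: str


def state_vector(ops: List[Op]) -> Dict[str, int]:
    """Highest sequence number seen from each replica."""
    sv: Dict[str, int] = {}
    for op in ops:
        sv[op.replica] = max(sv.get(op.replica, 0), op.seq)
    return sv


def missing_ops(local_ops: List[Op], remote_sv: Dict[str, int]) -> List[Op]:
    """Ops the remote side has not seen, judged by its state vector."""
    return [op for op in local_ops if op.seq > remote_sv.get(op.replica, 0)]


server_ops = [Op("a", 1, "x"), Op("a", 2, "y"), Op("b", 1, "z")]
client_ops = [Op("a", 1, "x")]

# The reconnecting client sends its vector; the server replies with the delta only.
delta = missing_ops(server_ops, state_vector(client_ops))
print(delta)
# [Op(replica='a', seq=2, payload='y'), Op(replica='b', seq=1, payload='z')]
```

The vector is one integer per replica, so the handshake stays small even when the op log is years long.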

Operational Transformation (OT)

Best when a central server can serialize document operations and preserving editor intent beats offline-first autonomy.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Server as op orderer, not state holder | Server holds full document state and broadcasts diffs; becomes the choke point for memory. | Server holds the op log + lightweight checkpoint; clients reconstruct view. | Op-log servers scale to 10x more docs per box than state-holding servers.
Explicit op versioning from day one | Add new op types without version bumps; old clients crash on new ops. | Embed op version in the op envelope; reject incompatible versions with explicit "upgrade required". | Mobile clients pinned to old versions for months are a fact of life; design for it or lose them silently.
Snapshot + truncate cadence | Let the op log grow forever; doc load time hits seconds, then tens of seconds. | Periodic snapshot to S3/blob + truncate. Snapshot interval tied to op count, not wall time. | Multi-million-op docs occur at scale (Google Docs). Without snapshot policy, you lose load-time SLOs.
Per-doc backpressure | Let one viral doc with 1000 collaborators DDoS the server with op storms. | Rate-limit ops per client per doc; batch sub-100ms ops; throttle to server capacity. | Without this, a single doc can take down the entire server fleet via global queue saturation.
TP2 testing via fuzz harness | Trust the textbook proofs; ship; hit TP2 violations in prod with rare op sequences. | Property-based test (Hypothesis, fast-check): generate random op sequences, verify convergence on all permutations. | Several published OT algorithms were later shown to violate TP2 despite accompanying proofs. Property tests surface such counterexamples in CI.
Optimistic UI with reconciliation queue | Apply local op, wait for ack, then render; UX is laggy on high-RTT connections. | Apply local op immediately; queue pending ops; on ack, reconcile any server-side transforms. | Google Docs feels native because of this; absence of this pattern makes OT systems feel laggy by default.
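
The fuzz-harness row can be sketched with the standard library alone. The block below defines single-character insert/delete transforms and checks TP1 (two concurrent ops converge in either order), which is the property a server-serialized design relies on. It is a sketch under stated assumptions: `Op`, `transform`, and `count_tp1_failures` are hypothetical names, seeded `random` stands in for Hypothesis or fast-check, and TP2 is not checked here (that would need three concurrent ops).

```python
import random
from typing import NamedTuple, Optional


class Op(NamedTuple):
    kind: str  # "ins" or "del"
    pos: int
    ch: str    # inserted character; unused for deletes
    site: int  # replica id, breaks ties between same-position inserts


def apply_op(doc: str, op: Optional[Op]) -> str:
    if op is None:  # op was cancelled by the transform
        return doc
    if op.kind == "ins":
        return doc[:op.pos] + op.ch + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(a: Op, b: Op) -> Optional[Op]:
    """Rewrite `a` so it applies correctly after concurrent `b` has been applied."""
    if b.kind == "ins":
        if a.kind == "ins":
            shift = a.pos > b.pos or (a.pos == b.pos and a.site > b.site)
        else:
            shift = a.pos >= b.pos
        return a._replace(pos=a.pos + 1) if shift else a
    if a.kind == "del" and a.pos == b.pos:
        return None  # both deleted the same character; nothing left to do
    return a._replace(pos=a.pos - 1) if a.pos > b.pos else a


def random_op(doc: str, site: int, rng: random.Random) -> Op:
    if doc and rng.random() < 0.5:
        return Op("del", rng.randrange(len(doc)), "", site)
    return Op("ins", rng.randint(0, len(doc)), rng.choice("xyz"), site)


def count_tp1_failures(trials: int = 2000, seed: int = 0) -> int:
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        doc = "".join(rng.choice("abc") for _ in range(rng.randint(0, 6)))
        a, b = random_op(doc, 1, rng), random_op(doc, 2, rng)
        left = apply_op(apply_op(doc, a), transform(b, a))
        right = apply_op(apply_op(doc, b), transform(a, b))
        failures += left != right
    return failures


print(count_tp1_failures())  # 0
```

Breaking the tie-break rule (for example, shifting on `a.site >= b.site`) makes the count non-zero, which is a quick way to confirm the harness actually detects divergence.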

Differential Sync

Choose when sync intervals and fuzzy patches are acceptable, and repairing drift is simpler than modeling every operation.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Use diff-sync only for opaque text | Apply diff-sync to structured JSON; structural changes corrupt the data on merge. | Use diff-sync only on truly opaque text streams; use CRDT/OT for structured data. | This is the most common diff-sync mistake; it's a category error that corrupts production data.
Bound sync intervals to UX cadence | Try to make diff-sync feel real-time with 100ms intervals; saturate bandwidth and hub. | Use 1-5s intervals; pair with optimistic local UI so it feels instant. | Diff-sync at 100ms intervals defeats its own simplicity advantage; you're paying for OT-like overhead without OT's guarantees.
Shadow eviction policy | Keep shadows for every peer forever; hub memory grows unboundedly. | LRU evict shadows on inactivity; force full re-sync on reconnect of long-idle peers. | Without eviction, hub OOMs on long-tail user base.
Use diff-match-patch for client-side merge only | Build a sync engine on diff-match-patch and run it for real-time collab; corruption ensues. | Use diff-match-patch as a one-shot merge tool when integrating user-submitted patches with shifted context. | diff-match-patch is brilliant at its real job (CodeMirror suggestion application); use it there, not as a sync engine.
Detect divergence with checksums | Trust that sync converges; users report "my doc looks different on my phone" with no detection. | Exchange Merkle / rolling checksums of shadow + live; force full re-sync on mismatch. | Diff-sync silently diverges; without checksums, you find out from support tickets.
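
The checksum row can be sketched as a guard on each sync packet: the sender stamps the packet with a hash of the shadow its diff was computed against, and the receiver refuses to patch if its own shadow hashes differently. This is a minimal sketch; `SyncPacket`, `make_packet`, and `must_resync` are hypothetical names, and a plain SHA-256 over the whole shadow stands in for the Merkle or rolling checksums mentioned above.

```python
import hashlib
from dataclasses import dataclass


def checksum(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class SyncPacket:
    base_checksum: str  # hash of the sender's shadow the patch was diffed against
    patch: str          # opaque patch payload (format not modeled here)


def make_packet(sender_shadow: str, patch: str) -> SyncPacket:
    return SyncPacket(base_checksum=checksum(sender_shadow), patch=patch)


def must_resync(packet: SyncPacket, receiver_shadow: str) -> bool:
    """True when the two shadows have drifted and fuzzy patching would be a guess."""
    return packet.base_checksum != checksum(receiver_shadow)


packet = make_packet("hello world", patch="<diff>")
print(must_resync(packet, "hello world"))   # False: shadows agree, apply the patch
print(must_resync(packet, "hello w0rld"))   # True: drop shadows, do a full re-sync
```

The check turns silent divergence into an explicit re-sync event that can be counted and alerted on.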

Event Sourcing

A strong fit for audit/replay/time-travel domains where writes are immutable facts and projections own conflict behavior.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Outbox pattern for side effects | Send emails / call APIs directly from event handlers; re-runs duplicate them. | Write side-effect intent to an outbox table in the same transaction; a separate worker processes it idempotently. | This is the single most important pattern in event sourcing. Without it, every re-run risks customer-facing duplicate actions.
Snapshot strategy from day one | Defer snapshots until aggregates have 10K+ events and load is broken; emergency retrofitting. | Snapshot every N events (commonly 100-500); store with strict version compatibility. | Retrofitting snapshots into a running system requires coordinated migration; design it in upfront.
Public vs internal event split | Publish internal events to external consumers via CDC; renames break consumers. | Translate internal events into a stable public event contract; never expose internal aggregates directly. | Refactoring your domain model becomes a public API break otherwise. Anti-corruption layer is mandatory.
Idempotency keys on commands | Re-submission of a command (network retry, replay) processes it twice. | Every command carries a unique idempotency key; command handler de-dupes before applying. | Without this, "submit order" duplicated under a network blip becomes a duplicate charge; outbox + idempotency together are the safe pattern.
Versioned event schemas + upcasters | Modify old events in place to fit new schema; break audit and replay. | Old events stay immutable; upcasters transform v1→v2→v3 on read. | Events are forever. Any in-place mutation violates the core promise of event sourcing.
Aggregate sizing for contention | Model "Order" as a single aggregate including all items; Black Friday writes contend on one stream. | Split into OrderHeader + OrderItem aggregates; per-item ops don't contend with header ops. | This is the most common event-sourcing scaling failure; revisit aggregate boundaries early when you see retry storms.
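
The "upcasters transform v1→v2→v3 on read" row can be sketched as a chain of pure functions keyed by source version. The event shape, field names, and the two schema changes below are hypothetical, chosen only to illustrate the mechanism; stored events are never modified.

```python
from typing import Callable, Dict

LATEST_VERSION = 3


def _v1_to_v2(event: dict) -> dict:
    # Hypothetical change: v2 added an explicit currency field.
    return {**event, "currency": "USD", "version": 2}


def _v2_to_v3(event: dict) -> dict:
    # Hypothetical change: v3 stores integer cents instead of a float amount.
    upgraded = {k: v for k, v in event.items() if k != "amount"}
    upgraded["amount_cents"] = round(event["amount"] * 100)
    upgraded["version"] = 3
    return upgraded


UPCASTERS: Dict[int, Callable[[dict], dict]] = {1: _v1_to_v2, 2: _v2_to_v3}


def upcast(event: dict) -> dict:
    """Return the event in the latest schema; the stored event is left untouched."""
    current = dict(event)
    while current["version"] < LATEST_VERSION:
        current = UPCASTERS[current["version"]](current)
    return current


stored = {"type": "PaymentReceived", "version": 1, "amount": 12.5}
print(upcast(stored))
# {'type': 'PaymentReceived', 'version': 3, 'currency': 'USD', 'amount_cents': 1250}
print(stored)
# {'type': 'PaymentReceived', 'version': 1, 'amount': 12.5}
```

Each new schema version adds one function to the chain, which is the "chains accumulate" cost noted in the limitations matrix.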

Three-Way Merge

Default for async branching when surfacing conflicts to humans is safer than silently inventing an automatic merge.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Trunk-based development | Long-lived feature branches; merge becomes a multi-day archaeology project. | Branches live hours-to-days; merge frequently into trunk; feature flags hide incomplete work. | 3-way merge works best when used least; the right answer is to minimize divergence.
Conflict resolution via review, not gut | Engineer resolves merge conflict solo; ships subtle semantic bug. | Conflicts require a second reviewer; treat merge resolution as a reviewable change. | Merge resolutions are among the highest-bug-density changes in a codebase. Treat them with corresponding rigor.
Pre-commit auto-formatting | Whitespace and import-order conflicts dominate merge conflicts; signal-to-noise destroyed. | Enforce formatter on every commit; conflicts then arise only on semantic changes. | Without auto-format, a large share of merge conflicts are spurious; signal is lost in noise.
Rerere for repeated conflicts | Resolve the same conflict on a rebase 10 times during a long-lived branch. | Enable git rerere; resolutions are cached and replayed automatically. | Underused Git feature that saves hours on rebase-heavy workflows.
Squash vs merge policy by branch type | One global policy; lose history granularity or pollute trunk with WIP commits. | Squash feature branches into trunk; merge commits for release branches to preserve history. | Distinguishing "workflow commits" from "history commits" matters at scale; treat squash as a UX choice.

8. Advanced / Next-Gen Alternatives

Emerging successors, adjacent technology that does it better for specific cases, and patterns that obviate the original need.

CRDTs

Use when offline edits and multi-master convergence matter more than strict global invariants or minimal metadata.

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Loro (Fugue text CRDT) | Minimizes the interleaving anomalies of RGA-style concurrent text inserts; better perf than Yjs on some benchmarks. | EMERGING | High; different API surface, encoding format incompatible with Yjs. | When you're starting fresh and want best-in-class text + tree CRDT, and are willing to bet on a young ecosystem.
Automerge 2 (Rust core) | 5-100x perf improvement over Automerge 1.x; closes the gap with Yjs on size and speed. | PRODUCTION | Medium; v1→v2 has migration path but data format changes. | When you need JSON-document semantics (vs text-first) and were burned by Automerge 1.x perf.
ElectricSQL / Replicache (sync layers over a server database) | Provides offline-first sync semantics with SQL queryability on the backend; merge logic under the hood (CRDT-based in early ElectricSQL, server-reconciled mutations in Replicache), relational on top. | EMERGING | Low if greenfield; high if migrating from custom CRDT. | When you want CRDT-like sync benefits but also need to query the state with SQL (analytics, search).
Local-first sync engines (Liveblocks, Triplit, Jazz) | Managed sync services; hosted relays, presence, persistence, auth. | EMERGING | Medium; integration work but no algorithm-level migration. | When you don't want to run Yjs y-websocket servers and need presence/auth out of the box.

Operational Transformation (OT)

Best when a central server can serialize document operations and preserving editor intent beats offline-first autonomy.

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
CRDT (server-mediated) | Drops the TP2 problem; arbitrary concurrency converges by construction. | PRODUCTION | High; op model is fundamentally different. | When your OT system hits TP2 bugs in prod or you want to add offline support.
ShareDB / Json0 evolution | Modern OT-for-JSON with active maintenance; addresses most "OT for non-text" pain. | PRODUCTION | Low if already on OT; medium from scratch. | When you need OT specifically (server-authoritative, tight intent) but for structured docs.
Hybrid OT + CRDT | Use OT for character-level text inside fields, CRDT-style LWW for properties (Figma pattern). | PRODUCTION | High; requires re-architecting around dual conflict model. | When your domain has both rich text (needs OT) and structured objects (suits LWW); design-tool category.

Differential Sync

Choose when sync intervals and fuzzy patches are acceptable, and repairing drift is simpler than modeling every operation.

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Yjs / Automerge (CRDT replacement) | Adds formal convergence guarantees; offline support; structural awareness. | PRODUCTION | Medium; replace sync layer, keep UI; persistence schema changes. | When silent divergence is unacceptable and you need real concurrency guarantees.
ShareDB (OT replacement) | Tight intent preservation, no fuzzy patch matching; server-mediated. | PRODUCTION | High; op-based model is fundamentally different. | When your domain is text-heavy and silent merges produce visible bugs.
Operational sync engines (Replicache, Liveblocks) | Provides correctness guarantees with diff-sync-like simplicity. | EMERGING | Medium; framework adoption. | When the diff-sync simplicity was the appeal but the correctness wasn't sufficient.

Event Sourcing

A strong fit for audit/replay/time-travel domains where writes are immutable facts and projections own conflict behavior.

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
CQRS without event sourcing | Separate read/write models without the immutable-log commitment; lower operational burden. | PRODUCTION | Low conceptually; high if already invested in event store. | When you need read-model flexibility but not audit/replay; common middle ground.
Change Data Capture (Debezium, AWS DMS) | Get event-log benefits from a traditional database without app-level event modeling. | PRODUCTION | Low; bolt onto existing CRUD systems. | When you want downstream stream integration without the design overhead of event sourcing; pragmatic choice for legacy.
Streaming databases (Materialize, RisingWave) | Built-in materialized views over event streams; projections are SQL, not application code. | EMERGING | Medium; integration work, paradigm shift for query teams. | When your projections are many and varied; streaming SQL beats N hand-coded projections.
Temporal / Restate / DBOS (durable execution) | Event sourcing for workflow state without the bespoke aggregate plumbing. | PRODUCTION | Medium; rethink workflow as functions. | When the event-sourced subsystem is mostly workflow / saga orchestration; let the framework own the log.

Three-Way Merge

Default for async branching when surfacing conflicts to humans is safer than silently inventing an automatic merge.

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Jujutsu (jj) | Cleaner conflict model; conflicts as first-class values that can be operated on; better rebase UX. | EMERGING | Low; jj is Git-compatible; coexists with existing repo. | When team UX with Git rebase is causing actual productivity loss; jj is the modern alternative.
Pijul (patch theory) | Commutative patches; merges are mathematically associative; no merge conflicts from reordering. | RESEARCH | High; entire paradigm shift, tooling immature. | Watch but don't bet. The math is beautiful; ecosystem is too small for production use today.
Sapling (Meta) | Monorepo scale; partial clone and lazy fetch by default; Phabricator-style stacked diffs. | PRODUCTION | Medium; Git-compatible mode exists; native mode requires migration. | When Git scale at monorepo level is the bottleneck (10M+ files, 10TB+ history).
Structural / AST-aware merge (mergiraf, GumTree) | Reduces spurious conflicts from whitespace/formatting; merges by syntax tree, not text lines. | EMERGING | Low; bolt onto existing Git via mergetool. | When team productivity is bleeding to formatter-vs-logic merge conflicts; high ROI bolt-on.
CRDT-based VCS (DVCS with no conflicts) | Eliminates conflict resolution entirely; trades human intent surfacing for auto-merge. | RESEARCH | High; paradigm shift. | When the conflict-surfacing property is genuinely not needed (rare for code); consider for prose / config.

PE-level trade-off analysis. Compiled from production reports, post-mortems, and primary research (YATA, Fugue, Jupiter OT, diff-match-patch, Greg Young on Event Sourcing). As of 2026-06-01.

Quality bar: every cell answers an L7 review. If a cell reads as marketing, it's a bug.