Network & API Protocols: PE Trade-Off Analysis

Twelve protocols across four layers. The key insight before any table: these do not all compete. TCP and gRPC are not alternatives, they stack. Compare within a layer, choose across layers.

CATEGORY SWEEP / LAYERED

As of 2026-05-22

PE Verdict

Default to TCP + TLS 1.3 + HTTP/2 as the substrate, REST for public APIs, gRPC for internal service-to-service, and SSE for server push. Reach past these only when you can name the inflection point: REST to GraphQL when client-driven over-fetching dominates your latency budget, SSE to WebSocket when the client must push mid-stream, HTTP/2 to HTTP/3 when lossy mobile networks make TCP head-of-line blocking your tail-latency villain, and WebSocket to WebTransport only when you need unreliable datagrams and control UDP egress end to end.

Layer framing: Transport (TCP, UDP) carries bytes. Security (TLS) wraps the transport. App-transport (HTTP/1.1, HTTP/2, HTTPS) frames bytes into messages. API-style (REST, gRPC, GraphQL, WebSocket, SSE, WebRTC) models the actual contract. Matrix tables below are grouped by layer so no comparison forces non-peers into the same columns.

The Layer Map

Read this first. Most protocol confusion comes from comparing across layers. "REST vs TCP" is a category error.

API STYLE / SEMANTICS: REST, gRPC, GraphQL, WebSocket, SSE, WebRTC ← where your real design choices live

APP TRANSPORT: HTTP/1.1, HTTP/2, HTTPS (+ HTTP/3) ← frames bytes into requests/messages

SECURITY: TLS ← encrypts and authenticates whatever rides on it. HTTPS = HTTP + TLS

TRANSPORT: TCP, UDP ← moves bytes host to host

gRPC runs over HTTP/2. GraphQL and REST usually run over HTTP. HTTP/3 runs over QUIC, which runs over UDP. WebRTC and WebTransport ride UDP. Everything reliable ultimately rides TCP unless it deliberately opts into UDP.

Best default choices

REST: Public APIs, CRUD, cacheable reads, broad clients
gRPC: Typed internal service calls and streaming
SSE: One-way browser streams and LLM output
WebSocket: Bidirectional realtime sessions
HTTP/2 + TLS: Modern default substrate for API traffic

1. Trade-Offs

One table per protocol. A trade-off is giving up X to get Y. Grouped by layer.

Transport: TCP & UDP

TCP

Use as the reliable byte-stream substrate when ordered delivery, congestion control, and broad middlebox compatibility matter more than loss-tolerant latency.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
In-order reliable delivery	Bytes arrive intact and ordered, no app-level reassembly	Latency floor from retransmit + ack cycles	Lossy mobile/satellite links where one drop stalls everything queued behind it	This is head-of-line blocking. It is the main reason HTTP/3 abandoned TCP.
Connection-oriented	Stateful flow control and congestion control for free	3-way handshake adds a full RTT before first byte	Short-lived connections at high churn, handshake cost dominates payload	Connection reuse (keep-alive) amortizes this. Cold connections to far regions hurt most.
Congestion control built in	Fair sharing, backs off under loss automatically	Throughput collapses on the classic loss-equals-congestion assumption	Wireless loss that is not congestion still triggers backoff, tanking throughput	BBR vs CUBIC matters here. Wireless loss fooling CUBIC is a real on-call latency mystery.
Kernel-implemented	Battle-tested, ubiquitous, hardware-offloaded	Protocol changes need OS upgrades, not app deploys	You want a transport fix but are pinned to the fleet's kernel version	QUIC moved to user space precisely to escape this upgrade-cycle trap.
Byte stream abstraction	Simple mental model, no message framing to manage	No message boundaries, app must frame	You assume one send equals one recv, then two messages coalesce	Every junior bug of "my message got split" is this. TCP is a pipe, not a packet.
Single ordered stream	Trivial ordering guarantees	Cannot multiplex independent streams without HoL coupling	HTTP/2 multiplexes logically but one lost packet blocks all streams	The reason HTTP/2's multiplexing promise breaks under packet loss.
Mature middlebox support	Firewalls, NATs, LBs all understand TCP	Ossification, middleboxes mangle anything non-standard	Rolling a custom transport, middleboxes silently break it	Protocol ossification is why QUIC encrypts almost everything, to hide from meddling middleboxes.
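
The byte-stream row above is easiest to see in code. A minimal sketch, assuming a 4-byte big-endian length prefix as the framing scheme (an illustrative choice, not something this analysis prescribes); `encode_frame` and `FrameDecoder` are hypothetical names.

```python
import struct

# Assumed framing: each message is preceded by a 4-byte big-endian length.
HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Reassembles whole messages from arbitrarily split or coalesced reads."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add bytes from one recv() call and return every complete message."""
        self._buf.extend(chunk)
        frames = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf, 0)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # message is still incomplete, wait for more bytes
            frames.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return frames
```

Feeding the decoder whatever each `recv` returns keeps message boundaries correct even when one read holds half a message or two messages at once.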

UDP

Use when latency and message independence matter more than delivery guarantees, or when a higher-level protocol such as QUIC or WebRTC owns reliability.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
No handshake	Zero setup RTT, send immediately	No connection state, no built-in security context	You need auth/encryption and now must build a handshake anyway	QUIC adds back a 1-RTT (or 0-RTT) handshake on UDP, proving you usually need it.
No reliability guarantee	No retransmit stalls, lost data is just gone	App must detect and handle loss itself	You assumed delivery, packets vanish silently on congested links	Correct for live media (stale frame is worthless). Wrong for anything that must arrive.
No ordering	No HoL blocking, each datagram independent	App must reorder or tolerate disorder	Sequence-sensitive logic without sequence numbers	This independence is exactly what QUIC exploits for per-stream delivery.
Datagram boundaries preserved	One send equals one recv, clean message framing	Capped at MTU (~1500B) before fragmentation	Payloads over MTU fragment, and a single fragment loss drops the whole datagram	Opposite of TCP's framing problem. Keep datagrams under path MTU to avoid fragmentation.
Stateless, low overhead	Massive fan-out (DNS, multicast), tiny per-packet cost	No flow/congestion control, can swamp the network	Unthrottled UDP becomes a self-inflicted DDoS or amplification vector	UDP amplification is a top reflection-attack primitive. Rate-limit and validate source.
Firewall-blocked on many networks	N/A — this is purely a cost	UDP/443 blocked on many corporate and hotel networks	WebTransport/HTTP-3 handshake fails fast, must fall back to TCP	The real reason QUIC-based protocols always need a TCP fallback path in production.
Thin protocol surface	Total control to build exactly the transport you want	You reimplement everything TCP gave you for free	Hand-rolled reliability has subtle bugs TCP solved decades ago	Reach for QUIC before hand-rolling reliable UDP. That mistake has been made enough times.
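
The no-ordering and MTU rows above suggest a common pattern for latest-state traffic: tag each datagram with a sequence number, keep payloads small, and drop anything stale. A minimal sketch under stated assumptions: the 1200-byte payload budget, the 4-byte sequence header, and the names `pack_update` and `LatestStateReceiver` are all illustrative, and sequence wraparound is ignored.

```python
import struct

HEADER = struct.Struct("!I")  # assumed 4-byte sequence number header
MAX_PAYLOAD = 1200            # assumed conservative budget under a ~1500B MTU


def pack_update(seq: int, payload: bytes) -> bytes:
    """Build one datagram; refuse payloads likely to fragment on the path."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds datagram budget, would risk fragmentation")
    return HEADER.pack(seq) + payload


class LatestStateReceiver:
    """Keeps only the newest state; late or duplicate datagrams are dropped."""

    def __init__(self) -> None:
        self.last_seq = -1
        self.state = None

    def accept(self, datagram: bytes) -> bool:
        (seq,) = HEADER.unpack_from(datagram, 0)
        if seq <= self.last_seq:
            return False  # stale or duplicate: old packets are noise
        self.last_seq = seq
        self.state = datagram[HEADER.size:]
        return True
```

This fits position updates or telemetry where the latest value wins. It is the wrong tool for anything that must arrive.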

Security: TLS

TLS (1.2 / 1.3)

Use by default anywhere confidentiality, integrity, server identity, or mTLS-based workload identity matter; treat certificate automation as part of the system.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Encryption + integrity	Confidentiality and tamper detection on the wire	CPU for crypto, though AES-NI makes it cheap	Very high connection churn without session resumption	At scale, terminate TLS at the edge/LB, not every microservice, unless you need mTLS internally.
Handshake cost	1-RTT in TLS 1.3, down from 2-RTT in 1.2	Still an RTT (or zero with 0-RTT resumption)	Cold connections to distant regions, handshake dominates a small request	TLS 1.3 0-RTT replays are a real risk for non-idempotent requests. Gate it.
Server authentication	Client verifies server identity via cert chain	PKI operational burden, cert rotation and expiry	A cert expires unnoticed and takes down a service at 3am	Expired-cert outages are among the most common self-inflicted SEVs. Automate rotation (ACME).
Mutual TLS (mTLS)	Both sides authenticated, the zero-trust mesh backbone	Cert distribution and rotation for every workload	Manual mTLS cert management across hundreds of services	This is why service meshes (Istio, Linkerd) exist, to automate mTLS you would never hand-manage.
Forward secrecy	Past traffic stays safe even if long-term key leaks	Ephemeral key exchange per session, slight cost	N/A — mandatory and cheap in TLS 1.3, no reason to skip	TLS 1.3 removed all non-forward-secret cipher suites. Good riddance.
Cipher/version negotiation	Interoperability across old and new clients	Downgrade-attack surface, weak-cipher footguns	Leaving TLS 1.0/1.1 or RC4 enabled for "compatibility"	Pin to TLS 1.2+ minimum, prefer 1.3. Audit cipher suites, do not trust defaults.
Opaque to middleboxes	Privacy, middleboxes cannot read payloads	Loss of network-level inspection and debugging	You need to debug a payload issue but everything is encrypted on the wire	Drives demand for app-level tracing. You cannot tcpdump your way out of a TLS payload bug.
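
Two rows above translate directly into code: pinning a minimum protocol version and watching certificate expiry. A minimal sketch using Python's standard `ssl` module; the helper names `hardened_client_context` and `days_until_expiry` are hypothetical, and real rotation should still be automated (ACME) rather than relying on a check like this.

```python
import ssl


def hardened_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses TLS below 1.2."""
    ctx = ssl.create_default_context()  # cert chain + hostname checks on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def days_until_expiry(not_after: str, now: float) -> float:
    """Days left on a cert, given its notAfter string and a Unix timestamp.

    not_after uses the format found in ssl.SSLSocket.getpeercert()["notAfter"],
    for example "Jan 21 00:00:00 2030 GMT".
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400
```

An alert on `days_until_expiry(...) < 14` (the threshold is an assumption) is a cheap backstop against the 3am expired-cert outage.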

App Transport: HTTP/1.1, HTTP/2 & HTTPS

HTTP/1.1

Use as the universal compatibility floor for public APIs and simple request/response systems where debugging and reach matter most.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Text-based protocol	Human-readable, trivial to debug with curl/telnet	Verbose, no header compression, parsing overhead	High-volume APIs where repeated headers waste bandwidth	The debuggability is real value. It is why REST-over-HTTP/1.1 still dominates public APIs.
One request per connection at a time	Dead simple request/response model	App-level head-of-line blocking within a connection	Many small assets, browser opens 6 parallel connections per origin to compensate	The 6-connection-per-origin hack is the whole reason HTTP/2 multiplexing was invented.
Pipelining (rarely used)	Multiple requests without waiting, in theory	Responses still ordered, broken middlebox support	You enable it and a proxy mangles the response order	Effectively dead. Everyone disabled it. HTTP/2 multiplexing replaced the intent.
Stateless	Any server can handle any request, easy horizontal scale	Re-send context (cookies, auth) every request	Large repeated headers on every call add up at volume	Statelessness is REST's superpower for scaling. The header cost is HTTP/2's HPACK target.
Universal support	Works literally everywhere, every client and middlebox	Stuck with 1990s-era inefficiencies	N/A — this is the safe-default cost of ubiquity	When in doubt about reach, HTTP/1.1 is the floor that always works.
Keep-alive connection reuse	Amortizes TCP+TLS handshake across requests	Idle connections hold server resources	Connection pool exhaustion under high concurrency	Tune keep-alive timeouts and pool sizes. Default pools are often wrong for your traffic shape.
Chunked transfer encoding	Stream a response of unknown length	No true server push, client must poll or long-poll	Real-time needs force long-polling hacks	SSE is built on this exact mechanism, a never-closing chunked HTTP response.
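
The chunked transfer encoding row is simple enough to show on the wire. A minimal sketch of the HTTP/1.1 chunk format (hex length, CRLF, data, CRLF, then a zero-length chunk to finish); `encode_chunk` and `encode_chunked_body` are hypothetical helper names, and trailers and chunk extensions are left out.

```python
LAST_CHUNK = b"0\r\n\r\n"  # zero-length chunk terminates the body


def encode_chunk(data: bytes) -> bytes:
    """One chunk: length in hex, CRLF, the bytes, CRLF."""
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


def encode_chunked_body(parts) -> bytes:
    """Encode an iterable of byte strings as a complete chunked body.

    Empty parts are skipped, because a zero-length chunk would end the body early.
    """
    return b"".join(encode_chunk(p) for p in parts if p) + LAST_CHUNK
```

A stream that never sends `LAST_CHUNK` is the never-closing response the table describes.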

HTTP/2

Use when multiplexing, header compression, gRPC support, and fewer connections improve internal or client-facing API efficiency.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Stream multiplexing	Many concurrent requests on one connection, no app HoL	All streams share one TCP connection's fate	One lost packet stalls every multiplexed stream (transport HoL)	The multiplexing is real but TCP undermines it under loss. HTTP/3 is the fix.
Binary framing	Compact, fast to parse, enables multiplexing	Not human-readable, needs tooling to inspect	Debugging without proper tools, raw bytes are opaque	This binary base is what gRPC builds on. You cannot do gRPC on HTTP/1.1 framing.
HPACK header compression	Repeated headers cost almost nothing after first request	Compression state is a shared-connection attack surface	N/A in practice: HPACK was designed to resist CRIME-style compression attacks	The win is huge for chatty APIs with fat repeated auth headers.
Server push	Server preemptively sends resources client will need	Pushes waste bandwidth if client already cached them	Cache-blind push re-sends assets the client already has	So problematic that Chrome removed support. Treat server push as effectively deprecated.
Single TCP connection	Fewer handshakes, less server connection overhead	Transport HoL blocking, plus one connection failure kills all streams	Lossy mobile networks, the shared connection becomes a single point of stall	Name the inflection: if your p99 tail is loss-driven on mobile, this is your villain.
Stream prioritization	Hint which streams matter most	Complex, inconsistently implemented across servers	You rely on priorities that your server stack ignores	The HTTP/2 priority tree was so messy HTTP/3 redesigned it entirely.
Requires TLS in practice	Encryption effectively mandatory, ALPN negotiates cleanly	No realistic plaintext HTTP/2 in browsers	Trying to run h2c (cleartext) and finding browsers refuse it	h2c exists for internal/gRPC use. Browsers will not speak HTTP/2 without TLS.
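
Binary framing and stream multiplexing both come down to the fixed 9-byte frame header: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a 31-bit stream identifier. A minimal parsing sketch; `parse_frame_header` is a hypothetical name and only the header is handled, not payloads or HPACK.

```python
def parse_frame_header(header: bytes) -> dict:
    """Decode the 9-byte HTTP/2 frame header into its four fields."""
    if len(header) != 9:
        raise ValueError("HTTP/2 frame header is exactly 9 bytes")
    return {
        "length": int.from_bytes(header[0:3], "big"),   # payload size in bytes
        "type": header[3],                              # 0x0 is DATA, 0x1 is HEADERS
        "flags": header[4],                             # 0x1 on DATA means END_STREAM
        # Top bit is reserved; the low 31 bits say which stream this frame belongs to.
        "stream_id": int.from_bytes(header[5:9], "big") & 0x7FFFFFFF,
    }
```

Frames from different streams interleave freely on the connection, and `stream_id` is how the receiver sorts them back out. They still share one TCP byte stream, which is why a single lost packet stalls all of them.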

HTTPS (HTTP over TLS)

Use for all public web and API traffic; the real design question is where TLS terminates and whether internal hops need re-encryption or mTLS.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
HTTP + TLS composition	Encryption and identity with zero changes to HTTP semantics	Inherits all of TLS's handshake and PKI costs	Cert expiry, handshake RTT, all TLS footguns now apply to your web traffic	HTTPS is not a separate protocol. It is HTTP riding TLS. Understanding that demystifies it.
Browser/SEO mandate	Required for HTTP/2, modern APIs, and search ranking	No realistic plaintext option for public web	N/A — plaintext HTTP is effectively dead for public sites	Let's Encrypt + ACME made the cost near zero. There is no excuse left for plaintext.
Mixed-content blocking	Browser refuses insecure subresources on a secure page	One http:// asset can break a https:// page	Migrating a legacy site, a hardcoded http image breaks the padlock	Audit every subresource on migration. Mixed-content is the classic HTTPS-cutover gotcha.
HSTS enforcement	Forces HTTPS, blocks downgrade and SSL-strip attacks	A misconfig can lock users out for the max-age window	You set a long HSTS max-age then need to serve plain HTTP, too late	Start with a short max-age, ramp up. The preload list is essentially permanent, be sure.
Edge TLS termination	Offload crypto to the LB/CDN, simplify backends	Plaintext on the internal hop unless you re-encrypt	Assuming end-to-end encryption when TLS stops at the edge	Know where TLS terminates. "It's HTTPS" does not mean encrypted all the way to the pod.
SNI exposure	Many certs/hosts on one IP via SNI	Hostname leaks in cleartext during handshake	Privacy or censorship contexts where the SNI reveals the destination	Encrypted Client Hello (ECH) closes this, but rollout is partial. Know your threat model.
Cert as identity	The cert is the trust anchor for the whole session	Compromised or mis-issued cert undermines everything	A CA mis-issues, or a private key leaks	Certificate Transparency logs and CAA records are your monitoring. Use them.
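
The HSTS row's advice (start short, ramp up) can be captured as a small helper. A minimal sketch: the `hsts_header` name and the ramp schedule in `HSTS_RAMP` are assumptions for illustration, and the `preload` directive is deliberately left out because the preload list is hard to undo.

```python
# Assumed ramp: 5 minutes, 1 day, 1 week, 1 year. Advance only after each step is quiet.
HSTS_RAMP = [300, 86_400, 604_800, 31_536_000]


def hsts_header(max_age_seconds: int, include_subdomains: bool = False) -> tuple[str, str]:
    """Build the Strict-Transport-Security response header as a (name, value) pair."""
    if max_age_seconds < 0:
        raise ValueError("max-age must be non-negative")
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)
```

Starting at `HSTS_RAMP[0]` means a misconfiguration locks users out for minutes, not a year.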

API Style (Request/Response): REST, gRPC & GraphQL

REST

Default for public resource APIs, CRUD workflows, cacheable reads, and broad client compatibility.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Resource-oriented + HTTP verbs	Intuitive, maps to CRUD, leverages HTTP caching/status codes	Awkward fit for actions that are not resources	RPC-style operations (sendEmail, recalculate) jammed into REST nouns	The "everything is a resource" model strains for verbs. That friction is gRPC's opening.
Stateless + cacheable	HTTP caching, CDNs, intermediaries all just work	Cache invalidation complexity, repeated context per request	Highly dynamic data where caching gives little benefit	Cacheability is REST's most underrated edge over gRPC and GraphQL. Do not throw it away lightly.
Over/under-fetching	Simple, predictable endpoints	Fixed payloads, clients get too much or too little	Mobile clients dragging huge payloads for one field, or N+1 round trips	This exact pain is what GraphQL was built to solve. Name it before reaching for GraphQL.
JSON payloads	Universal, readable, schemaless flexibility	Verbose, no enforced contract, parse cost	High-throughput internal calls where JSON bloat and parse cost dominate	The lack of an enforced schema is freedom for public APIs, chaos for internal microservices.
Loose contract	Easy to evolve, add fields without breaking clients	No compile-time safety, drift between client and server	A field rename ships and silently breaks downstream consumers	OpenAPI/JSON-Schema bolts on the rigor gRPC has by default. Use it or pay later.
Ubiquitous tooling	Every language, every client, browsers natively	No native streaming, polling for real-time	Real-time features force long-polling or a separate WebSocket/SSE channel	REST is the control plane. Pair it with SSE/WebSocket for the data plane when you need push.
Human-debuggable	curl, Postman, browser devtools, all trivial	Verbosity is the price of readability	N/A — debuggability usually wins for public-facing APIs	Never underrate this. The team velocity cost of opaque protocols is real and recurring.
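
The cacheability row is the easiest REST advantage to lose by accident, so here is what it looks like at the handler level. A minimal sketch of a conditional GET using an ETag; `make_etag` and `conditional_get` are hypothetical names, the 60-second `max-age` is an assumed value, and the comparison handles only a single exact tag (not `*` or comma-separated lists).

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Strong ETag derived from the response bytes (quoted, per HTTP syntax)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        return 304, headers, b""  # client and CDN reuse their cached copy
    return 200, headers, body
```

Browsers, CDNs, and intermediaries all understand this exchange without any extra code, which is what a POST to a single GraphQL endpoint gives up.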

gRPC

Best inside service meshes and typed service-to-service boundaries where protobuf contracts, deadlines, streaming, and binary efficiency justify the tooling cost.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Protobuf binary contract	Compact payloads, strong typed schema, codegen in every language	Not human-readable, schema-compile step in the build	Quick debugging without grpcurl, raw frames are opaque	The enforced contract is the whole point for internal meshes. It catches drift at compile time.
HTTP/2 native	Multiplexed streams, bidirectional streaming, low latency	Hard dependency on HTTP/2 features browsers cannot reach	Calling gRPC directly from a browser, it simply cannot	Browsers cannot control HTTP/2 frames or read trailers from JS. This is gRPC-Web's reason to exist.
Browser support gap	N/A — this is purely a cost	Needs gRPC-Web + a proxy (Envoy), and even then no client/bidi streaming	You want browser-to-backend gRPC with full streaming, blocked as of early 2026	True bidi streaming needs Fetch duplex, still unshipped in stable browsers in early 2026. Connect-RPC is the pragmatic escape.
Four call types incl. streaming	Unary, server-stream, client-stream, bidi, all first-class	Streaming semantics add real complexity (backpressure, errors)	Treating a long-lived stream like a request, mishandling half-close and flow control	gRPC bidi streaming over HTTP/2 is genuinely powerful server-to-server. Most teams underuse it.
Status in trailers	Can stream N records then report final success/failure	Trailers are inaccessible to browser Fetch	Browser clients cannot read the gRPC status, a core gRPC-Web limitation	This single design choice is why browsers fundamentally cannot speak native gRPC.
Deadlines + cancellation	First-class deadline propagation across the call chain	You must set and propagate them or lose the benefit	No deadlines set, a slow dependency cascades into resource exhaustion	Deadline propagation is a top reason gRPC shines in deep service graphs. REST has no equivalent default.
Tight coupling to schema	Generated clients, no hand-written HTTP plumbing	Proto changes ripple to all consumers, governance needed	A breaking proto change without field-number discipline breaks everyone	Never reuse or renumber proto fields. The wire format is keyed by field number, not by name.
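
The field-number rule in the last row follows from how protobuf puts a field on the wire: a tag built from the field number and wire type, then the value. A minimal hand-rolled sketch for varint fields only (real services use generated code; `encode_varint` and `encode_varint_field` are hypothetical names).

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: tag = (field_number << 3) | wire_type, then value."""
    wire_type_varint = 0
    tag = (field_number << 3) | wire_type_varint
    return encode_varint(tag) + encode_varint(value)
```

The field name never appears in the bytes. Renumbering field 1 to field 2 changes the tag from `0x08` to `0x10`, so every consumer compiled against the old schema reads the value as a different field or drops it.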

GraphQL

Use when varied clients need different projections over the same graph and you are prepared to own query cost, caching, and resolver complexity.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Client-specified queries	Client asks for exactly the fields it needs, no over/under-fetch	Query complexity moves to the server, unpredictable load	A nested query explodes into thousands of resolver calls	Query cost analysis and depth limiting are mandatory, not optional. Unbounded GraphQL is a DoS vector.
Single endpoint + schema	One graph aggregates many backends, strong introspectable types	HTTP caching mostly breaks, everything is POST to /graphql	You lose CDN/HTTP caching that REST got for free	You rebuild caching at the field level (persisted queries, APQ, DataLoader). Real engineering cost.
Resolver model	Decouples schema from data sources, federate cleanly	N+1 query problem is the default failure mode	Each field hits the DB separately without batching	DataLoader (batch + cache per request) is not optional. It is the first thing every GraphQL backend needs.
Strong type system	Schema is the contract, great tooling and introspection	Schema design and governance overhead across teams	Federated schema with conflicting ownership and naming	Federation (Apollo, etc.) is powerful but the org coordination cost is the real tax.
Mobile/varied-client fit	One API serves wildly different client data needs	Overkill for simple, uniform CRUD	A 3-endpoint service wrapped in GraphQL machinery for no gain	If your clients all want the same shape, GraphQL is pure overhead. Use REST.
Introspection	Self-documenting, powerful dev tooling (GraphiQL)	Exposes your whole schema to attackers if left on	Introspection enabled in prod hands attackers a full API map	Disable introspection in prod or gate it. Common security oversight.
Error semantics	Partial results, per-field errors	Usually HTTP 200 even on failure, errors buried in the body	Monitoring keyed on HTTP status sees 200 while queries fail	Your observability must parse the errors array, not trust status codes. Trips up every new team.
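
The error-semantics row implies a concrete change to monitoring: classify responses by the body, not the status line. A minimal sketch, assuming the conventional response shape with top-level `data` and `errors` keys; `graphql_outcome` and its three-way labels are hypothetical.

```python
import json


def graphql_outcome(status_code: int, body: str) -> str:
    """Classify a GraphQL response for metrics: ok, partial, failed, or transport_error."""
    if status_code != 200:
        return "transport_error"
    payload = json.loads(body)
    errors = payload.get("errors") or []
    if not errors:
        return "ok"
    if payload.get("data") is None:
        return "failed"   # nothing usable came back
    return "partial"      # some fields resolved, some did not
```

Emitting this label as a metric dimension gives dashboards the signal that HTTP status codes hide.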

API Style (Streaming / Real-Time): WebSocket, SSE & WebRTC

WebSocket

Use for bidirectional realtime sessions where both client and server send mid-stream and you can own connection state, replay, and backpressure.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Full-duplex persistent	Both sides push anytime, low per-message overhead	Stateful connection, must track and scale them	Millions of idle connections pin server memory and FDs	Connection count, not message rate, is the scaling wall. Plan for sticky routing and connection budgets.
Starts as HTTP upgrade	Traverses firewalls/proxies on 80/443, near-universal support	Upgrade can be blocked or stripped by some proxies	Corporate proxy refuses the Upgrade header, connection fails	99%+ browser support since 2015. The reliable real-time default for a reason.
Raw byte/message pipe	Total freedom over message format and protocol	No built-in structure, you build everything (auth, reconnect, heartbeats)	Hand-rolled reconnect/heartbeat logic that is subtly broken	You reinvent a lot. Libraries (Socket.IO) exist precisely because raw WS leaves so much to you.
No auto-reconnect	N/A — this is purely a cost vs SSE	Client must implement reconnect + backoff itself	Network blip drops the socket and nothing reconnects	SSE gives you auto-reconnect free. With WS you own it. A frequent source of "it just stopped" bugs.
Stateful = hard to scale	Cheap once connected, no per-message handshake	Load balancing and horizontal scale get complex	Scaling out means cross-node message routing (pub/sub backplane)	You need Redis/Kafka fan-out to broadcast across server instances. The hidden cost of WS at scale.
Bypasses HTTP semantics	No per-message HTTP overhead	Loses HTTP caching, status codes, standard observability	Your HTTP-based monitoring and tooling go blind on WS traffic	Once you upgrade, you leave the HTTP ecosystem. Budget for separate observability.
One TCP connection	Ordered reliable delivery for free	Subject to TCP head-of-line blocking	Lossy network stalls all messages behind a dropped packet	This TCP HoL limitation is exactly what WebTransport-over-QUIC was designed to escape.
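
Since WebSocket leaves reconnect to the client, the delay schedule is the piece most often gotten wrong. A minimal sketch of exponential backoff with full jitter; the `backoff_delay` name and the 0.5s base and 30s cap are assumed values, and the actual socket handling is left to whichever client library is in use.

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0, rng=random.random) -> float:
    """Seconds to wait before reconnect number `attempt` (0-based).

    The ceiling doubles each attempt up to `cap`; the delay is drawn uniformly
    below it so a fleet of clients does not reconnect in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

A reconnect loop sleeps for `backoff_delay(attempt)` after each failure and resets `attempt` to zero once a connection stays up. Without jitter, a server restart turns every dropped client into a synchronized reconnect storm.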

SSE (Server-Sent Events)

Use for one-way server-to-client browser streams such as LLM output, notifications, and live dashboards where writes remain normal HTTP requests.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
One-way server push	Dead simple, plain HTTP, perfect for token/event streams	Server-to-client only, client cannot push on the same channel	You need the client to send mid-stream, SSE cannot	The inflection: SSE until the client must push, then WebSocket. This is the cleanest boundary in the whole list.
Built on plain HTTP	Works through proxies/firewalls/CDNs, no upgrade dance	Inherits HTTP/1.1's 6-connection-per-origin limit	Several SSE streams on HTTP/1.1 exhaust the per-origin connection budget	Run SSE over HTTP/2 to multiplex and dodge the 6-connection cap. A real and common gotcha.
Auto-reconnect built in	Browser EventSource reconnects automatically with Last-Event-ID	Reconnect/resume semantics are basic	Complex resume logic beyond simple last-event replay	The free reconnect + event-ID resume is SSE's quiet superpower over raw WebSocket.
Text-only (UTF-8)	Simple line-based wire format, trivial to produce	No binary frames, base64 to ship bytes (~33% bloat)	Streaming binary data forces wasteful encoding	For LLM token streaming (text), this is a non-issue and SSE is near-perfect. For binary, look elsewhere.
Native browser EventSource	No library needed, standardized API	EventSource cannot set custom headers (e.g. auth)	Bearer-token auth on SSE needs query params or a fetch-based polyfill	Auth-header limitation is the most common SSE surprise. fetch-based SSE clients work around it.
HTTP-native scaling	Scales with your existing HTTP/LB stack	Still a long-lived connection holding a server resource	Many concurrent streams tie up worker threads/connections	Lighter than WebSocket per connection but not free. Async/event-loop servers handle it best.
Unidirectional simplicity	Nothing to negotiate, no protocol upgrade	A control channel needs a separate REST call back	You bolt on REST for client-to-server, now two mechanisms	REST (control) + SSE (stream) is a clean, proven pairing. It is exactly the LLM-chat pattern.
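
The "trivial to produce" claim for SSE's wire format holds up: an event is a few `field: value` lines ended by a blank line. A minimal formatting sketch; `format_sse` is a hypothetical helper, and the server is assumed to send the result on a response with `Content-Type: text/event-stream`.

```python
def format_sse(data: str, event=None, event_id=None) -> str:
    """Serialize one Server-Sent Event.

    Multi-line data becomes one `data:` line per line; the blank line at the
    end is what tells the browser to dispatch the event.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")   # echoed back as Last-Event-ID on reconnect
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"
```

Setting `event_id` on every event is what makes the free reconnect useful, because the browser reports the last one it saw when it reconnects.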

WebRTC

Use for peer/media paths, calls, data channels, and low-latency UDP-like delivery where NAT traversal and SFU/TURN planning are part of the product.

Trade-Off	What You Gain	What You Give Up	When It Bites You	PE Nuance
Peer-to-peer	Direct client-to-client path, lowest latency, server offloaded	NAT traversal is hard, needs STUN/TURN infrastructure	Symmetric NATs force TURN relay, killing the P2P bandwidth savings	You still run servers (STUN/TURN). "Serverless P2P" is a myth at scale. Budget for TURN egress.
Built for media	Native audio/video codecs, jitter buffers, echo cancel	Enormous, complex stack for anything non-media	Using WebRTC just for a data channel, paying the whole media tax	If you only need data, WebTransport or WebSocket is far simpler. WebRTC's value is media.
UDP-based (SRTP/DTLS)	Low latency, drops stale frames instead of stalling	UDP blocked on restrictive networks, forced to TCP relay	Corporate firewall blocks UDP, calls degrade to slow TURN-over-TCP	Same UDP-egress problem as WebTransport. Always have a relay fallback.
Mandatory encryption	DTLS/SRTP always on, no plaintext option	No way to disable even for debugging	Hard to inspect media on the wire for diagnostics	Security by default is good. Just know your debugging is at the endpoints, not the wire.
Data channels	Configurable reliable or unreliable, ordered or not, P2P	Heavy signaling + ICE setup before any data flows	Short-lived data needs, setup cost dwarfs the payload	The flexibility (choose reliability per channel) is great, but only worth it if you are already in WebRTC for media.
Signaling not included	You choose the signaling channel (WS, etc.)	You must build session setup yourself	Assuming WebRTC handles connection setup, it does not	WebRTC handles media; you bring the signaling (usually WebSocket). A common architectural surprise.
Mesh scaling limits	Direct paths are ideal for small groups	Full-mesh explodes at O(n^2) connections	Group calls past ~4-6 peers melt client uplinks	Past a handful of peers you need an SFU (selective forwarding unit). Pure mesh does not scale.
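
The mesh-scaling row is plain arithmetic, and it is worth running the numbers before choosing a topology. A minimal sketch with hypothetical function names; it counts connections and per-client upload streams only and ignores bitrate, simulcast, and codec differences.

```python
def mesh_connections(peers: int) -> int:
    """Total peer connections in a full mesh: n * (n - 1) / 2."""
    return peers * (peers - 1) // 2


def mesh_uplinks_per_client(peers: int) -> int:
    """Each client uploads its media once per other participant."""
    return max(peers - 1, 0)


def sfu_uplinks_per_client(peers: int) -> int:
    """With an SFU each client uploads once; the server does the fan-out."""
    return 1 if peers > 1 else 0
```

At six participants a mesh needs 15 connections and five simultaneous uploads per client, while the SFU path stays at one upload each. That per-client upload count is what melts uplinks.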

2. Use Cases

Five+ real scenarios per protocol. Driving property is specific, not "scalable".

TCP

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Database connections	Postgres, MySQL wire protocols	Zero tolerance for lost or reordered bytes	Thousands of pooled connections per DB	UDP would corrupt query/result framing
File transfer	SFTP, S3 multipart uploads	Every byte must arrive intact and ordered	Multi-GB objects	UDP needs reliability rebuilt anyway
Web traffic (HTTP/1.1, /2)	Essentially all of REST and the web	Reliable ordered request/response	Internet-scale	HTTP/3 uses UDP, but only after rebuilding TCP guarantees in QUIC
Email transport	SMTP, IMAP	Message integrity and reliable delivery	Billions of messages/day	Loss would silently drop mail
Service-to-service RPC	gRPC over HTTP/2	Ordered multiplexed streams	Internal mesh, 100K+ RPS	Needs the reliable substrate gRPC assumes

UDP

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
DNS lookups	Every resolver on earth	One-shot query/response, retry is cheap	Trillions of queries/day	TCP handshake would add a full RTT, roughly doubling DNS latency
Live video/audio	WebRTC calls, game voice	Stale frame is worthless, drop beats stall	Real-time, ms-sensitive	TCP retransmit stalls would freeze the call
Online gaming	FPS position updates	Latest state matters, old packets are noise	60+ updates/sec/player	TCP ordering would replay stale positions
QUIC / HTTP/3	Cloudflare, Google edge	Per-stream delivery, no transport HoL	Large fraction of modern web	TCP cannot give independent stream delivery
Telemetry / metrics	StatsD, syslog	High volume, occasional loss tolerable	Millions of datapoints/sec	TCP overhead unjustified for fire-and-forget

TLS

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Public web (HTTPS)	Every modern website	Confidentiality + server identity	Internet-scale	Browsers flag plaintext as insecure and search ranking penalizes it
Service mesh mTLS	Istio, Linkerd	Mutual auth in zero-trust networks	Hundreds to thousands of services	Network-level trust does not survive a breach
API authentication	Any token-based API	Protect bearer tokens in transit	All API traffic	Tokens in plaintext are trivially stolen
VPN / tunnels	OpenVPN, WireGuard-adjacent	Encrypt arbitrary traffic over hostile networks	Org-wide	Unencrypted tunnels defeat the purpose
Compliance (PCI, HIPAA)	Payments, healthcare	Encryption in transit mandated by regulation	Regulated workloads	Non-compliance is a legal/financial risk

HTTP/1.1

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Simple REST APIs	Most public APIs	Universal reach, trivial debugging	Broad client diversity	HTTP/2 gains marginal for low-concurrency calls
Webhooks	Stripe, GitHub callbacks	One-shot POST any server understands	Millions of events	No need for multiplexing on single calls
Health checks	LB / k8s probes	Dead-simple request, max compatibility	Constant low-volume polling	HTTP/2 overhead pointless for a 200 OK
Legacy/embedded clients	IoT, old SDKs	Works where HTTP/2 cannot	Constrained devices	Many embedded stacks lack HTTP/2
SSE foundation	Any SSE stream	Chunked never-ending response	Per-client streams	SSE rides this exact mechanism

HTTP/2

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
gRPC transport	Internal microservice meshes	Multiplexed bidirectional streams	100K+ RPS internal	HTTP/1.1 cannot frame gRPC
Asset-heavy web pages	Modern SPAs	Many concurrent requests, one connection	Dozens of assets/page	HTTP/1.1 needs 6 connections + HoL
API gateways	Envoy, gateways	HPACK header compression at volume	High-fanout backends	HTTP/1.1 header bloat at scale
Mobile APIs (good networks)	App backends	Connection reuse, lower battery/latency	Millions of devices	HTTP/1.1 wastes RTTs on handshakes
Multiplexed SSE	Multi-stream dashboards	Dodge the 6-connection-per-origin cap	Many streams/client	HTTP/1.1 SSE hits connection limit

HTTPS

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
All public web	Every site post-2018	Encryption + identity, mandatory	Internet-scale	Plaintext HTTP is effectively banned
E-commerce / payments	Checkout flows	Protect payment data in transit	Global retail	PCI forbids plaintext card data
Authenticated apps	Any login-gated app	Protect session cookies/tokens	All user traffic	Session hijacking trivial over HTTP
API endpoints	Public + partner APIs	Confidential request/response	All API calls	Keys/tokens exposed in plaintext
HTTP/2 + HTTP/3 enablement	Performance-focused sites	TLS is the prerequisite for both	High-traffic sites	Browsers will not do h2/h3 without TLS

REST

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Public APIs	Stripe, Twilio, GitHub	Universal client reach, cacheability, debuggability	Millions of third-party devs	gRPC's browser gap and opacity hurt adoption
CRUD services	Standard backend resources	Clean verb-to-operation mapping	Typical app scale	GraphQL is overkill for uniform shapes
CDN-cacheable content	Product catalogs, media metadata	HTTP caching at the edge	Read-heavy, global	GraphQL POST breaks HTTP caching
Webhooks / integrations	SaaS event delivery	Any HTTP client can receive	Massive partner fan-out	gRPC requires special client tooling
Control plane for streaming	LLM apps (session/auth/history)	Stateless, cacheable, simple	Pairs with SSE data plane	Streaming protocols are wrong for control ops

gRPC

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Internal microservices	Google, Netflix internal RPC	Typed contracts, low-latency binary, deadline propagation	100K+ RPS, deep call graphs	REST/JSON wastes bandwidth and lacks contracts
Polyglot service meshes	Mixed Go/Java/Python fleets	Codegen from one proto in every language	Hundreds of services	REST means hand-written clients per language
Streaming pipelines	Real-time data services	Bidirectional streaming with backpressure	Continuous high-volume streams	REST has no native streaming
Low-latency mobile (good net)	Performance-critical app backends	Compact protobuf payloads	Millions of devices	JSON parse + size cost on constrained links
Inter-agent / orchestrator calls	Agent platforms (orchestrator to agent svc)	Strong contract + streaming for agent results	Many concurrent agent invocations	REST loses the typed-contract safety net

GraphQL

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Mobile with varied views	Facebook (origin), Shopify	Each screen fetches exactly its fields, one round trip	Many client versions, slow networks	REST over/under-fetches, multiplies round trips
Backend-for-frontend aggregation	Netflix, GitHub API v4	One graph stitches many microservices	Dozens of backing services	REST forces client-side orchestration
Rapidly evolving frontends	Product teams iterating fast	Add fields without versioning endpoints	Frequent UI changes	REST versioning churn slows iteration
Public data graphs	GitHub GraphQL API	Flexible client queries over rich schema	Large third-party dev base	Fixed REST endpoints constrain consumers
Federated org-wide schema	Large eng orgs (Apollo Federation)	Teams own subgraphs, unified gateway	Many teams, one graph	REST has no native federation story

WebSocket

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Chat / messaging	Slack, Discord	Bidirectional, both sides push instantly	Millions of concurrent sockets	SSE cannot carry client-to-server messages
Collaborative editing	Figma, Google Docs	Low-latency two-way state sync	Many editors per doc	Polling/SSE too slow and one-directional
Live trading dashboards	Brokerage platforms	Push + client orders on one channel	High-frequency updates	SSE lacks the client-push path
Multiplayer games (web)	Browser games	Real-time bidirectional state	Per-session sockets	WebRTC overkill if no media, SSE one-way
Live notifications + actions	Interactive apps	Server push plus client acks/actions	Per-user connections	SSE forces a second REST channel for actions

SSE

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
LLM token streaming	ChatGPT, Claude-style apps	One-way text stream, HTTP-native, auto-reconnect	Many concurrent sessions	WebSocket adds bidirectional complexity you do not need
Live feeds / tickers	News, sports scores, stock prices	Server pushes updates, client only reads	Broadcast to many clients	WebSocket is heavier for pure push
Notifications	In-app alert streams	Simple server-to-client events	Per-user streams	WebSocket overkill for one-way
Progress / status updates	Long-running job dashboards	Stream progress over plain HTTP	Per-job streams	Polling wastes requests, WS overcomplicated
Server-driven UI refresh	Live dashboards (read-only)	Push data changes through proxies/CDNs	Many viewers	WS upgrade may be blocked by proxies

WebRTC

Use Case	Company / Scenario	Driving Property	Scale Dimension	Why Not Alternative
Video conferencing	Google Meet, Zoom web	P2P/SFU low-latency media with codecs built in	Small groups to SFU-backed rooms	WebSocket has no media stack or NAT traversal
Voice calls	Discord voice, web softphones	UDP media, drop-not-stall, echo cancel	Per-call peers	TCP-based protocols stall on loss
Screen sharing	Remote support tools	Real-time encoded video P2P	1:1 or small group	No other web API ships media handling
P2P file/data transfer	Browser file-sharing tools	Direct client-to-client data channel	1:1 transfers	Server relay adds cost and latency
Cloud gaming / low-latency	Game streaming services	Sub-frame media latency over UDP	Per-session streams	Only WebRTC delivers media-grade latency in-browser

3. Limitations

Severity-rated, with the workaround and what the workaround costs you. Grouped by layer.

Transport + Security

Protocol	Limitation	Severity	Workaround	Workaround Cost
TCP	Head-of-line blocking under packet loss	High	Move to QUIC/HTTP-3 (UDP, per-stream delivery)	UDP egress issues, newer/less-mature stack
TCP	Handshake RTT before first byte	Medium	Connection keep-alive/pooling, TLS 1.3 0-RTT	Pool management, 0-RTT replay risk
UDP	No reliability/ordering/congestion control	High	Build it on top, or use QUIC	Reinventing TCP (badly) or adopting QUIC complexity
UDP	Blocked on many corporate/hotel networks	High	TCP fallback path always required	Maintaining two transport paths
TLS	Cert lifecycle (expiry, rotation)	Critical	Automate with ACME / cert-manager	Automation infra, monitoring, CT log watching
TLS	0-RTT replay attacks	High	Restrict 0-RTT to idempotent requests only	Request classification, careful gating
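
The 0-RTT workaround above ("restrict to idempotent requests only") can be sketched as a small gate in front of the request handler. This is a minimal sketch, assuming the TLS terminator forwards the `Early-Data: 1` header and that the origin may answer 425 Too Early, as described in RFC 8470. The function name, the header dictionary shape, and the choice to allow only safe methods are assumptions for illustration.

```python
from typing import Mapping, Optional

# Assumption: only safe (read-only) methods are allowed in early data.
# A real deployment may widen this to other idempotent methods it trusts.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})
TOO_EARLY = 425  # HTTP status "Too Early" (RFC 8470)


def early_data_gate(method: str, headers: Mapping[str, str]) -> Optional[int]:
    """Return 425 if the request must be retried after the handshake, else None.

    `headers` is assumed to use canonical header names as keys.
    """
    sent_in_early_data = headers.get("Early-Data", "").strip() == "1"
    if sent_in_early_data and method.upper() not in SAFE_METHODS:
        # A replayed copy of this request could execute twice, so refuse it.
        return TOO_EARLY
    return None  # safe to hand to the normal handler
```

A client that receives 425 is expected to retry once the full handshake has completed, so the cost is one extra round trip on the rejected request rather than a failure.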

App Transport

Protocol	Limitation	Severity	Workaround	Workaround Cost
HTTP/1.1	App-level HoL, 6-connection-per-origin cap	High	Upgrade to HTTP/2 multiplexing	TLS requirement, binary debugging
HTTP/1.1	No header compression, verbose	Medium	HTTP/2 HPACK	Migration effort
HTTP/2	TCP transport HoL undermines multiplexing	High	HTTP/3 over QUIC	UDP egress, operational newness
HTTP/2	Server push is cache-blind, deprecated	Medium	Do not use it, use preload hints	Lose the (questionable) feature entirely
HTTPS	Mixed-content breaks on migration	Medium	Audit and rewrite all subresource URLs	Migration audit effort
HTTPS	SNI leaks hostname in cleartext	Medium	Encrypted Client Hello (ECH)	Partial rollout, infra support needed

API Style

Protocol	Limitation	Severity	Workaround	Workaround Cost
REST	Over/under-fetching, N+1 round trips	Medium	GraphQL, or BFF aggregation endpoints	Lose HTTP caching / build BFF layer
REST	No native streaming	Medium	Pair with SSE or WebSocket	Second protocol/channel to operate
gRPC	No native browser support	Critical	gRPC-Web + Envoy proxy, or Connect-RPC	Proxy hop, no client/bidi streaming in browser
gRPC	Opaque to standard HTTP tooling	Medium	grpcurl, server reflection, Connect	Specialized tooling, learning curve
GraphQL	Unbounded query cost (DoS vector)	Critical	Depth/complexity limits, persisted queries	Query analysis infra, allowlist maintenance
GraphQL	N+1 resolver problem	High	DataLoader batching per request	Mandatory extra layer in every backend
GraphQL	HTTP caching mostly breaks	High	Persisted queries, field-level/CDN caching	Rebuild caching you got free in REST
WebSocket	Stateful, hard to scale horizontally	High	Pub/sub backplane (Redis/Kafka), sticky routing	Backplane infra, cross-node fan-out cost
WebSocket	No auto-reconnect or HTTP semantics	Medium	Library (Socket.IO) or hand-rolled logic	Dependency or bug-prone custom code
SSE	One-way only, no client push	Medium	Add REST for client-to-server, or use WebSocket	Two mechanisms, or heavier protocol
SSE	EventSource cannot set auth headers	Medium	fetch-based SSE client, or token in query	Polyfill, or token-in-URL exposure
WebRTC	NAT traversal needs STUN/TURN	High	Run STUN/TURN servers	Infra cost, TURN relay egress at scale
WebRTC	Full-mesh O(n^2) past a few peers	High	SFU (selective forwarding unit)	Media-server infra and cost

4. Fault Tolerance

Reframed for protocols: how each behaves under failure, recovery, and partition. Matrices are grouped by layer so that no table exceeds five protocol columns.

Transport + Security

Dimension	TCP	UDP	TLS
Loss recovery	Automatic retransmit	None, app's problem	Inherits transport's
Failure detection	Acks, timeouts, RST	None native	Handshake/cert failures surface immediately
Recovery mechanism	Reconnect + retransmit	App retry logic	Renegotiate / new handshake
Connection migration	No, breaks on IP change	N/A connectionless	No (QUIC adds this on UDP)
Partition behavior	Stalls, then times out	Silent drop	Session breaks, needs re-handshake
Blast radius	One connection's streams	Single datagram	One session
Data loss scenario	Only on hard reset mid-flight	Routine, by design	None added beyond transport

App Transport

Dimension	HTTP/1.1	HTTP/2	HTTPS
Connection-failure blast radius	One request (1 of ~6 conns)	All streams on that connection	Same as underlying HTTP version
Loss behavior	Per-connection stall	Transport HoL stalls all streams	Inherits
Recovery	Open a new connection	Reconnect all streams	+ TLS re-handshake
Failure isolation	Good (independent conns)	Poor (shared fate)	Matches HTTP version
Retry safety	Idempotency-dependent	Idempotency-dependent	Same
Cross-region failover	DNS/LB-driven	DNS/LB-driven	+ cert must cover failover host
Graceful shutdown	Connection close	GOAWAY frame drains streams	Same as HTTP version

API Style

Dimension	REST	gRPC	GraphQL	WebSocket	SSE
Retry model	Idempotent verbs safe to retry	Built-in retry policy + deadlines	Per-query, partial results help	Manual reconnect + replay	Auto-reconnect + Last-Event-ID
Failure detection	HTTP status codes	Status codes + deadlines	Errors array (HTTP 200!)	Heartbeat/ping-pong you build	Connection drop event
Partial failure	All-or-nothing per call	All-or-nothing (stream can partial)	Native partial results	App-defined	Resume from last event
Reconnect cost	New request, cheap	Re-establish stream	New query	Full handshake + state rebuild	Cheap, automatic
State on reconnect	Stateless, none lost	Stream state lost	Stateless	Lost, must resync	Event-ID resume
Timeout handling	Client/LB timeouts	First-class deadline propagation	Resolver-level timeouts	Custom idle timeouts	HTTP timeouts + reconnect
Cascading-failure guard	Circuit breakers (external)	Deadlines stop the cascade	Query timeout + complexity caps	Backpressure you implement	Server can throttle stream

WebRTC omitted from this matrix: its fault model is media-specific (jitter buffers, packet concealment, ICE restart) and does not map to the request/response dimensions above. In short: it tolerates loss by concealing it, recovers paths via ICE restart, and relies on STUN/TURN for connectivity failures.
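
The SSE column's "Auto-reconnect + Last-Event-ID" entry is easier to reason about with the wire format in hand. The sketch below parses a simplified subset of the event-stream format (`id:` and `data:` fields, comment lines, blank-line dispatch) and shows the header a reconnecting client would send. `parse_sse`, `Event`, and `resume_headers` are hypothetical names, and the server still has to keep enough history to replay from the given ID.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Event(NamedTuple):
    id: str    # last event ID seen when this event was dispatched
    data: str  # data lines joined with newlines


def parse_sse(lines: Iterable[str]) -> Iterator[Event]:
    """Yield events from already-decoded text lines of an SSE response."""
    last_id = ""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                 # blank line dispatches the pending event
            if data:
                yield Event(last_id, "\n".join(data))
            data = []
            continue
        if line.startswith(":"):       # comment line, often used as a keepalive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):      # one leading space is not part of the value
            value = value[1:]
        if field == "data":
            data.append(value)
        elif field == "id" and "\0" not in value:
            last_id = value            # persists across events until replaced


def resume_headers(last_id: str) -> Dict[str, str]:
    """Headers a client sends on reconnect so the server can replay missed events."""
    return {"Last-Event-ID": last_id} if last_id else {}
```

Browsers' `EventSource` does this bookkeeping automatically; a hand-written client (for example a fetch-based one used to set auth headers) has to track the ID itself.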

5. Connection Scaling & Load Balancing

Protocols do not "shard" like databases. The equivalent concern is how connections distribute across servers and where the scaling wall sits.

Request/Response Styles

Dimension	REST	gRPC	GraphQL
LB granularity	Per-request (any L7 LB)	Per-connection by default, needs L7 for per-call	Per-request (single endpoint)
Stickiness needed?	No, stateless	Connection reuse can pin to one backend	No
Classic balancing pitfall	None major	L4 LB pins long-lived HTTP/2 conns, unbalanced load	One hot query starves a node
Fix	Standard round-robin	L7 LB or client-side LB (xDS/lookaside)	Complexity limits + query routing
Scaling wall	Backend throughput	Connection distribution skew	Resolver/backend fan-out cost
Horizontal scale ease	Trivial (stateless)	Easy, but mind LB layer	Easy at edge, hard at resolvers

The gRPC L4-vs-L7 load-balancing trap is a top PE-interview signal: long-lived HTTP/2 connections defeat naive L4 round-robin, so one backend gets hammered. You need L7 (per-request) or client-side balancing.
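
A toy simulation makes the trap concrete. It assumes three backends and three long-lived client connections with very uneven request volumes; the backend names and request counts are made up for illustration. Per-connection balancing assigns each connection once, per-request balancing assigns each call.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List


def l4_per_connection(backends: List[str], requests_per_conn: List[int]) -> Dict[str, int]:
    """Round-robin at connect time: every request on a connection hits the same backend."""
    load = Counter({b: 0 for b in backends})
    next_backend = cycle(backends)
    for request_count in requests_per_conn:
        load[next(next_backend)] += request_count
    return dict(load)


def l7_per_request(backends: List[str], requests_per_conn: List[int]) -> Dict[str, int]:
    """Round-robin per call: the balancer (or client) picks a backend for each request."""
    load = Counter({b: 0 for b in backends})
    next_backend = cycle(backends)
    for _ in range(sum(requests_per_conn)):
        load[next(next_backend)] += 1
    return dict(load)


backends = ["a", "b", "c"]
traffic = [9000, 300, 300]  # one chatty client, two quiet ones
print(l4_per_connection(backends, traffic))  # {'a': 9000, 'b': 300, 'c': 300}
print(l7_per_request(backends, traffic))     # {'a': 3200, 'b': 3200, 'c': 3200}
```

The connection count is perfectly balanced in the first case, which is why the skew is easy to miss on a dashboard that only tracks open connections.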

Streaming Styles

Dimension	WebSocket	SSE	WebRTC
Per-server limit	Open FDs / memory per socket	Open connections (lighter)	CPU for media, not connections
Stickiness needed?	Yes, connection is stateful	Yes, long-lived stream	Signaling sticky, media P2P/SFU
Cross-node fan-out	Pub/sub backplane (Redis/Kafka)	Same, backplane to broadcast	SFU routes media
Scaling wall	Concurrent connection count	Connection count (higher ceiling)	SFU media bandwidth/CPU
Reconnect storm risk	High (mass reconnect on deploy)	High; EventSource reconnects automatically, but clients can retry in lockstep without jitter	ICE restart, moderate
Mitigation	Jittered backoff, conn draining	Jittered reconnect	Staggered renegotiation

For all persistent-connection protocols, connection count (not message rate) is the scaling wall, and deploy-time reconnect storms are the classic outage. Jittered backoff and connection draining are mandatory.

6. Message Fan-Out & Delivery Semantics

The protocol analogue of replication: how a message reaches multiple recipients and what delivery guarantee you get.

Transport

Dimension	TCP	UDP
Delivery guarantee	Reliable, ordered, duplicate-free within a connection	Best-effort, may drop/reorder/dup
Multicast/broadcast	No, unicast only	Yes, native multicast/broadcast
Fan-out model	One connection per recipient	One packet to many (multicast)
Ordering	Strict in-order	None
Backpressure	Flow control built in	None, can overrun receiver

API Style

Dimension	gRPC	WebSocket	SSE	WebRTC
Native fan-out?	No, point-to-point streams	No, per-connection	No, per-connection	Mesh (small) or SFU
Broadcast mechanism	App + pub/sub backplane	Pub/sub backplane (Redis/Kafka)	Pub/sub backplane	SFU forwards to subscribers
Delivery guarantee	Reliable (HTTP/2 + TCP)	Reliable (TCP)	Reliable + replay via event-ID	Configurable per data channel
Ordering	In-stream ordered	Ordered (TCP)	Ordered	Optional per channel
Replay / resume	App-defined	App-defined	Built-in (Last-Event-ID)	No (media is ephemeral)
Multi-recipient cost	N streams	N sockets + backplane	N streams + backplane	SFU CPU/bandwidth

The recurring theme: none of these API-style protocols fan out natively. Real-time broadcast to many clients always means a pub/sub backplane (Redis, Kafka, NATS) behind your connection servers, or an SFU for media. That backplane, not the protocol, is where your delivery guarantees actually live.
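
The backplane idea can be sketched in-process. Here `Backplane` is a hypothetical stand-in for Redis, Kafka, or NATS pub/sub, and each `ConnectionServer` stands in for one node holding client sockets (modeled as plain lists). None of these names come from a real library, and the sketch deliberately has no persistence or delivery guarantee of its own.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Backplane:
    """In-memory stand-in for a pub/sub broker shared by all connection servers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> None:
        for callback in self._subscribers[channel]:
            callback(message)


class ConnectionServer:
    """One node: holds its own clients and relays everything through the backplane."""

    def __init__(self, backplane: Backplane, channel: str) -> None:
        self.backplane = backplane
        self.channel = channel
        self.outboxes: Dict[str, List[str]] = {}  # client id -> messages "sent" to it
        backplane.subscribe(channel, self._deliver_local)

    def connect(self, client_id: str) -> None:
        self.outboxes[client_id] = []

    def client_sends(self, message: str) -> None:
        # Never write directly to local sockets: publish, and let every node
        # (including this one) deliver to the clients it holds.
        self.backplane.publish(self.channel, message)

    def _deliver_local(self, message: str) -> None:
        for outbox in self.outboxes.values():
            outbox.append(message)
```

With two nodes on the same channel, a message sent by a client on one node reaches clients on both. Whether a message survives a node or broker restart is decided entirely by what replaces `Backplane`.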

7. Better Usage Patterns

Where PE depth shows. What teams get wrong, the better way, and why it compounds.

TCP / UDP

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Message framing	Assume one TCP send = one recv	Length-prefix or delimiter-frame every message	Coalesced/split reads are the #1 raw-socket bug
Congestion control choice	Leave default CUBIC everywhere	Use BBR on lossy/long-fat networks	CUBIC mistakes wireless loss for congestion, tanking throughput
Connection reuse	New connection per request	Pool and keep-alive	Handshake RTT dominates small requests
UDP reliability	Hand-roll acks/retransmit on UDP	Adopt QUIC instead	Reinventing TCP poorly wastes months and ships bugs
UDP datagram size	Send payloads over MTU	Keep under path MTU (~1200B safe)	Fragmentation means one lost fragment drops the whole datagram
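
The message-framing row can be illustrated with a length-prefixed decoder that tolerates both coalesced and split reads. This is a minimal sketch, assuming a 4-byte big-endian length header; `encode_frame` and `FrameDecoder` are illustrative names, and production code would also cap the accepted frame length.

```python
import struct
from typing import List

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver can find the boundary."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Accumulates arbitrary chunks from recv() and emits only complete frames."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        self._buffer += chunk
        frames: List[bytes] = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # frame is incomplete, wait for more bytes
            frames.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return frames
```

Feeding the decoder the same byte stream in one large chunk or in three-byte slices yields the same frames, which is exactly the property that "one send = one recv" code lacks.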

TLS / HTTPS

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Cert rotation	Manual renewal, calendar reminders	ACME automation + expiry alerting + CT monitoring	Expired-cert outages are among the most common self-inflicted SEVs
TLS termination	Assume HTTPS means end-to-end encrypted	Know where TLS terminates, re-encrypt internal hops if needed	Edge termination leaves plaintext on the internal network
Version/cipher policy	Trust defaults, leave old versions on	Pin TLS 1.2 min, prefer 1.3, audit ciphers	Downgrade attacks and weak ciphers are real exposure
0-RTT	Enable globally for speed	Restrict to idempotent GETs	0-RTT replay can double-execute non-idempotent requests
HSTS rollout	Set a long max-age immediately	Ramp max-age, test before preload	A premature long HSTS locks you into HTTPS, hard to undo
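
For the version-policy row, a client-side sketch using Python's standard `ssl` module shows what an explicit TLS 1.2 floor looks like. The helper name is an assumption. TLS 1.3 is negotiated when both the peer and the local OpenSSL build support it, so no extra setting is needed to prefer it.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client context with certificate verification on and a TLS 1.2 floor."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # State the floor explicitly instead of relying on library defaults.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The same floor has to be applied on the server side and at every internal hop that terminates TLS, otherwise the weakest hop defines the real policy.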

HTTP/1.1 & HTTP/2

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Version selection	Force HTTP/2 everywhere	HTTP/2 for many-stream clients, HTTP/1.1 for simple/embedded	HTTP/2 gains are marginal for low-concurrency single calls
Server push	Try to optimize with push	Use preload hints, push is deprecated	Cache-blind push wastes bandwidth, Chrome dropped it
Connection pooling	Default pool sizes untouched	Tune pool size + keep-alive to traffic shape	Pool exhaustion is a silent latency killer under load
HoL awareness	Expect HTTP/2 multiplexing to solve everything	Recognize TCP HoL persists, consider HTTP/3 on lossy paths	One lost TCP packet stalls every multiplexed stream, which bites hardest on lossy mobile links
Timeouts	One global timeout	Layered timeouts (connect, read, total) + retries	A single timeout cannot express the real failure modes

REST

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Idempotency	Non-idempotent POSTs with naive retries	Idempotency keys on mutating calls	Retries on timeouts double-charge/double-create otherwise
Pagination	Offset pagination on large sets	Cursor/keyset pagination	Offset degrades and skips/dupes rows as data shifts
Versioning	Break fields in place	Additive changes, version only on true breaks	Silent breaking changes take down consumers
Caching	Ignore HTTP cache headers	ETags, Cache-Control, conditional requests	REST's caching edge is wasted without them
Error contracts	Inconsistent ad-hoc error bodies	Standardized problem+json error shape	Clients cannot handle errors they cannot parse
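
The idempotency row can be sketched with an in-memory store. `PaymentService` and its fields are hypothetical. A real implementation would need a durable store, an atomic check-and-set, key expiry, and a check that a reused key carries the same request body.

```python
from typing import Dict, List


class PaymentService:
    """Toy mutating endpoint that replays the stored response for a repeated key."""

    def __init__(self) -> None:
        self._responses: Dict[str, dict] = {}  # idempotency key -> first response
        self.charges: List[int] = []           # side effects actually performed

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._responses:
            # Retry of a request we already executed: return the same answer
            # without charging again.
            return self._responses[idempotency_key]
        self.charges.append(amount_cents)
        response = {"charge_id": len(self.charges), "amount_cents": amount_cents}
        self._responses[idempotency_key] = response
        return response
```

A client that times out and retries with the same key gets the original `charge_id` back, so the retry is safe even though the underlying operation is a POST.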

gRPC

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Deadlines	No deadlines set on calls	Set + propagate deadlines through the chain	Without them, a slow dependency cascades to exhaustion
Load balancing	L4 round-robin on long-lived HTTP/2 conns	L7 or client-side (xDS) balancing	L4 pins connections, hammering one backend
Proto evolution	Reuse/renumber field tags	Never reuse field numbers, reserve removed ones	Wire format identifies fields by number, so a reused number makes old and new readers misinterpret data
Browser access	Try native gRPC from browser	Connect-RPC (no proxy) or gRPC-Web + Envoy	Browsers cannot speak native gRPC, full stop
Streaming lifecycle	Ignore half-close and backpressure	Handle flow control + clean stream teardown	Leaked/stalled streams exhaust server resources
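
Deadline propagation reduces to passing the remaining budget, not a fresh timeout, to every downstream hop. The sketch below is library-free and uses hypothetical names (`Deadline`, `call_with_deadline`). With gRPC's Python stubs, the remaining budget would typically be passed as the per-call `timeout` argument.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlineExceeded(Exception):
    """Raised when a hop is attempted after the overall budget has run out."""


class Deadline:
    """Absolute deadline fixed once at the edge, shared by all downstream hops."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self) -> float:
        return max(0.0, self._expires_at - self._clock())


def call_with_deadline(deadline: Deadline, hop: str, fn: Callable[[float], T]) -> T:
    """Invoke fn with the remaining budget, or fail fast if none is left."""
    budget = deadline.remaining()
    if budget <= 0.0:
        # Do not start work whose result nobody is still waiting for.
        raise DeadlineExceeded(hop)
    return fn(budget)
```

Failing fast on an exhausted budget is what stops the cascade: a slow dependency costs the caller at most the original timeout, and deeper hops stop accepting work that can no longer succeed.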

GraphQL

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
N+1 resolvers	One DB call per field	DataLoader batch + cache per request	N+1 is GraphQL's default performance disaster
Query cost	Accept arbitrary queries	Depth + complexity limits, persisted queries	Unbounded queries are a trivial DoS vector
Introspection in prod	Leave it enabled	Disable or gate it in production	It hands attackers your full schema map
Caching	Assume HTTP caching works	Persisted queries (e.g., automatic persisted queries, APQ) + field-level/CDN caching	POST-to-one-endpoint kills normal HTTP caching
Error monitoring	Alert on HTTP status	Parse the errors array (status is often 200 even on failure)	Failures are invisible to status-based monitoring
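
For the error-monitoring row, a response classifier needs to look inside the body, because GraphQL servers commonly return HTTP 200 with a populated `errors` array. This is a minimal sketch; the function name and the four outcome labels are assumptions chosen for illustration.

```python
import json


def graphql_outcome(status: int, body: str) -> str:
    """Classify a GraphQL-over-HTTP response for alerting purposes."""
    if status >= 400:
        return "transport_error"  # the request never produced a GraphQL result
    payload = json.loads(body)
    errors = payload.get("errors") or []
    if not errors:
        return "ok"
    # Errors alongside non-null data mean some fields resolved and some did not.
    return "partial" if payload.get("data") is not None else "failed"
```

Emitting these labels as a metric dimension gives status-code dashboards back the signal they lose when every response is a 200.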

WebSocket / SSE

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
Protocol choice	Reach for WebSocket by reflex	SSE if push is one-way, WebSocket only if client pushes	SSE is simpler, HTTP-native, auto-reconnecting
Reconnect storms	Immediate reconnect on drop	Jittered exponential backoff	Synchronized reconnect after deploy DDoSes you
Cross-node broadcast	Assume one server holds all clients	Pub/sub backplane (Redis/Kafka/NATS)	Clients on different nodes never see each other's messages
Heartbeats	No keepalive, rely on TCP	App-level ping/pong + idle timeout	Dead connections linger, intermediaries silently drop idle ones
SSE on HTTP/1.1	Many SSE streams on HTTP/1.1	Run SSE over HTTP/2	HTTP/1.1's 6-conn-per-origin cap throttles streams

WebRTC

Pattern	What Most Teams Do Wrong	The Better Way	Why It Matters
TURN planning	Assume pure P2P, skip TURN	Always provision TURN relay + budget egress	Symmetric NATs force relay, no-TURN means failed calls
Group scaling	Full-mesh for group calls	SFU past ~4-6 peers	Mesh is O(n^2), melts client uplinks
Data-only use	Use WebRTC just for a data channel	Use WebTransport/WebSocket instead	The media stack is huge overhead for plain data
Signaling	Expect WebRTC to handle setup	Build signaling (usually over WebSocket)	WebRTC does media, not session establishment
Network resilience	No fallback for UDP-blocked nets	TURN-over-TCP/TLS fallback path	Corporate firewalls block UDP, calls die without it
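
The group-scaling row is simple arithmetic. The sketch below counts media streams under the assumption that every peer publishes exactly one stream and receives all others, with no simulcast or selective subscription. The function and key names are illustrative.

```python
from typing import Dict


def stream_counts(peers: int) -> Dict[str, int]:
    """Compare per-client and total stream counts for full mesh versus an SFU."""
    others = max(peers - 1, 0)
    return {
        # Full mesh: each client encodes and uploads one copy per other peer.
        "mesh_uplinks_per_client": others,
        "mesh_total_streams": peers * others,
        # SFU: each client uploads once; the SFU forwards to everyone else.
        "sfu_uplinks_per_client": 1 if peers > 1 else 0,
        "sfu_downlinks_per_client": others,
    }


print(stream_counts(6))
# {'mesh_uplinks_per_client': 5, 'mesh_total_streams': 30,
#  'sfu_uplinks_per_client': 1, 'sfu_downlinks_per_client': 5}
```

Download cost per client is the same in both models. The SFU wins by collapsing each client's uplink from n-1 streams to one, which is usually the scarcer direction on consumer connections.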

8. Advanced / Next-Gen Alternatives

Successors, adjacent tech that does it better for specific cases, and patterns that obviate the original. Maturity as of mid-2026.

TCP / UDP / TLS

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
QUIC (over UDP)	Per-stream delivery, no TCP HoL, 0-RTT, connection migration	Production	Medium, via HTTP/3	Lossy mobile networks, latency-sensitive web
TLS 1.3	1-RTT handshake, forward secrecy mandatory, weak ciphers removed	Production	Low, mostly config	Always, if still on 1.2
MASQUE	Proxying/tunneling over QUIC (modern VPN primitive)	Emerging	High	Privacy proxies, modern tunneling
Post-quantum TLS (ML-KEM hybrids)	Resistance to future quantum key-recovery	Emerging	Medium, hybrid rollout underway	Long-lived secrets, harvest-now-decrypt-later threat models

HTTP/1.1 / HTTP/2

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
HTTP/3 (over QUIC)	Eliminates transport HoL, faster handshake, connection migration	Production	Medium, needs QUIC-capable stack + UDP egress	Mobile-heavy, lossy networks, tail-latency-sensitive
gRPC (on HTTP/2)	Typed contracts + streaming on top of HTTP/2	Production	Medium	Internal service-to-service
HTTP/3 0-RTT	Near-instant reconnection for return visitors	Production	Low once on HTTP/3	Repeat-visit latency optimization

REST / gRPC / GraphQL

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
Connect-RPC	gRPC semantics that work natively in browsers, no Envoy proxy, debuggable	Production	Low if already on protobuf	Browser clients needing gRPC-style contracts
tRPC	End-to-end TypeScript type safety, no codegen	Production	Low (TS-only stacks)	Full-stack TypeScript monorepos
GraphQL Federation	Org-scale schema composition across team-owned subgraphs	Production	High (org coordination)	Many teams, one unified graph
gRPC-Web	Browser access to gRPC backends (unary + server-stream)	Production	Medium (proxy required)	Existing gRPC backend, browser must call it

WebSocket / SSE / WebRTC

Successor / Alternative	What It Improves	Maturity	Migration Cost	When To Consider
WebTransport (over HTTP/3)	Multiplexed streams + unreliable datagrams on one QUIC connection, no TCP HoL	Emerging (~75% browser, Safari 26.4 closed the gap)	High (HTTP/3 server, UDP egress, fallback)	Need mixed reliable/unreliable channels, lossy networks; keep WebSocket fallback
WebSocket over HTTP/3 (RFC 9220)	WebSocket semantics on QUIC, dodging TCP HoL	Early (no major browser/server shipped as of 2026)	Unknown, not yet practical	Future, when ecosystem ships it
WebRTC SFU architectures	Scales group media past mesh limits	Production	Medium (run/buy an SFU)	Group calls beyond a handful of peers
WebTransport datagrams (vs WebRTC data)	Simpler low-latency data path without WebRTC's media stack	Emerging	Medium	Low-latency data-only needs (games, telemetry)
