L7 Proxies / Load Balancers / API Gateways

Principal Engineer trade-off analysis across Nginx, HAProxy, Envoy, Pingora, and Kong.

Category Sweep L7 Data Plane API Gateway

As of 2026-06-02

PE Verdict

The proxy market split into three lanes by 2026. HAProxy stays the throughput king for stable backend topologies (around 42K RPS in a Kubernetes ingress benchmark, with the lowest p99). Envoy won the service mesh and AI gateway race on the strength of xDS, dynamic reconfiguration without restart, and a rich filter chain (WASM, ext_proc, Lua). Pingora is the post-Nginx future for organizations that own their proxy code: Cloudflare reports roughly 70% less CPU and 67% less memory than its previous Nginx-based service at trillion-request-per-day scale.

Nginx remains dominant by deployment count, but governance pressure is real. The community-maintained ingress-nginx Kubernetes project is being retired in March 2026, and developer-led forks (Freenginx, Angie) emerged after F5 acquisition friction. Treat Nginx as the safe incumbent for greenfield small/medium workloads but assume migration risk on the 3-5 year horizon. Kong is the right choice when you want a pre-built plugin marketplace (auth, rate limit, transformations) and accept the OpenResty/Lua event-loop ceiling under CPU-bound plugins.

The non-obvious call: if you are doing AI inference gateways in 2026, start with Envoy AI Gateway, not Kong. Token-aware rate limiting and per-model priority routing land natively on Envoy's ext_proc filter, and the ecosystem is moving there.
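
A minimal sketch of what "token-aware" rate limiting means, assuming a token bucket whose unit is LLM tokens rather than requests. The class names, the model name, and the limits are hypothetical. This is not the Envoy AI Gateway API, only the accounting idea a gateway-side processor could apply per tenant and model.

```python
class TokenBudget:
    """Token bucket counted in LLM tokens, not requests (illustrative)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.level = capacity      # start with a full budget
        self.updated_at = 0.0      # timestamp of the last refill, in seconds

    def try_consume(self, tokens: int, now: float) -> bool:
        # Refill for the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.level = min(self.capacity, self.level + elapsed * self.refill_per_sec)
        self.updated_at = now
        if tokens > self.level:
            return False           # over budget: the caller would answer 429
        self.level -= tokens
        return True


class TokenAwareLimiter:
    """Keeps one budget per (tenant, model) pair."""

    def __init__(self, limits: dict[str, tuple[float, float]]) -> None:
        self._limits = limits      # model -> (capacity, refill_per_sec), assumed values
        self._budgets: dict[tuple[str, str], TokenBudget] = {}

    def allow(self, tenant: str, model: str, tokens: int, now: float) -> bool:
        key = (tenant, model)
        if key not in self._budgets:
            self._budgets[key] = TokenBudget(*self._limits[model])
        return self._budgets[key].try_consume(tokens, now)


limiter = TokenAwareLimiter({"large-model": (1000.0, 100.0)})
print(limiter.allow("tenant-a", "large-model", tokens=800, now=0.0))  # True
print(limiter.allow("tenant-a", "large-model", tokens=300, now=0.0))  # False
print(limiter.allow("tenant-a", "large-model", tokens=300, now=1.0))  # True
```

A request-count limiter would have admitted all three calls; charging by tokens is what lets a single large prompt exhaust the budget.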

Best default choices

1. Trade-Offs (per technology)

Each row is a deliberate "X over Y" choice. Gains are specific. Costs are operational. PE Nuance is what most engineers miss until they are on call.

Nginx

Default for static serving combined with reverse proxying on teams that know it well; governance risk from F5 is real — anyone on ingress-nginx in Kubernetes needs a migration plan before March 2026.

| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
| --- | --- | --- | --- | --- |
| Static config + reload model over dynamic API | Simple mental model, predictable behavior, file-based GitOps works trivially | No service-discovery-driven endpoint updates without third-party tooling (consul-template, nginx-plus API, nginx-mesh) | Microservices fleet with 200+ services churning endpoints every minute. Reload storms cause connection drops and 502s during deploys. | Open-source Nginx reload spawns new workers and drains old ones, but during the drain window memory doubles. At 5K+ worker config, this is enough to OOM on small instances. |
| C-based event loop over thread pool model | Tiny memory per connection (around 2-4 KB), 50K+ idle connections per worker | Single CPU-bound module call blocks the whole worker, kneecapping latency for every other connection on that worker | A Lua script doing JSON parsing on every request. p99 latency triples under load even though average looks fine. | Memory CVEs in C-based proxies are a structural risk, not a tooling problem. The post-2020 push to Rust (Pingora, linkerd2-proxy) is precisely a response to this. |
| Lua plugin model via OpenResty over native modules | Customization without C compilation, large ecosystem (Kong sits on this) | LuaJIT memory ceiling around 2 GB per worker, CPU-bound Lua starves the event loop | Heavy auth plugin under traffic spike. Workers run Lua GC, p99 spikes 10x for the duration. | OpenResty trace flags (lj-arch, lj-state) are mandatory for any serious Lua deployment. Most teams discover this only after the first production incident. |
| Isolated worker processes over shared-state multithreading | Deterministic CPU pinning, NUMA-friendly, easy capacity planning | Connection reuse across workers is impossible without shared memory tricks | TLS handshake cost compounds: 4 workers with separate upstream connection pools can mean up to 4x the handshake load on origin. | This is the specific reason Cloudflare built Pingora. At 1T requests/day, per-worker isolation became dominant overhead. |
| F5 corporate stewardship over community-led project | Commercial support, FIPS builds, dedicated security team, integration with F5 BIG-IP | Strategic direction outside community control (Freenginx fork, Angie fork) | Critical CVE released against a feature you depend on, and disclosure timing was a corporate decision rather than community-aligned. | For 2026+ greenfield projects, this is a real signal. The retirement of ingress-nginx in March 2026 is the canary. Anyone deploying Nginx in k8s should be planning a migration path. |
| Battle-tested over feature-current | 15+ years in production, every edge case documented somewhere on Stack Overflow | HTTP/3 support arrived late and is still flagged experimental in many builds; gRPC support is functional but unloved | Greenfield team wants HTTP/3 + gRPC streaming + WebSocket multiplex in one proxy. Nginx is the wrong starting point. | "Battle-tested" cuts both ways at PE level. The code is stable but conservative. Innovation has migrated to Envoy and Pingora. |
| Single binary over framework | Configurable as a web server, reverse proxy, mail proxy, TCP load balancer, all from one binary | Jack-of-all-trades architecture leaves performance gaps versus specialized proxies | You need pure L4 load balancing at line rate. HAProxy or kernel-bypass options beat Nginx by 30-40%. | The breadth was a 2010-era win when teams had one tool budget. In 2026, separation of concerns (Envoy for mesh, Nginx for web, HAProxy for L4) is the more defensible architecture. |
| Configuration DSL over declarative API | Powerful pattern matching with map/geo/regex, conditional logic via if | Conditional if blocks are notoriously broken in location contexts ("if is evil" is official guidance) | New team member writes a seemingly correct if-based redirect rule, and traffic silently disappears for certain header combinations. | The Nginx if behavior is one of the most-cited examples of a config DSL leaking implementation details to operators. Compare to Envoy's RouteConfiguration which is explicit but verbose. |
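
The event-loop row is easier to see with numbers. A deterministic sketch, assuming a single worker that serves requests strictly in arrival order. The 0.1 ms and 20 ms service times and the 1 ms arrival gap are made-up illustration values, not Nginx measurements.

```python
def simulate_worker(service_times_ms, arrival_gap_ms):
    """Latency per request for one single-threaded worker serving in FIFO order."""
    latencies = []
    free_at = 0.0  # time at which the worker finishes its current request
    for i, service in enumerate(service_times_ms):
        arrival = i * arrival_gap_ms
        start = max(arrival, free_at)   # wait if the worker is still busy
        free_at = start + service
        latencies.append(free_at - arrival)
    return latencies


def percentile(values, pct):
    """Nearest-rank style percentile, good enough for a sketch."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(len(ordered) * pct / 100))
    return ordered[index]


# Hypothetical workload: 1,000 requests, one arriving every 1 ms.
fast_only = [0.1] * 1000
# Same workload, but every 100th request runs a 20 ms CPU-bound handler.
with_slow = [20.0 if i % 100 == 0 else 0.1 for i in range(1000)]

for name, workload in (("fast only", fast_only), ("1% slow", with_slow)):
    lat = simulate_worker(workload, arrival_gap_ms=1.0)
    print(name, "p50 =", round(percentile(lat, 50), 2), "p99 =", round(percentile(lat, 99), 2))
```

In this model the median stays at the fast service time while p99 jumps to the slow handler's duration, because every request that arrives while the worker is blocked queues behind it.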

HAProxy

Default for pure load balancing workloads where throughput ceiling and sub-ms tail latency matter more than dynamic routing, plugin extensibility, or service mesh integration.

| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
| --- | --- | --- | --- | --- |
| Specialized load balancer over web server | Best-in-class throughput, around 42K RPS per Kubernetes ingress benchmark, around 50% CPU vs 73% for Envoy at the same load | No static file serving, no caching layer (must front it with Varnish or Nginx if needed) | You wanted "one proxy" that also serves static assets. HAProxy is not that. You end up running HAProxy + Nginx + Varnish. | The narrow scope is the reason for the throughput. At PE scale, this is a feature: do one thing, do it at line rate. |
| Configuration file over dynamic API (with Runtime API as escape hatch) | Predictable, GitOps-friendly, no control plane needed for stable backends | Dynamic endpoint updates require Runtime API socket commands or DPAPI (HAProxy 2.4+) | Service mesh use case with constantly-changing endpoint lists. Possible but feels grafted on, not native. | The Data Plane API gets HAProxy closer to Envoy parity, but the design center is still static config. If you need xDS-grade dynamism, you are fighting the tool. |
| Multi-threaded over single-threaded with workers | True multithreading (since HAProxy 1.8), better core utilization, shared session table | Thread synchronization overhead under heavy lock contention on shared maps | Sticky session table with millions of entries and high churn. Lock contention becomes visible. | HAProxy's threading is more sophisticated than Nginx's worker isolation but less aggressive than Envoy's per-thread isolation. The middle path is good for L4 but compromises L7 customization. |
| Mature TCP/HTTP focus over service mesh feature set | Rock-solid L4 (TCP proxy), unbeatable for non-HTTP protocols (MySQL, PostgreSQL, Redis) | No native gRPC streaming optimization, no service mesh sidecar story, limited service discovery integration | Internal microservices comm needs proxying. HAProxy works but Envoy is the architecturally aligned choice. | HAProxy at the edge plus Envoy in the mesh is a common, defensible PE-level topology. Don't try to make HAProxy do mesh work. |
| Built-in observability (stats page, Prometheus exporter) over distributed tracing | Excellent low-cost metrics out of the box, robust admin socket for live introspection | Distributed tracing integration is grafted on, weaker than Envoy's native OpenTelemetry | You need per-request span context propagation across a 12-hop service path. Envoy gives this natively; HAProxy needs work. | HAProxy 3.2 (Oct 2025) significantly improved tracing, but the design center is still aggregate metrics, not request-level traces. |
| ACL-based routing over per-route filter chains | Highly composable, fast to evaluate, easy to read once you internalize the syntax | Complex per-route policy chains require ACL composition that gets unreadable past 5-6 conditions | Auth policy varies by tenant, region, and feature flag. The ACL config grows to thousands of lines and ownership becomes opaque. | For routing logic that complex, you have outgrown HAProxy and should be on a gateway (Kong, Envoy AI Gateway) where policy is a first-class data model. |
| Open source with commercial HAProxy Enterprise option | Vendor model is rational: enterprise features (WAF, advanced rate limiting, admin UI) cost money, core is fully free | Some operationally-valuable features (multi-cluster sync, dashboard) are paid tier only | You build a free-tier deployment, hit operational friction, and the upgrade path is "buy HAProxy Enterprise" with non-trivial pricing. | Compared to Nginx (F5 acquisition friction) and Envoy (no commercial entity), HAProxy's commercial model is the most predictable. That stability is itself a PE-relevant signal. |
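
For the Runtime API escape hatch, a small sketch of what "socket commands" looks like from a deploy script. It assumes the stats socket is enabled at admin level; the socket path, backend name, and server name are hypothetical. The two command forms used (`set server ... addr ... port ...` and `set server ... state ...`) are Runtime API commands, but check the management guide for your HAProxy version before relying on them.

```python
import socket

VALID_STATES = {"ready", "drain", "maint"}


def set_server_addr_cmd(backend: str, server: str, ip: str, port: int) -> str:
    """Build the command that repoints an existing server slot at a new endpoint."""
    return f"set server {backend}/{server} addr {ip} port {port}\n"


def set_server_state_cmd(backend: str, server: str, state: str) -> str:
    """Build the command that changes a server's administrative state."""
    if state not in VALID_STATES:
        raise ValueError(f"unsupported state: {state!r}")
    return f"set server {backend}/{server} state {state}\n"


def send_runtime_command(socket_path: str, command: str) -> str:
    """Send one command over the admin Unix socket and return HAProxy's reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # HAProxy closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


# Building the commands needs no running HAProxy:
print(set_server_state_cmd("be_app", "srv1", "drain"), end="")
print(set_server_addr_cmd("be_app", "srv1", "10.0.0.5", 8080), end="")
# Sending them would look like this (path is an assumption):
# send_runtime_command("/var/run/haproxy.sock", set_server_state_cmd("be_app", "srv1", "ready"))
```

Changes made this way live only in the running process, which is part of why the approach feels grafted on next to a file-based GitOps flow.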

Envoy

Default for service meshes and AI inference gateways where xDS dynamic config, filter chain extensibility, and native OpenTelemetry justify the operational cost of running a control plane.

| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
| --- | --- | --- | --- | --- |
| xDS dynamic config over static files | Zero-restart endpoint, route, listener, and secret updates from a control plane. The defining feature. | You need a control plane (Istio, Consul, Envoy Gateway, custom). Significant operational surface. | Greenfield team picks Envoy without realizing they also need to operate a control plane. Most of the cost is there, not in Envoy itself. | Incremental xDS (Delta) is now default in Istio 1.22+. If you are on classic SotW (State of the World), config push storms on large meshes are still a real failure mode. |
| C++ over Rust or C | Mature C++17, well-trodden ground for high-performance networking, large contributor pool | Memory-safety class of CVEs still possible, harder to onboard contributors versus Go or Rust | Custom filter authored in C++ has a use-after-free bug, hits production after fuzz testing missed the edge case. | The shift from C++ to Rust at the proxy layer is a multi-year trend (Pingora, linkerd2-proxy). Envoy is unlikely to follow but the gap will matter for security-conscious orgs. |
| Filter chain extensibility over single-purpose proxy | WASM, Lua, ext_proc (gRPC), ext_authz, native C++ filters: any logic at any point | WASM filters have non-trivial perf cost (often 20-40% latency add), ext_proc adds a network hop | Auth via ext_proc adds 5-15ms p99 per request. At 50 RPS per user, that's a noticeable user-facing slowdown. | ext_proc is the right choice for stateful auth where logic changes weekly. WASM is the right choice for stable, perf-sensitive logic. Native C++ filters are the right choice when both apply and you can staff the maintenance. |
| Per-thread isolation over shared worker model | No locks on hot paths, predictable tail latency, easy to reason about thread pinning | Connection pool fragmentation across threads, more memory overhead than Nginx | High keepalive-heavy workload. Each thread maintains its own pool, total memory usage scales with thread count. | Connection pool sharing is one of the specific pain points Pingora was designed around. Envoy has improved (connection_pool_per_downstream_connection options) but it's not as elegant as Pingora's lock-free hot pool. |
| Native observability over add-on metrics | Built-in OpenTelemetry, statsd, Prometheus, admin endpoint with config_dump for diagnostics | High cardinality stats can balloon memory; admin endpoint is a security risk if exposed | Production Envoy with thousands of clusters and per-route stats hits 4-6 GB memory just for the stats subsystem. | Use match-based stats inclusion lists in production, never the default. Most teams find out about stats cardinality cost only after the OOM. |
| Service mesh DNA over standalone proxy | Cleanest fit for mTLS-everywhere, identity-based routing, traffic shifting, fault injection | Standalone use feels heavyweight; you carry mesh complexity even if you only need an LB | Single-team need for a basic reverse proxy ends up running an Envoy + control plane stack worth a quarter of engineering time. | For single-LB use cases, Envoy is overkill. For multi-team microservices with identity, traffic policies, and observability requirements, nothing else competes. |
| HTTP/2, HTTP/3, gRPC first-class over HTTP/1.1 first | Modern protocols natively, gRPC streaming, HTTP/3 QUIC stable | Some HTTP/1.1 edge cases (chunked encoding, trailer handling) get less attention | Legacy HTTP/1.1 client with non-RFC-compliant headers (common in mobile SDKs and old CDN paths) gets rejected or misparsed. | Cloudflare specifically called this out as a reason for Pingora: "bizarre and non-RFC compliant HTTP traffic." If your traffic includes a lot of weird clients, Envoy is stricter than Nginx, which is stricter than Pingora. |
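
The SotW versus Delta distinction in the xDS row comes down to what a control plane sends after a small change. A simplified model, assuming cluster-style resources where a state-of-the-world response carries the full set. Resource names and versions are hypothetical, and real xDS also involves subscriptions, nonces, and ACK/NACK handling that this sketch leaves out.

```python
def sotw_response(current):
    """State-of-the-world: the response carries every resource of the type."""
    return dict(current)


def delta_response(previous, current):
    """Delta: only resources whose version changed, plus names that disappeared."""
    changed = {name: ver for name, ver in current.items() if previous.get(name) != ver}
    removed = sorted(set(previous) - set(current))
    return changed, removed


# Hypothetical mesh with 1,000 clusters, then one update, one removal, one addition.
previous = {f"cluster-{i}": "v1" for i in range(1000)}
current = dict(previous)
current["cluster-7"] = "v2"
del current["cluster-9"]
current["cluster-new"] = "v1"

print(len(sotw_response(current)))           # resources in a full push
changed, removed = delta_response(previous, current)
print(changed, removed)                      # resources in an incremental push
```

Multiply the full-push size by the number of connected proxies and by the change rate, and the push-storm failure mode on large meshes follows directly.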

Pingora

Default for teams building a custom proxy product in Rust, where the lock-free cross-thread connection pool and memory safety by construction justify the absence of declarative config.

| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
| --- | --- | --- | --- | --- |
| Library/framework over binary product | You compose the proxy you want from Rust callbacks; total control over request lifecycle | No drop-in nginx.conf equivalent; every deployment is a custom Rust application | Team wants a quick reverse proxy with a YAML config. Pingora is the wrong choice for that use case. | Pingora competes with libraries (Tower, hyper) more than with binary proxies. The right comparison for "deploy a proxy" is Nginx/Envoy/HAProxy. The right comparison for "build a custom proxy product" is Pingora. |
| Rust memory safety over C/C++ performance ceiling | Eliminates buffer overflow, use-after-free, dangling pointer CVE classes by construction | Steeper learning curve, smaller hiring pool versus C++ or Go | Team needs to extend the proxy and you discover Rust async (Tokio runtime, lifetime juggling) is a real cliff for new engineers. | The hiring constraint is real. If you don't have at least 2 senior Rust engineers, picking Pingora means paying ramp cost in addition to development cost. |
| Async multithreaded over per-process workers | Cloudflare reports about 70% less CPU and 67% less memory than its previous Nginx-based service at trillion-request-per-day scale | Tokio async-Rust has its own learning curve and debugging surface (stack traces are async-aware but uglier) | Race condition in shared state shows up only at high concurrency. Reproducing in test is hard. | The async-Rust ecosystem matured significantly 2024-2026 (tokio-console, tracing). It's no longer the "experimental" tier it was in 2022. |
| Lock-free connection pool shared across threads over per-worker pools | Massively higher connection reuse, fewer TLS handshakes to origin, lower tail latency on warm paths | Implementation complexity is real; lock-free data structures are a source of subtle correctness bugs | Custom transport layer extension introduces a race that triggers only under specific connection-recycling patterns. | This is the specific design choice that makes Pingora superior to Nginx at Cloudflare scale. Lower-scale orgs may not see the same gains; the architecture pays off most where origin TLS cost is dominant. |
| Programmable callbacks over declarative config | Total flexibility: every phase (request, upstream selection, response, logging) is a Rust function you own | No GUI, no YAML, no "ops team configures it" model. Every change is a code change and a deploy. | Non-engineer ops team needs to add a redirect rule. With Nginx they'd edit a file; with Pingora they file a ticket. | For the "platform team builds proxy, app teams consume" model this is a feature, not a bug. For the "self-service via config" model, it's wrong. |
| Cloudflare-led roadmap over CNCF or community governance | Battle-tested at Cloudflare's 40M+ HTTP requests per second; bugs found in real traffic, fixed fast | Pre-1.0 API stability not guaranteed; roadmap optimizes for Cloudflare's needs first | You bet on a specific Pingora API, then Cloudflare changes the trait signature in v0.8 and your downstream code breaks. | ISRG (Internet Security Research Group) is funding adoption and the River project to provide a higher-level abstraction. The ecosystem is maturing but it is not yet ready for "anyone can adopt" status. |
| HTTP/1, HTTP/2, gRPC, WebSocket support over universal protocol | Covers the proxy protocols 95% of services need | HTTP/3 / QUIC was on the 2025 roadmap, status varies by build | Team needs production HTTP/3 today; check current Pingora support before committing. | If HTTP/3 is mandatory, Envoy and Nginx Plus are ahead. If you can wait 6-12 months, Pingora is catching up fast. |
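
Why a shared pool cuts origin handshakes can be shown with a toy count. This sketch assumes requests are handled one at a time and connections never expire, which is far simpler than any real proxy; the worker and origin counts are arbitrary. It only contrasts the pool key: per-worker pools key on (worker, origin), a shared pool keys on origin alone.

```python
def count_handshakes(requests, shared_pool):
    """Count new origin connections needed for a sequence of (worker, origin) requests."""
    warm = set()        # pool keys that currently hold an idle, reusable connection
    handshakes = 0
    for worker, origin in requests:
        key = origin if shared_pool else (worker, origin)
        if key not in warm:
            handshakes += 1   # nothing to reuse: open a connection (TCP + TLS)
            warm.add(key)     # the connection returns to the pool after the request
    return handshakes


# Hypothetical traffic: 4 workers, 50 origins, every worker eventually talks to every origin.
requests = [(i % 4, (i // 4) % 50) for i in range(10_000)]

print("per-worker pools:", count_handshakes(requests, shared_pool=False))  # 200
print("shared pool:     ", count_handshakes(requests, shared_pool=True))   # 50
```

With idle timeouts and real concurrency the gap changes shape, but the direction holds: the per-worker design pays the handshake cost once per worker per origin, the shared design once per origin.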

Kong

Default for REST API gateways where auth (OIDC, JWT), rate limiting, request transformations, and AI routing are needed from a plugin marketplace rather than built from scratch.

| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
| --- | --- | --- | --- | --- |
| API gateway abstraction over raw proxy | First-class concepts (Service, Route, Consumer, Plugin) match how API teams think about gateways | Heavier than Nginx alone; you carry the gateway abstraction even when you only need a reverse proxy | Single-service deploy uses Kong because "it's the standard," carries database + admin API + control plane overhead unnecessarily. | If the answer to "what does this proxy do?" is "route traffic," skip Kong. If it's "auth, rate-limit, transform, observe, then route," Kong starts to pay off. |
| OpenResty / Lua plugin model over native or WASM | Plugins are quick to write, hot-reload friendly, accessible to non-systems engineers | Lua plugins share Nginx's event-loop ceiling; CPU-bound plugins (regex, JWT verify, JSON transform) starve other requests | Heavy auth plugin under traffic spike, p99 latency spikes for the entire gateway, not just authed requests. | Plugin ordering matters; light filtering plugins (ACL, IP restriction) should run before heavy plugins. This is operator knowledge that doesn't show up in docs prominently enough. |
| DB-backed config (Postgres) over DB-less mode | Multi-node config sharing, admin API works across cluster, declarative GitOps via decK | Postgres becomes a critical dependency; outage of Postgres can degrade Kong control plane (data plane usually stays up) | Postgres connection pool exhausted under config-change storm during a deploy, admin API stops responding. | DB-less mode (config from YAML/JSON, no Postgres) is the right default for new deployments. DB-backed mode is legacy and adds operational surface most teams don't need. |
| Open-source core with Kong Enterprise paid features | Free tier is functional for most use cases (basic auth, rate limit, transformations) | The plugins most teams reach for at scale (OIDC, advanced rate limiting, plugin ordering, FIPS) are enterprise-only | Team builds on free tier, hits an OIDC requirement, has to either reimplement or upgrade. Upgrade cost is significant. | Kong's commercial model is more aggressive than HAProxy's. Plan for the enterprise upgrade decision as part of adoption, not as a surprise later. |
| L7 HTTP/REST/WebSocket focus over universal L4 + L7 | Optimized for the dominant use case (REST API gateway) | Native gRPC, TCP, UDP support is limited; for pure L4 work, HAProxy or Envoy is better | Team needs gRPC streaming with bidirectional flow and full HTTP/2 features. Kong gets you 70% there, the rest is a struggle. | For full gRPC service mesh, Envoy is the right tool. Kong is for REST APIs and is unapologetic about it. |
| Plugin marketplace over build-everything-yourself | 300+ plugins covering common needs (JWT, OAuth2, OIDC, Datadog, transformations, Kafka, AI plugins) | Plugins vary in quality; most-needed plugins (advanced rate limit, OIDC) are enterprise; some community plugins are unmaintained | Team picks a community plugin, it has a memory leak, no maintainer to fix it. | Evaluate plugins like dependencies: maintained, tested at your scale, security-reviewed. The "300+ plugins" headline number hides quality variance. |
| AI Gateway features added 2024-2026 over pure REST gateway | Multi-LLM routing, semantic caching, prompt guards, AI-specific rate limiting now in Kong | Envoy AI Gateway is also evolving fast, with deeper integration into the Envoy ext_proc model | Bet on Kong AI plugins, then Envoy AI Gateway ships features Kong doesn't have for 12 months. | In 2026, both products are racing for the AI gateway segment. Pick based on existing infrastructure (Envoy mesh → Envoy AI Gateway; Kong-centric API platform → Kong AI plugins). |
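
The plugin-ordering nuance can be sketched as a short-circuiting chain. This is not Kong's plugin API; the `Plugin` type, the per-call costs, and the blocklist are hypothetical, chosen only to show why a cheap rejecting filter belongs ahead of an expensive one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Plugin:
    name: str
    cost_ms: float                      # assumed CPU cost per invocation
    allows: Callable[[dict], bool]      # returns False to reject the request


def run_chain(plugins, request):
    """Run plugins in order, stop at the first rejection, return (allowed, cost_ms)."""
    spent = 0.0
    for plugin in plugins:
        spent += plugin.cost_ms
        if not plugin.allows(request):
            return False, spent
    return True, spent


BLOCKED_IPS = {"203.0.113.9"}  # documentation-range address, hypothetical blocklist

ip_restriction = Plugin("ip-restriction", 0.25, lambda r: r["ip"] not in BLOCKED_IPS)
jwt_verify = Plugin("jwt-verify", 2.0, lambda r: r.get("token") == "valid")

blocked = {"ip": "203.0.113.9", "token": "valid"}
print(run_chain([ip_restriction, jwt_verify], blocked))  # (False, 0.25)
print(run_chain([jwt_verify, ip_restriction], blocked))  # (False, 2.25)
```

Both orderings reject the request; only the cheap-first ordering avoids spending the heavy plugin's CPU on traffic that was never going to be admitted, which matters most during the traffic spike described in the table.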

2. Use Cases (per technology)

Driving Property is the single attribute that ruled out alternatives. Why Not Alternative is the specific reason another reasonable choice was rejected.

Nginx

Default for static serving combined with reverse proxying on teams that know it well; governance risk from F5 is real — anyone on ingress-nginx in Kubernetes needs a migration plan before March 2026.

| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
| --- | --- | --- | --- | --- |
| Static site + reverse proxy on a single VM | Mid-size SaaS shipping a marketing site plus API | One binary serves static, TLS-terminates, and proxies to backend | 5-20K RPS per node, single instance | HAProxy doesn't serve static; Envoy needs a control plane; Kong is overkill |
| Ingress controller for legacy Kubernetes deployments | Hundreds of teams on the soon-retiring ingress-nginx project (community version retires March 2026) | Familiar Ingress resource model, mature ecosystem | 10K-50K services across thousands of clusters globally | HAProxy ingress is faster but less ubiquitous; F5 NGINX Ingress is the migration target now that community version is retiring |
| TLS termination + caching layer in front of origin | Mid-size media site with CDN miss path | Built-in HTTP cache, microcaching for hot paths, low memory | 30K RPS sustained, 200 GB local cache | Varnish has a stronger cache model but a separate config language; Nginx wins on operational simplicity |
| Lua-extended platform layer | Internal platform team extending Nginx via OpenResty for custom routing | Lua scripting at every phase, runtime modification of behavior | 50K RPS with custom logic on every request | Envoy WASM has higher overhead per call; Kong adds gateway abstractions you don't need |
| Multi-tenant SaaS L7 routing by Host header | SaaS with thousands of customer subdomains routed to backends | map directive scales to millions of entries with predictable lookup cost | 20K+ tenant domains, single-digit ms p99 routing decisions | HAProxy's ACL-based routing is comparable but less ergonomic; Envoy needs runtime config sync |

HAProxy

Default for pure load balancing workloads where throughput ceiling and sub-ms tail latency matter more than dynamic routing, plugin extensibility, or service mesh integration.

| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
| --- | --- | --- | --- | --- |
| L4 database load balancer (MySQL, PostgreSQL, Redis) | Fintech routing reads across replicas, writes to primary | TCP proxy at line rate with health-check awareness | 50K connections, sub-ms proxy overhead | Nginx stream module works but lacks the same health-check granularity; Envoy's L4 story is functional but secondary |
| High-throughput Kubernetes ingress | Workloads where ingress is a bottleneck | 42K RPS per benchmark, lowest p99 across competitors | 10K+ pods, hundreds of services per cluster | Nginx ingress retiring March 2026; Envoy uses more CPU at the same throughput |
| Active-passive load balancer for legacy enterprise apps | Banks, telcos with stateful TCP services in front of WebSphere or similar | Mature health checks, session persistence, predictable failover | Hundreds of backends per LB, multi-decade uptime expectations | F5 hardware LB costs more; Nginx requires Plus tier for advanced health checks |
| Edge LB for trading platforms | Low-latency financial venues (for example, Deutsche Bank-style deployments) | Lowest tail latency under heavy concurrency, predictable jitter | Sub-ms p99 at 50K+ concurrent connections | Envoy's filter chain adds latency; Nginx tail latency degrades faster under stress |
| SSL/TLS terminator front end with WAF integration | Compliance-heavy industry (PCI, HIPAA) needing rigorous TLS termination | FIPS-validated builds (HAProxy Enterprise), tight observability of TLS handshake metrics | 5-20K TLS handshakes/sec with full mTLS | Nginx Plus is comparable but more expensive; OpenResty-based Kong adds Lua overhead on the TLS hot path |

Envoy

Default for service meshes and AI inference gateways where xDS dynamic config, filter chain extensibility, and native OpenTelemetry justify the operational cost of running a control plane.

| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
| --- | --- | --- | --- | --- |
| Service mesh data plane | Istio, Consul Service Mesh, AWS App Mesh users | xDS dynamic config, mTLS, identity-based routing, traffic shifting | 10K+ sidecar instances per mesh, sub-second config propagation | Linkerd2-proxy is leaner but less feature-rich; Nginx mesh products are second-tier |
| API gateway with dynamic routing | Cloud-native companies (Lyft was the origin), platform teams building self-service APIs | Filter chain extensibility (WASM, ext_proc), HTTP/3, gRPC streaming | 100K RPS, hundreds of upstream clusters, route changes per minute | Kong is more abstraction-heavy but easier to onboard; HAProxy lacks dynamic config story |
| AI inference gateway | Teams routing across LLM providers (OpenAI, Anthropic, internal models) | Per-model rate limiting, token-aware quotas, ExtProc for prompt processing, semantic routing | Tens of thousands of LLM requests/sec, multi-second response streaming | Kong AI plugins are catching up; native ext_proc integration in Envoy AI Gateway is currently ahead |
| Multi-region edge gateway | Global SaaS routing to nearest region with health-aware failover | Locality-aware load balancing, outlier detection, circuit breakers built-in | 50+ regions, p99 routing decision under 1ms | HAProxy lacks built-in locality awareness; Nginx requires plugins/scripting |
| Observability-heavy enterprise gateway | SREs needing per-request tracing with OpenTelemetry to Tempo or Jaeger | Native OpenTelemetry exporters, distributed tracing built into the request lifecycle | 10K services with full trace propagation, 1B+ spans/day | HAProxy tracing is newer and less feature-complete; Nginx tracing requires plugins |
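
Outlier detection, named in the multi-region row, is passive health checking driven by live traffic. A minimal sketch, loosely modeled on the consecutive-5xx style of ejection with a backoff that grows per ejection. The class, its parameters, and the host address are illustrative and do not reproduce Envoy's configuration or its full algorithm.

```python
class OutlierDetector:
    """Passive health checking: eject a host after N consecutive 5xx responses."""

    def __init__(self, consecutive_5xx: int = 5, base_ejection_s: float = 30.0) -> None:
        self.consecutive_5xx = consecutive_5xx
        self.base_ejection_s = base_ejection_s
        self._streak: dict[str, int] = {}           # current run of 5xx per host
        self._ejections: dict[str, int] = {}        # times each host has been ejected
        self._ejected_until: dict[str, float] = {}  # host -> time it may return

    def record(self, host: str, status: int, now: float) -> None:
        if status < 500:
            self._streak[host] = 0                  # any non-5xx resets the streak
            return
        self._streak[host] = self._streak.get(host, 0) + 1
        if self._streak[host] >= self.consecutive_5xx:
            count = self._ejections.get(host, 0) + 1
            self._ejections[host] = count
            # Hosts that keep failing stay out longer each time.
            self._ejected_until[host] = now + self.base_ejection_s * count
            self._streak[host] = 0

    def is_routable(self, host: str, now: float) -> bool:
        return now >= self._ejected_until.get(host, 0.0)


detector = OutlierDetector(consecutive_5xx=3, base_ejection_s=30.0)
for status in (503, 503, 200, 503, 503, 503):
    detector.record("10.0.0.7:8080", status, now=1.0)
print(detector.is_routable("10.0.0.7:8080", now=10.0))  # False
print(detector.is_routable("10.0.0.7:8080", now=31.0))  # True
```

A production implementation would also cap how much of a cluster can be ejected at once, so that a bad deploy does not remove every host from rotation.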

Pingora

Default for teams building a custom proxy product in Rust, where the lock-free cross-thread connection pool and memory safety by construction justify the absence of declarative config.

| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
| --- | --- | --- | --- | --- |
| Hyperscale edge proxy | Cloudflare's own edge network | Lock-free connection pool across threads, about 70% less CPU and 67% less memory than the previous Nginx-based service | 40M+ HTTP req/sec globally, 1T+ requests/day | Nginx hit per-worker isolation ceiling; Envoy is more memory-heavy at the same throughput |
| Custom proxy product | CDN companies, security vendors building proxy-as-a-product | Rust library/framework model gives full control over request lifecycle | Varies by product; typically 10K-1M+ RPS per node | Building on Nginx or Envoy means fighting their config models; Pingora is designed to be embedded |
| Memory-safety-critical edge | Security vendors, government infrastructure, compliance-heavy industries | Rust eliminates entire classes of memory-safety CVEs by construction | Comparable to Nginx workloads with stronger CVE posture | C-based proxies carry residual memory-safety risk; auditing alternatives (BoringSSL, sandboxing) add complexity |
| Programmable per-request logic at scale | Internal platforms where every request hits custom routing, auth, transformation logic | Rust callbacks at every phase with zero overhead versus dynamic config | 50K-500K RPS with custom logic on every request | OpenResty/Lua hits event-loop ceiling; Envoy WASM has per-call overhead |
| Ultra-low-latency financial proxy | Trading platforms, market data distribution | No GC pauses, predictable Rust async runtime, lock-free hot paths | Sub-100us p99 proxy overhead at 100K+ msg/sec | Go-based proxies (Traefik) have GC pauses; C++ Envoy has unpredictable tail latency from filter chains |

Kong

Default for REST API gateways where auth (OIDC, JWT), rate limiting, request transformations, and AI routing are needed from a plugin marketplace rather than built from scratch.

| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
| --- | --- | --- | --- | --- |
| Multi-tenant REST API gateway | SaaS exposing APIs to external developers with per-tenant rate limits, OAuth, observability | Native Consumer, Service, Route, Plugin abstractions match the problem domain | Hundreds to thousands of consumers, hundreds of services | Envoy requires building the abstractions yourself; raw Nginx requires Lua scripting for everything |
| Internal developer platform gateway | Platform team offering self-service API publishing for internal teams | Admin API + decK GitOps + plugin marketplace covers common needs out of the box | 50-500 internal services, ops team of 2-5 engineers | Envoy AI Gateway is the modern alternative but newer; Nginx + custom tooling is more work |
| OIDC / OAuth2 termination point | Enterprise integrating with Okta, Auth0, Azure AD for API access control | Kong Enterprise OIDC plugin handles full OAuth2 flow including refresh, introspection | 10K+ tokens/sec verified, sub-10ms auth overhead | Envoy ext_authz needs an external auth service; Nginx + Lua requires custom OIDC implementation |
| Multi-LLM AI gateway (2024-2026 use case) | Teams routing across OpenAI, Anthropic, Bedrock, on-prem LLMs with cost controls | AI plugins handle prompt guards, semantic cache, multi-LLM routing, token-aware billing | 1K-10K LLM requests/sec, multi-provider failover | Envoy AI Gateway is the direct competitor; choice depends on existing infrastructure |
| Hybrid mode: control plane in central cluster, data plane at edge | Enterprise with global presence wanting central policy + edge data plane | Kong hybrid mode separates CP and DP, config flows via mTLS | Dozens of edge clusters, central control plane | Envoy + custom control plane is more work; Nginx lacks hybrid mode entirely |

3. Limitations (multi-tech matrix)

Each cell describes how that limitation manifests in that tech, with severity tagged. Use the severity tags to scan for blockers in your context.

Limitation Category | Nginx | HAProxy | Envoy | Pingora | Kong
Dynamic config without restart | High: OSS requires reload; Nginx Plus has API | Medium: Runtime API socket + Data Plane API works but feels grafted | Medium: xDS native but requires control plane | Medium: Programmable but code-level changes need redeploy | Medium: Admin API native, hybrid mode adds complexity
Memory safety | High: C-based, structural CVE class risk | High: C-based, same residual risk | High: C++17, better than C but not memory-safe | Medium: Rust eliminates the class; unsafe blocks still possible | High: C (Nginx) + Lua; carries Nginx's memory risk
HTTP/3 / QUIC maturity | Medium: Stable but late to land | Medium: HAProxy 3.0+ has HTTP/3 | Medium: Production-grade since 2023 | Medium: On 2025 roadmap, status varies by build | Medium: Inherits Nginx's HTTP/3 maturity
Service mesh integration | High: Not designed for mesh; F5 mesh products exist but second-tier | High: Not a mesh data plane | Medium: Native mesh data plane (Istio, Consul) | Medium: Possible but requires custom integration | Medium: Kong Mesh exists (built on Kuma/Envoy)
Filter / plugin extensibility | Medium: OpenResty/Lua mature but event-loop bound | Medium: Lua + SPOE + native filters; less rich than Envoy | Medium: WASM, ext_proc, ext_authz, Lua, native C++ | Medium: Rust callbacks; full control but no plugin ecosystem | Medium: 300+ plugins, mix of OSS and enterprise
HTTP/1.1 lenient parsing for legacy clients | Medium: Reasonably lenient, decades of accumulated tolerance | Medium: Strict by default, configurable | High: Strict RFC compliance can reject legacy clients | Medium: Designed to handle Cloudflare's "bizarre" HTTP traffic; relatively lenient | Medium: Inherits Nginx's leniency
Operational governance | High: F5 ownership, community fork pressure (Freenginx, Angie), ingress-nginx retiring March 2026 | Medium: HAProxy Technologies stable, predictable | Medium: CNCF graduated, broad governance | Medium: Cloudflare-led, Apache 2.0; pre-1.0 API stability not guaranteed | Medium: Kong Inc. led; open core with significant enterprise lock-in
Per-instance memory at 50K+ idle conns | Medium: Around 2-4 KB per conn, low | Medium: Comparable to Nginx | High: Higher per-conn overhead, around 4-8 KB plus thread overhead | Medium: Best-in-class for high-concurrency; Cloudflare's whole point | Medium: Nginx base + Lua VM overhead per worker
Learning curve for new engineers | Medium: Familiar to most ops engineers; the "if" directive's gotchas catch newcomers | Medium: ACL syntax has a learning curve but well-documented | High: xDS, YAML structure, filter chains, control plane: steep | High: Rust async + Tokio + Pingora traits; very steep | Medium: Plugin model is approachable; Lua development is niche
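
The per-connection figures in the memory row turn into a per-instance budget by simple multiplication. A small Python sketch of that arithmetic, using the ranges quoted in the table and treating KB as KiB; thread stacks, TLS state, and buffers are deliberately ignored, so the results are a floor rather than a sizing recommendation:

```python
def idle_conn_memory_mib(connections: int, kib_per_conn: float) -> float:
    """Memory held by idle connections alone, in MiB."""
    return connections * kib_per_conn / 1024


# Ranges taken from the table row above; HAProxy is listed as comparable to Nginx.
for label, low_kib, high_kib in [("Nginx", 2, 4), ("Envoy", 4, 8)]:
    low = idle_conn_memory_mib(50_000, low_kib)
    high = idle_conn_memory_mib(50_000, high_kib)
    print(f"{label}: {low:.0f}-{high:.0f} MiB at 50K idle connections")
```

At 50K connections the gap is a few hundred MiB at most, which matters mainly on small sidecar-sized instances.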

4. Fault Tolerance (multi-tech matrix)

Most cells describe how the proxy behaves under partial failure. RTO and RPO conventions are used in HA pair / cluster context.

Dimension | Nginx | HAProxy | Envoy | Pingora | Kong
Process / instance model | Master + workers; worker crash respawned by master | Single process, multi-threaded since 1.8 | Single process, multi-threaded with per-thread isolation | Single process, async multi-threaded Rust | Nginx master + workers + Lua VM per worker
Failure detection (upstream) | Passive on request error; active via ngx_http_upstream_check_module or Plus | Active health checks built-in, configurable check intervals, layer 4/7 checks | Active + passive (outlier detection), configurable failure thresholds, statistical eviction | Programmable via Rust callbacks; whatever you implement | Active + passive checks, configurable per upstream
Failover mechanism | Mark upstream down on N failures, retry next in pool | Backup servers, weighted retry, configurable retry-on conditions | Outlier detection ejects host; circuit breaker prevents cascading failures | Application-defined via load balancer callback | Healthchecks + ring balancer; failed nodes excluded automatically
RTO (upstream failure) | 1-3s with health checks tuned; 5-10s with defaults | Sub-second with aggressive health check tuning | Sub-second with outlier detection | Depends on implementation, can be sub-second | 1-3s typical
RPO (data loss on proxy failure) | In-flight requests lost; clients retry | In-flight requests lost; client retry expected | Same; ext_proc state requires external store | Same; design choice for connection-affinity state | In-flight requests lost; rate-limit counters may lose precision
HA pair / cluster topology | Typically deployed behind keepalived (VRRP) or cloud LB; no native clustering | keepalived VRRP common; HAProxy 2.4+ has limited peer sync | Stateless; rely on external LB or DNS for HA | Stateless; you build the HA topology | Multi-node cluster with shared Postgres or DB-less + config push
Split-brain behavior | N/A: single instance per host; cluster is external | VRRP-based; can split-brain if network partition isolates LBs | Control plane (Istio) handles consistency; data plane is stateless | Application-defined; same constraints as a custom service | Postgres-backed: standard DB consistency; DB-less: last-write-wins per node until sync
Blast radius of single-instance failure | All requests on that instance fail; cloud LB or VRRP shifts traffic | Same; classic VRRP failover (1-3s) | Same; control plane reroutes via xDS | Same; depends on deployment topology | Same; cluster reroutes via shared config
Cross-region failover | External: GeoDNS, anycast, or upstream LB handles it | External: same as Nginx; no native multi-region | Native locality-weighted LB, priority-failover within a single cluster | Application-defined; can implement any policy | Hybrid mode supports multi-region control + local data planes
Data loss scenarios | Stats/access logs lost on crash before flush | Same; configurable buffer flush intervals | Same; admin endpoint stats lost on crash | Same; depends on logger implementation | Postgres-backed config persists; rate-limit counters in-memory may diverge briefly
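
The "mark upstream down on N failures, retry next in pool" behavior in the failover row can be sketched in a few lines of Python. This is an illustration only: the `Pool` and `Upstream` names and the `max_fails` / `fail_timeout` parameters are assumptions loosely modeled on common proxy settings, and the sketch counts consecutive failures rather than failures within a time window:

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    name: str
    fails: int = 0
    down_until: float = 0.0


class Pool:
    def __init__(self, names, max_fails=3, fail_timeout=10.0):
        self.upstreams = [Upstream(n) for n in names]
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout

    def send(self, request, call, now):
        """Try each live upstream in order; return (name, response) or raise."""
        for up in self.upstreams:
            if now < up.down_until:
                continue  # still marked down, skip without trying
            try:
                response = call(up.name, request)
            except ConnectionError:
                up.fails += 1
                if up.fails >= self.max_fails:
                    up.down_until = now + self.fail_timeout
                    up.fails = 0
                continue  # passive failure: retry the next upstream in the pool
            up.fails = 0
            return up.name, response
        raise RuntimeError("no upstream available")
```

Until the failure threshold is reached, every request still pays for one failed attempt against the broken upstream, which is why detection time dominates the RTO figures above.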

5. Sharding / Horizontal Scale (multi-tech matrix)

For proxies, "sharding" means how the deployment scales horizontally (multiple instances, traffic distribution, config partitioning). Some rows are reinterpreted from the canonical database sharding axes.

Dimension | Nginx | HAProxy | Envoy | Pingora | Kong
Horizontal scaling model | Stateless replicas behind external LB (cloud LB, DNS round-robin, anycast) | Same; VRRP pairs or N+1 behind L4 LB | Stateless data plane replicas; control plane handles config | Stateless replicas; you build deployment topology | Cluster of stateless data planes; shared Postgres or DB-less config sync
Traffic distribution / sharding key | External: usually IP-hash, round-robin via cloud LB | Source IP hash, header hash, URL hash, leastconn; built-in algorithms | Configurable: ring hash, maglev, round-robin, least-request, locality-aware | Programmable; whatever load balancing logic you implement | Ring balancer with consistent hashing, weighted round-robin
Config partitioning across instances | All instances run identical config; partition by deployment | Same; identical config across cluster | xDS can push partitioned config to subsets of instances | Application-defined; can shard config by region or tenant | Workspaces (enterprise) allow logical partitioning of config
Adding / removing instances | Manual or autoscaling group with health-checked rollout | Same; ASG + health checks | Stateless; ASG-friendly; control plane handles config distribution | Same; stateless instances scale horizontally | Add to cluster, config syncs via DB or hybrid push
Hot upstream / hot path behavior | Per-worker isolation means one hot upstream can saturate one worker; others stay healthy | Threading means hot path is shared; lock contention possible on hot maps | Per-thread isolation; hot upstream affects only threads serving it | Lock-free hot paths; designed to share connections across threads efficiently | Lua VM per worker means hot plugin can saturate a worker; others stay healthy
Practical max instances per cluster | No practical limit; cloud LBs manage thousands | Same; thousands routinely | Istio supports 10K+ sidecars; xDS push storms become a concern past that | Depends on deployment; Cloudflare runs at planetary scale | Postgres-backed Kong: hundreds of nodes; DB-less or hybrid: thousands
Resharding / config change without downtime | Reload spawns new workers; old drain; brief memory spike | Hot reload supported; seamless config swap | xDS pushes config without restart; truly hot | Code change requires redeploy; runtime config can hot-swap | Admin API changes propagate; data plane no restart needed
Cross-shard / cross-instance state | External: Redis/memcached for rate-limit counters, session affinity | peers section for stick-table sync (limited); external for serious state | External: ratelimit service via ext_proc, distributed cache for shared state | Application-defined; you bring your own state store | Redis-backed plugins (rate limit, sessions); Postgres for config
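
Ring hash and consistent hashing appear in the traffic distribution row because they keep most keys on the same upstream when instances are added or removed. A minimal Python sketch of a hash ring with virtual nodes, assuming MD5 as the point function and 100 virtual nodes per instance; both choices are illustrative and not what any of these proxies uses internally:

```python
import bisect
import hashlib


def _point(value: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node owns `vnodes` points so load spreads evenly around the ring.
        self._ring = sorted(
            (_point(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]
```

Growing a ring from three nodes to four moves only the keys that now land on the new node, roughly a quarter of them, whereas a plain `hash(key) % n` scheme would remap most keys and cold-start most per-upstream caches.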

6. State & Config Replication (multi-tech matrix)

For proxies, "replication" maps to two things: (1) configuration replication across data plane instances, (2) runtime state replication (stick tables, rate limit counters, session data).

Dimension | Nginx | HAProxy | Envoy | Pingora | Kong
Config replication topology | File-based; GitOps (Ansible, ConfigMap) pushes to all instances | Same file-based; orchestration handles distribution | xDS control plane fans out to data planes; gRPC streaming | Application choice; you build the distribution mechanism | Postgres (DB mode) or hybrid mode (CP pushes to DPs via mTLS)
Sync vs async config propagation | Async via reload; some seconds between push and effect | Same; or socket-based runtime API for sync updates | Near-real-time async via xDS streaming; SotW or Delta xDS | Async via redeploy or hot reload | DB-backed: 5-10s polling; hybrid: WebSocket streaming over mTLS, near-real-time
Replication factor (config copies) | One per instance; no shared source of truth in OSS Nginx | Same as Nginx | One control plane state, fanned out to N data planes | Application-defined | Postgres = one source of truth; replicated to all DPs
Consistency level (config across instances) | Eventually consistent (depends on rollout speed) | Same | Eventually consistent within seconds; configurable via ADS for strong ordering | Application choice | Eventually consistent; DB mode has stronger ordering than hybrid push
Config propagation lag (typical) | Seconds to minutes (deployment pipeline-bound) | Same | Sub-second via xDS streaming on healthy mesh | Depends on implementation | 5-10s DB-backed; sub-second hybrid mode
Runtime state replication (rate limit, sessions) | External: Redis, memcached | peers / stick-tables for limited sync; external for serious workloads | External: ratelimit service, Redis | External: you build it | Redis-backed plugins; per-node counters in DB-less mode
Conflict resolution on config divergence | N/A: single source via GitOps; last-deploy-wins | Same | Control plane is source of truth; data plane has no conflict | Application-defined | Postgres-backed: DB version wins; DB-less: file version wins
Multi-region config replication | External: Git mirroring, multi-region ConfigMaps | Same | Cross-region Istio control planes (multi-cluster mesh) supported | Application-defined | Hybrid mode is purpose-built for this (central CP, regional DPs)
Behavior during control plane partition | N/A: file-based, no live control plane | N/A: same | Data plane continues with last-good config; new config blocked | Application-defined; common: last-good behavior | Data plane continues; admin API unavailable; new policy changes blocked
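
The "continues with last-good config" behavior in the last row comes down to a data plane that swaps in a new config only when the update is both newer and valid. A minimal Python sketch, assuming integer versions and a caller-supplied `validate` hook; both are hypothetical and stand in for whatever versioning and validation the real control plane protocol provides:

```python
class DataPlaneConfig:
    def __init__(self, validate):
        self._validate = validate  # hypothetical hook: config -> bool
        self.version = 0
        self.active = None

    def apply(self, version: int, config: dict) -> bool:
        """Accept only newer, valid config; otherwise keep serving last-good."""
        if version <= self.version or not self._validate(config):
            return False  # reject: stale or invalid update, active is untouched
        self.version, self.active = version, config
        return True
```

During a control plane partition no `apply` calls arrive at all, so `active` simply stays at the last accepted version and traffic keeps flowing on it.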

7. Better Usage Patterns (per technology)

PE-grade patterns most teams miss until on-call. This is where the artifact earns its keep.

Nginx

Default for static serving combined with reverse proxying on teams that know it well; governance risk from F5 is real — anyone on ingress-nginx in Kubernetes needs a migration plan before March 2026.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Avoid "if" inside location blocks | Use "if" to do conditional redirects, rewriting; hit Nginx's broken-by-design behavior | Use map, error_page, or restructure to multiple location blocks | Silent traffic loss when the "if" evaluates differently than expected; "if is evil" is official guidance for a reason
Microcaching for hot paths | Disable caching entirely because backend is "dynamic" | Enable 1-5 second microcache for backend responses; even tiny TTLs absorb traffic storms | Order-of-magnitude reduction in backend load during traffic spikes with negligible staleness
Tune worker_connections per ulimit -n | Leave default 1024 and wonder why connections drop at scale | Set worker_connections to match system ulimit -n minus headroom, set worker_rlimit_nofile explicitly | Without this, you cap throughput far below what the box could handle
Separate stats endpoint from public traffic | Expose /nginx_status or Prometheus exporter on the same listener as public traffic | Internal listener on a different port, locked to internal IPs; never on the public listener | Avoids information leak and stats endpoint becoming a DDoS amplification surface
Reload-safe upstream changes | Hot-reload during traffic spikes assuming graceful drain | Use Nginx Plus dynamic API or consul-template with quiet windows; avoid reload storms | Reload spawns new workers and doubles memory transiently; on undersized boxes this is OOM territory
Use proxy_next_upstream conservatively | Set to retry on all errors and timeouts | Retry only on specific conditions (connection refused, timeout); avoid retrying 5xx by default | Aggressive retries amplify backend outages instead of mitigating them; classic retry storm anti-pattern
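
The microcaching row is easiest to appreciate with numbers. A small Python sketch of a TTL cache in front of a backend, assuming a 1-second TTL and a steady 100 requests per second to one hot URL; the `Microcache` class is hypothetical and only models the effect, not Nginx's proxy cache implementation:

```python
class Microcache:
    def __init__(self, fetch, ttl_ms=1000):
        self._fetch = fetch
        self._ttl_ms = ttl_ms
        self._store = {}  # key -> (expires_at_ms, value)
        self.backend_calls = 0

    def get(self, key, now_ms):
        entry = self._store.get(key)
        if entry is not None and now_ms < entry[0]:
            return entry[1]  # fresh enough: serve from cache
        self.backend_calls += 1
        value = self._fetch(key)
        self._store[key] = (now_ms + self._ttl_ms, value)
        return value


cache = Microcache(fetch=lambda key: f"body for {key}", ttl_ms=1000)
for i in range(1000):  # one request every 10 ms for 10 seconds
    cache.get("/hot", now_ms=i * 10)
print(cache.backend_calls, "backend calls for 1000 client requests")
```

The backend sees one request per TTL window per key regardless of client volume, so the reduction grows with the size of the spike while staleness stays capped at the TTL.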

HAProxy

Default for pure load balancing workloads where throughput ceiling and sub-ms tail latency matter more than dynamic routing, plugin extensibility, or service mesh integration.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Use Runtime API instead of reload for dynamic changes | Reload on every config change, lose in-flight connections | Use stats socket or Data Plane API to add/drain servers without reload | Reload is graceful but not free; for high-churn topologies, Runtime API is the right tool
Aggressive health-check tuning for failover | Leave default 2s intervals, accept slow failover | Sub-second intervals (200-500ms) with fall/rise thresholds tuned per backend characteristics | Failover time is dominated by detection time; tuning this is the highest-ROI HAProxy optimization
Stick-tables only for what truly needs them | Use stick-tables for analytics, abuse detection, everything | Reserve stick-tables for session persistence; push analytics to external systems | Stick-tables are in-memory and don't replicate well; misuse leads to memory pressure and data loss on restart
Use ACLs for explicit, scannable routing | Long lists of use_backend with embedded conditions, hard to audit | Define named ACLs, then route by ACL combinations; treat ACLs as building blocks | Routing logic becomes auditable, testable, and maintainable; reduces "what does this config actually do" cognitive load
Multi-threaded mode for TLS offload at scale | Single process for everything | Multi-thread (since 1.8) with thread pinning; or separate HAProxy instances for TLS termination | TLS offload is CPU-bound; isolating it from L7 routing prevents one workload from starving the other
Use observe + on-error for connection-level circuit breaking | Rely only on health checks for failure detection | Add observe layer7 error-limit N on-error mark-down per server | Active health checks have detection lag; passive observation catches failures the second they happen
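
The claim that failover time is dominated by detection time follows from simple arithmetic on the check interval and the fall threshold. A Python sketch of a simplified worst-case model, assuming a server is marked down after `fall` consecutive failed checks spaced one interval apart; the fall values of 3 and 2 and the 100 ms timeout are assumptions chosen for illustration:

```python
def worst_case_detection_ms(inter_ms: int, fall: int, check_timeout_ms: int = 0) -> int:
    """Upper bound on time from failure to mark-down under active checks.

    The failure is assumed to start right after a successful check, so `fall`
    full intervals pass before the last failing check starts, and that check
    may run until its timeout.
    """
    return inter_ms * fall + check_timeout_ms


print(worst_case_detection_ms(2000, 3))       # 2s interval, fall 3
print(worst_case_detection_ms(200, 2, 100))   # tuned: 200 ms interval, fall 2
```

The first case lands in the multi-second range and the second well under a second, which is the gap between the default and tuned RTO figures. Passive observation via on-error shortens it further because it reacts to live traffic instead of waiting for the next check.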

Envoy

Default for service meshes and AI inference gateways where xDS dynamic config, filter chain extensibility, and native OpenTelemetry justify the operational cost of running a control plane.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Use Incremental (Delta) xDS, not State-of-the-World (SotW) | Stay on default SotW config push for compatibility | Migrate to Delta xDS (default in Istio 1.22+); only changed resources are pushed | Config push storms on large meshes are dramatically reduced; control plane stays responsive under churn
Limit stats cardinality explicitly | Accept default stats sinks with all clusters and routes emitting metrics | Configure stats_config with inclusion list of match patterns | Stats subsystem can consume 4-6 GB in large deployments; explicit inclusion lists keep it bounded
Pick the right extension model per use case | Use WASM for everything because it's "the modern way" | Native filter for perf-critical, ext_proc for stateful and frequently-changing logic, WASM for stable logic that is not performance-sensitive | WASM has 20-40% latency overhead per call; using it for hot-path logic is needless cost
Set circuit breakers explicitly per cluster | Leave defaults, which are surprisingly permissive (1024 max connections) | Tune max_connections, max_pending_requests, max_retries per upstream's actual capacity | Circuit breakers prevent cascading failures; defaults will not protect you in a real incident
Outlier detection for fast eviction | Rely on active health checks alone | Configure outlier_detection with consecutive_5xx and success_rate thresholds | Active checks have intervals; outlier detection ejects in real-time on observed failures
Lock down the admin endpoint | Expose admin on 0.0.0.0:9901 because tutorials say so | Bind admin to 127.0.0.1 only; expose specific endpoints via secure proxy if needed | Admin endpoint exposes config_dump, stats, and full Envoy state; public exposure is a security incident waiting to happen
Use locality-weighted load balancing | Round-robin across all upstreams regardless of locality | Configure locality_weighted_lb_config with priority-based failover | Cross-AZ or cross-region traffic costs money and adds latency; locality awareness reduces both
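
The circuit breaker row describes admission limits rather than the open/closed state machine many engineers expect: requests beyond `max_connections` queue as pending, and requests beyond `max_pending_requests` fail fast. A minimal Python sketch of that admission logic, assuming one request per connection for simplicity; the class and its counters are hypothetical and only mirror the threshold names used in the table:

```python
class ClusterCircuitBreaker:
    def __init__(self, max_connections=1024, max_pending_requests=1024):
        self.max_connections = max_connections
        self.max_pending_requests = max_pending_requests
        self.active = 0
        self.pending = 0
        self.overflow = 0  # number of requests rejected by the limits

    def admit(self) -> str:
        if self.active < self.max_connections:
            self.active += 1
            return "active"
        if self.pending < self.max_pending_requests:
            self.pending += 1
            return "pending"
        self.overflow += 1
        return "rejected"  # fail fast instead of queueing without bound

    def release(self) -> None:
        """An active request finished; promote a pending one if any is waiting."""
        if self.pending:
            self.pending -= 1  # the promoted request takes the freed slot
        elif self.active:
            self.active -= 1
```

With both limits at 1024, a slow upstream can accumulate 2048 in-flight requests per cluster before anything is rejected, which is why the defaults rarely protect a backend whose real capacity is a few dozen concurrent requests.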

Pingora

Default for teams building a custom proxy product in Rust, where the lock-free cross-thread connection pool and memory safety by construction justify the absence of declarative config.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Treat the framework as a library, not a server | Try to find pingora.conf and configure it like Nginx | Embrace the Rust callback model; structure your proxy as a service crate with explicit phases | Pingora's value is in the programmability; fighting it for declarative config defeats the purpose
Leverage shared connection pool deliberately | Treat connection pooling as a black box | Tune pool sizing per upstream; monitor pool stats to verify reuse is happening | The lock-free hot pool is Pingora's headline feature; without monitoring, you can't tell if you're getting the value
Use the Service / Server model for multi-tenant proxies | One monolithic ProxyHttp per use case | Multiple Service instances within one Server, each isolated; share crypto and conn pool | Resource efficiency improves; failure isolation between tenants becomes natural
Async-aware error handling and timeouts | Bubble errors up via ? without phase-aware context | Capture timeouts and errors per phase (upstream_peer, response_filter, etc.) and return structured errors with status codes | Debugging async chains is hard; structured per-phase errors are the difference between solvable incidents and 4am pages
Cache upstream selection decisions | Recompute upstream selection on every request | Use upstream_peer with internal state to cache routing decisions per session | Reduces per-request CPU; matters most for high-RPS workloads with stable routing
Monitor Tokio runtime metrics, not just request metrics | Watch only request-level p99 latency | Also watch tokio-console / runtime metrics: worker utilization, task wakeups, blocking time | Async runtime issues (blocking on a sync call, task starvation) show up in runtime metrics before they show up in user-visible latency

Kong

Default for REST API gateways where auth (OIDC, JWT), rate limiting, request transformations, and AI routing are needed from a plugin marketplace rather than built from scratch.

Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters
Use DB-less mode for new deployments | Default to Postgres-backed Kong because tutorials use it | Use DB-less with declarative config (decK + Git); reserve DB mode for legacy multi-node needs | Postgres is a critical dependency that adds operational surface; DB-less is simpler and well-supported in 2026
Order plugins from light to heavy | Default plugin order, heavy plugins (transformations) before light filters (ACL, IP restriction) | Place early-termination plugins (ACL, IP rate limit) first; heavy plugins last | Failed requests should reject fast and cheap; running heavy plugins on requests that get blocked downstream wastes CPU
Profile and limit Lua plugin CPU | Enable any plugin that looks useful; assume Lua is "fast enough" | Profile plugins under representative load; cap concurrent Lua execution per worker | One CPU-bound Lua plugin can saturate a worker and degrade p99 for unrelated traffic on the same worker
Use hybrid mode for multi-region deployments | Run separate Kong clusters per region with manual config sync | Central control plane + regional data planes via Kong hybrid mode (mTLS, WebSocket config push) | Single source of truth, faster config propagation, lower operational burden than independent clusters
Reserve workspaces for true tenant isolation | Use workspaces for arbitrary organizational grouping | Use workspaces when teams need RBAC + config isolation; use tags for non-isolated grouping | Workspaces add admin API complexity; using them for cosmetic grouping makes everyday admin tasks harder
Plan for the enterprise upgrade decision upfront | Build on OSS Kong, discover OIDC / advanced rate limit / FIPS requirements later, scramble to upgrade | Evaluate the enterprise feature list during initial architecture; budget for it if requirements are likely | Reactive enterprise upgrades are more expensive than planned ones; pricing pressure increases under deadline
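
The light-to-heavy ordering rule can be checked with expected-cost arithmetic: a plugin's cost is only paid by requests that survive every plugin before it. A Python sketch under stated assumptions; the per-plugin costs and rejection rates are invented for illustration and are not measurements of any Kong plugin:

```python
def expected_cost_us(plugins):
    """Mean CPU microseconds per request for an ordered plugin chain.

    `plugins` is a list of (cost_us, reject_probability); processing stops
    at the first plugin that rejects the request.
    """
    total, p_reach = 0.0, 1.0
    for cost_us, reject_probability in plugins:
        total += p_reach * cost_us
        p_reach *= 1.0 - reject_probability
    return total


acl = (5, 0.25)         # hypothetical: cheap check that rejects 25% of requests
transform = (200, 0.0)  # hypothetical: expensive transformation, never rejects

print(expected_cost_us([acl, transform]))  # cheap rejecting plugin first
print(expected_cost_us([transform, acl]))  # heavy plugin first
```

With these numbers the cheap-first order costs 155 microseconds per request against 205 for the reverse, and the gap widens as the rejection rate rises, for example during an abusive traffic burst.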

8. Advanced / Next-Gen Alternatives

Where the field is moving. Maturity badges: Production = ready for adoption, Emerging = watch and pilot, Early = bet only if you're the case study.

Nginx

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Pingora | Memory safety, connection pool reuse across threads, 70% memory + 60% CPU savings at scale | Emerging | High: Rust rewrite of all custom logic, no config-file path | Edge workloads at planetary scale, especially TLS-heavy; teams with Rust capacity
Angie (fork) | Drop-in Nginx replacement with active development from former Nginx core team; faster cadence on features (HTTP/3, ACME, dynamic upstreams) | Production | Low: config-compatible, drop-in replacement that reads existing nginx.conf | When you want Nginx-the-tool without F5 governance; Russian-company ownership is a factor to evaluate
Freenginx (fork) | Community-led, lighter feature drift from upstream Nginx; security-policy alignment with original developers | Early | Low: drop-in | When you want minimal drift but escape F5 governance; smaller community than Angie
F5 NGINX Ingress Controller 5.x | Direct migration path from community ingress-nginx (which retires March 2026) | Production | Medium: Ingress API mostly compatible but feature differences exist | Kubernetes clusters running ingress-nginx today; this is F5's supported migration path

HAProxy

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Envoy | Dynamic xDS config, service mesh integration, native filter chain, observability | Production | High: different config model, control plane required | When dynamic backends and service mesh become primary requirements
Pingora | Memory safety, programmability for custom proxy products | Emerging | High: Rust rewrite | Building a proxy product rather than deploying a proxy
Kernel-bypass options (DPDK-based, XDP-based) | Sub-microsecond L4 forwarding for ultra-low-latency trading and edge | Emerging | Very High: requires DPDK or eBPF expertise, NIC and kernel coupling | Trading systems and CDN edge where every microsecond counts; specialized hardware and team

Envoy

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
linkerd2-proxy (Rust) | Memory safety, simpler operational model, lighter resource footprint than Envoy sidecar | Production | High: Linkerd mesh is a full alternative, not drop-in | When the Envoy + Istio operational complexity exceeds team capacity; Linkerd is the simpler mesh
Envoy Gateway / Envoy AI Gateway | Higher-level abstractions over raw Envoy, Kubernetes Gateway API support, AI-specific features (ext_proc for prompt processing, token-aware rate limiting) | Emerging | Low: built on Envoy, easier onboarding | Greenfield gateways, especially AI inference gateways; this is where the Envoy ecosystem is investing in 2026
Pingora | Memory safety, programmability, lower per-instance resource cost | Emerging | Very High: Rust rewrite, no mesh integration story | Custom proxy products at scale; not a service mesh replacement
Cilium Service Mesh (eBPF) | Sidecar-less mesh via eBPF, lower overhead per pod, kernel-level enforcement | Emerging | Medium: Cilium CNI replacement, mesh features layered on top | Kubernetes clusters where Cilium is already the CNI; mesh-without-sidecar is an attractive simplification

Pingora

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
ISRG River (built on Pingora) | Higher-level reverse proxy abstraction over Pingora; binary product rather than framework | Early | Low: when ready, intended as a Pingora-powered Nginx alternative | When you want Pingora's memory safety without writing Rust application code
Rama (Tower-based Rust proxy) | Layered middleware model (Tower / tower-http ecosystem); more familiar to Rust web devs | Early | Medium: different API model from Pingora; same Rust foundation | When you prefer Tower's layered middleware over Pingora's callback model
hyper + custom Rust proxy | Maximum flexibility, lowest abstraction; raw HTTP library | Production | Very High: you build everything Pingora gives you for free | Niche custom requirements where Pingora's framework doesn't fit

Kong

Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider
Envoy AI Gateway | Native ext_proc integration, per-model rate limiting, deeper Envoy ecosystem integration, no Lua event-loop ceiling | Emerging | Medium: different abstractions; Gateway API for routing | AI inference gateway use cases where Envoy is the architectural fit
Apache APISIX | etcd-based config (no Postgres), claims 2x Kong throughput in benchmarks, fewer enterprise-locked features | Production | Medium: similar plugin model, conceptually similar | When Kong's Postgres dependency or enterprise pricing is a problem; APISIX is the closest direct alternative
Tyk | Go-based, smaller resource footprint, simpler architecture | Production | Medium: different config model | When you want a lighter API gateway with fewer plugin dependencies
Cloud-managed API gateways (AWS API Gateway, Apigee, Azure APIM) | No infrastructure to operate; pay-per-request pricing | Production | Medium-High: vendor lock-in, different config and observability | When operational simplicity trumps cost predictability and feature flexibility; works well for low-to-mid traffic gateways