L7 Proxies / Load Balancers / API Gateways
Principal Engineer trade-off analysis across Nginx, HAProxy, Envoy, Pingora, and Kong.
Category Sweep L7 Data Plane API Gateway · As of 2026-06-02
By 2026 the proxy market had split into three lanes. HAProxy stays the throughput king for stable backend topologies (around 42K RPS and the lowest p99 in a Kubernetes ingress benchmark). Envoy won the service mesh and AI gateway race on the strength of xDS, dynamic reconfiguration without restart, and a rich filter chain (WASM, ext_proc, Lua). Pingora is the post-Nginx future for organizations that own their proxy code: Cloudflare reports roughly 70% memory and 60% CPU reduction versus Nginx at trillion-request-per-day scale.
Nginx remains dominant by deployment count, but governance pressure is real. The community-maintained ingress-nginx Kubernetes project is being retired in March 2026, and developer-led forks (Freenginx, Angie) emerged after F5 acquisition friction. Treat Nginx as the safe incumbent for greenfield small/medium workloads but assume migration risk on the 3-5 year horizon. Kong is the right choice when you want a pre-built plugin marketplace (auth, rate limit, transformations) and accept the OpenResty/Lua event-loop ceiling under CPU-bound plugins.
The non-obvious call: if you are doing AI inference gateways in 2026, start with Envoy AI Gateway, not Kong. Token-aware rate limiting and per-model priority routing land natively on Envoy ExtProc, and the ecosystem is moving there.
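To make "token-aware rate limiting" concrete, here is a minimal sketch of a per-model budget that charges each request by its LLM token count rather than by request count. It is an illustration in plain Python, not Envoy AI Gateway's implementation; the class name, model names, and rates are all hypothetical.

```python
class TokenBudget:
    """Token bucket charged in LLM tokens instead of requests (illustrative only)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity              # maximum LLM tokens the bucket holds
        self.refill_per_sec = refill_per_sec  # LLM tokens restored per second
        self.level = capacity                 # start with a full bucket
        self.last_ts = 0.0                    # timestamp of the last decision

    def allow(self, llm_tokens: int, now: float) -> bool:
        # Refill for the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last_ts)
        self.level = min(self.capacity, self.level + elapsed * self.refill_per_sec)
        self.last_ts = now
        # Admit only if the whole request fits in the remaining budget.
        if llm_tokens <= self.level:
            self.level -= llm_tokens
            return True
        return False


# Hypothetical per-model budgets: the expensive model gets the tighter quota.
BUDGETS = {
    "large-model": TokenBudget(capacity=1000, refill_per_sec=100),
    "small-model": TokenBudget(capacity=10000, refill_per_sec=1000),
}


def admit(model: str, llm_tokens: int, now: float) -> bool:
    """Decide whether a request for `model` costing `llm_tokens` may proceed."""
    return BUDGETS[model].allow(llm_tokens, now)
```

The point of the sketch is that a single 800-token prompt consumes most of the large model's budget, while a request-count limiter would treat it the same as a 10-token prompt.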
1. Trade-Offs (per technology)
Each row is a deliberate "X over Y" choice. Gains are specific. Costs are operational. PE Nuance (Principal Engineer nuance) is what most engineers miss until on-call.
Nginx
Default for static serving combined with reverse proxying on teams that know it well; governance risk from F5 is real — anyone on ingress-nginx in Kubernetes needs a migration plan before March 2026.
| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
|---|---|---|---|---|
| Static config + reload model over dynamic API | Simple mental model, predictable behavior, file-based GitOps works trivially | No service-discovery-driven endpoint updates without 3rd party (consul-template, nginx-plus API, nginx-mesh) | Microservices fleet with 200+ services churning endpoints every minute. Reload storms cause connection drops and 502s during deploys. | Open-source Nginx reload spawns new workers and drains old ones, but during the drain window memory doubles. At 5K+ worker config, this is enough to OOM on small instances. |
| C-based event loop over thread pool model | Tiny memory per connection (around 2-4 KB), 50K+ idle connections per worker | Single CPU-bound module call blocks the whole worker, kneecapping latency for every other connection on that worker | A Lua script doing JSON parsing on every request. p99 latency triples under load even though average looks fine. | Memory CVEs in C-based proxies are a structural risk, not a tooling problem. The post-2020 push to Rust (Pingora, linkerd2-proxy) is precisely a response to this. |
| Lua plugin model via OpenResty over native modules | Customization without C compilation, large ecosystem (Kong sits on this) | LuaJIT memory ceiling around 2 GB per worker, CPU-bound Lua starves the event loop | Heavy auth plugin under traffic spike. Workers run Lua GC, p99 spikes 10x for the duration. | OpenResty trace flags (lj-arch, lj-state) are mandatory for any serious Lua deployment. Most teams discover this only after the first production incident. |
| Isolated worker processes over shared multithreading | Deterministic CPU pinning, NUMA-friendly, easy capacity planning | Connection reuse across workers is impossible without shared memory tricks | TLS handshake cost compounds: 4 workers with separate upstream connection pools can mean up to 4x the handshake load on origin. | This is one of the specific reasons Cloudflare built Pingora. At 1T requests/day, per-worker isolation became a dominant overhead. |
| F5 corporate stewardship over community-led project | Commercial support, FIPS builds, dedicated security team, integration with F5 BIG-IP | Strategic direction outside community control (Freenginx fork, Angie fork) | Critical CVE released against a feature you depend on, and disclosure timing was a corporate decision rather than community-aligned. | For 2026+ greenfield projects, this is a real signal. The retirement of ingress-nginx in March 2026 is the canary. Anyone deploying Nginx in k8s should be planning a migration path. |
| Battle-tested over feature-current | 15+ years in production, every edge case documented somewhere on Stack Overflow | HTTP/3 support arrived late and is still flagged experimental in many builds; gRPC support is functional but unloved | Greenfield team wants HTTP/3 + gRPC streaming + WebSocket multiplex in one proxy. Nginx is the wrong starting point. | "Battle-tested" cuts both ways at PE level. The code is stable but conservative. Innovation has migrated to Envoy and Pingora. |
| Single binary over framework | Configurable as a web server, reverse proxy, mail proxy, TCP load balancer, all from one binary | Jack-of-all-trades architecture leaves performance gaps versus specialized proxies | You need pure L4 load balancing at line rate. HAProxy or kernel-bypass options beat Nginx by 30-40%. | The breadth was a 2010-era win when teams had one tool budget. In 2026, separation of concerns (Envoy for mesh, Nginx for web, HAProxy for L4) is the more defensible architecture. |
| Configuration DSL over declarative API | Powerful pattern matching with map/geo/regex, conditional logic via if | Conditional if blocks are notoriously broken in location contexts ("if is evil" is official guidance) | New team member writes a seemingly correct if-based redirect rule, and traffic silently disappears for certain header combinations. | The Nginx if behavior is one of the most-cited examples of a config DSL leaking implementation details to operators. Compare to Envoy's RouteConfiguration which is explicit but verbose. |
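The event-loop row above claims that one CPU-bound call can wreck tail latency while the average barely moves. A small deterministic model shows why, assuming a single worker that runs each request to completion in arrival order. The arrival pattern and CPU costs are invented for illustration and are not measurements of Nginx.

```python
def simulate_worker(requests):
    """requests: list of (arrival_ms, cpu_ms) sorted by arrival. Returns latencies in ms."""
    clock = 0.0
    latencies = []
    for arrival, cpu_ms in requests:
        start = max(clock, arrival)  # wait while the worker is still busy
        clock = start + cpu_ms       # nothing else is served during this call
        latencies.append(clock - arrival)
    return latencies


def percentile(values, pct):
    """Nearest-rank style percentile, good enough for a sketch."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(len(ordered) * pct / 100))
    return ordered[index]


# Baseline: one request per millisecond, each costing 0.2 ms of CPU.
fast = [(float(i), 0.2) for i in range(100)]

# Same traffic, but request 50 runs a 20 ms CPU-bound handler (hypothetical).
mixed = list(fast)
mixed[50] = (50.0, 20.0)

baseline_latencies = simulate_worker(fast)
mixed_latencies = simulate_worker(mixed)

print("p50 baseline/mixed:", percentile(baseline_latencies, 50), percentile(mixed_latencies, 50))
print("p99 baseline/mixed:", percentile(baseline_latencies, 99), percentile(mixed_latencies, 99))
```

In this model the median is unchanged, but every request that arrives while the slow handler runs inherits its delay, which is the pattern the table describes.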
HAProxy
Default for pure load balancing workloads where throughput ceiling and sub-ms tail latency matter more than dynamic routing, plugin extensibility, or service mesh integration.
| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
|---|---|---|---|---|
| Specialized load balancer over web server | Best-in-class throughput, around 42K RPS in a Kubernetes ingress benchmark, around 50% CPU vs 73% for Envoy at the same load | No static file serving, no caching layer (must front it with Varnish or Nginx if needed) | You wanted "one proxy" that also serves static assets. HAProxy is not that. You end up running HAProxy + Nginx + Varnish. | The narrow scope is the reason for the throughput. At PE scale, this is a feature: do one thing, do it at line rate. |
| Configuration file over dynamic API (with Runtime API as escape hatch) | Predictable, GitOps-friendly, no control plane needed for stable backends | Dynamic endpoint updates require Runtime API socket commands or the Data Plane API (DPAPI, HAProxy 2.4+) | Service mesh use case with constantly-changing endpoint lists. Possible but feels grafted on, not native. | The Data Plane API gets HAProxy closer to Envoy parity, but the design center is still static config. If you need xDS-grade dynamism, you are fighting the tool. |
| Multi-threaded over single-threaded with workers | True multithreading (since HAProxy 1.8), better core utilization, shared session table | Thread synchronization overhead under heavy lock contention on shared maps | Sticky session table with millions of entries and high churn. Lock contention becomes visible. | HAProxy's threading is more sophisticated than Nginx's worker isolation but less aggressive than Envoy's per-thread isolation. The middle path is good for L4 but compromises L7 customization. |
| Mature TCP/HTTP focus over service mesh feature set | Rock-solid L4 (TCP proxy), unbeatable for non-HTTP protocols (MySQL, PostgreSQL, Redis) | No native gRPC streaming optimization, no service mesh sidecar story, limited service discovery integration | Internal microservices comm needs proxying. HAProxy works but Envoy is the architecturally aligned choice. | HAProxy at the edge plus Envoy in the mesh is a common, defensible PE-level topology. Don't try to make HAProxy do mesh work. |
| Built-in observability (stats page, Prometheus exporter) over distributed tracing | Excellent low-cost metrics out of the box, robust admin socket for live introspection | Distributed tracing integration is grafted on, weaker than Envoy's native OpenTelemetry | You need per-request span context propagation across a 12-hop service path. Envoy gives this natively; HAProxy needs work. | HAProxy 3.2 (2025) significantly improved tracing, but the design center is still aggregate metrics, not request-level traces. |
| ACL-based routing over per-route filter chains | Highly composable, fast to evaluate, easy to read once you internalize the syntax | Complex per-route policy chains require ACL composition that gets unreadable past 5-6 conditions | Auth policy varies by tenant, region, and feature flag. The ACL config grows to thousands of lines and ownership becomes opaque. | For routing logic that complex, you have outgrown HAProxy and should be on a gateway (Kong, Envoy AI Gateway) where policy is a first-class data model. |
| Open source with commercial HAProxy Enterprise option | Vendor model is rational: enterprise features (WAF, advanced rate limiting, admin UI) cost money, core is fully free | Some operationally-valuable features (multi-cluster sync, dashboard) are paid tier only | You build a free-tier deployment, hit operational friction, and the upgrade path is "buy HAProxy Enterprise" with non-trivial pricing. | Compared to Nginx (F5 acquisition friction) and Envoy (no commercial entity), HAProxy's commercial model is the most predictable. That stability is itself a PE-relevant signal. |
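The "Runtime API as escape hatch" row can be made concrete with a short sketch that builds `set server` commands and writes them to the HAProxy stats socket. It assumes the socket is exposed with admin level in the HAProxy configuration; the socket path and the backend and server names are hypothetical.

```python
import socket

VALID_STATES = {"ready", "drain", "maint"}


def set_server_addr(backend: str, server: str, ip: str, port: int) -> str:
    """Build the Runtime API command that repoints an existing server slot."""
    return f"set server {backend}/{server} addr {ip} port {port}"


def set_server_state(backend: str, server: str, state: str) -> str:
    """Build the Runtime API command that changes a server's administrative state."""
    if state not in VALID_STATES:
        raise ValueError(f"unsupported state: {state}")
    return f"set server {backend}/{server} state {state}"


def send_runtime_command(command: str, socket_path: str = "/var/run/haproxy.sock") -> str:
    """Send one command over the UNIX stats socket and return HAProxy's reply.

    The default socket path is an assumption; use whatever `stats socket` declares.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # HAProxy closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

A deploy script would typically drain a slot, repoint it, and mark it ready again, for example `send_runtime_command(set_server_state("be_api", "srv1", "drain"))`. This works, but every endpoint change is an imperative command per server, which is why the table calls it grafted on compared with xDS.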
Envoy
Default for service meshes and AI inference gateways where xDS dynamic config, filter chain extensibility, and native OpenTelemetry justify the operational cost of running a control plane.
| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
|---|---|---|---|---|
| xDS dynamic config over static files | Zero-restart endpoint, route, listener, and secret updates from a control plane. The defining feature. | You need a control plane (Istio, Consul, Envoy Gateway, custom). Significant operational surface. | Greenfield team picks Envoy without realizing they also need to operate a control plane. Most of the cost is there, not in Envoy itself. | Incremental xDS (Delta) is now default in Istio 1.22+. If you are on classic SotW (State of the World), config push storms on large meshes are still a real failure mode. |
| C++ over Rust or C | Mature C++17, well-trodden ground for high-performance networking, large contributor pool | Memory-safety class of CVEs still possible, harder to onboard contributors versus Go or Rust | Custom filter authored in C++ has a use-after-free bug, hits production after fuzz testing missed the edge case. | The shift from C++ to Rust at the proxy layer is a multi-year trend (Pingora, linkerd2-proxy). Envoy is unlikely to follow but the gap will matter for security-conscious orgs. |
| Filter chain extensibility over single-purpose proxy | WASM, Lua, ext_proc (gRPC), ext_authz, native C++ filters: any logic at any point | WASM filters have non-trivial perf cost (often 20-40% latency add), ext_proc adds a network hop | Auth via ext_proc adds 5-15ms p99 per request. At 50 RPS per user, that's a noticeable user-facing slowdown. | ext_proc is the right choice for stateful auth where logic changes weekly. WASM is the right choice for stable, perf-sensitive logic. Native C++ filters are the right choice when both apply and you can staff the maintenance. |
| Per-thread isolation over shared worker model | No locks on hot paths, predictable tail latency, easy to reason about thread pinning | Connection pool fragmentation across threads, more memory overhead than Nginx | High keepalive-heavy workload. Each thread maintains its own pool, total memory usage scales with thread count. | Connection pool sharing is one of the specific pain points Pingora was designed around. Envoy has improved (connection_pool_per_downstream_connection options) but it's not as elegant as Pingora's lock-free hot pool. |
| Native observability over add-on metrics | Built-in OpenTelemetry, statsd, Prometheus, admin endpoint with config_dump for diagnostics | High cardinality stats can balloon memory; admin endpoint is a security risk if exposed | Production Envoy with thousands of clusters and per-route stats hits 4-6 GB memory just for the stats subsystem. | Use match-based stats inclusion lists in production, never the default. Most teams find out about stats cardinality cost only after the OOM. |
| Service mesh DNA over standalone proxy | Cleanest fit for mTLS-everywhere, identity-based routing, traffic shifting, fault injection | Standalone use feels heavyweight; you carry mesh complexity even if you only need an LB | Single-team need for a basic reverse proxy ends up running an Envoy + control plane stack worth a quarter of engineering time. | For single-LB use cases, Envoy is overkill. For multi-team microservices with identity, traffic policies, and observability requirements, nothing else competes. |
| HTTP/2, HTTP/3, gRPC first-class over HTTP/1.1 first | Modern protocols natively, gRPC streaming, HTTP/3 QUIC stable | Some HTTP/1.1 edge cases (chunked encoding, trailer handling) get less attention | Legacy HTTP/1.1 client with non-RFC-compliant headers (common in mobile SDKs and old CDN paths) gets rejected or misparsed. | Cloudflare specifically called this out as a reason for Pingora: "bizarre and non-RFC compliant HTTP traffic." If your traffic includes a lot of weird clients, Envoy is stricter than Nginx, which is stricter than Pingora. |
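The xDS row mentions config push storms under State of the World (SotW) versus Delta xDS. A back-of-the-envelope model shows the difference in volume, under the simplifying assumption that every proxy subscribes to every resource. Real control planes scope subscriptions, so treat the numbers as illustrative only.

```python
def sotw_resources_pushed(total_resources: int, changed_per_push: list[int], proxies: int) -> int:
    """SotW: each push resends the full resource set to every proxy."""
    return total_resources * len(changed_per_push) * proxies


def delta_resources_pushed(changed_per_push: list[int], proxies: int) -> int:
    """Delta (incremental): each push carries only the resources that changed."""
    return sum(changed_per_push) * proxies


# Hypothetical mesh: 5,000 endpoint resources, 200 sidecars, three small changes.
changes = [1, 1, 3]
sotw_total = sotw_resources_pushed(5000, changes, proxies=200)
delta_total = delta_resources_pushed(changes, proxies=200)
print(f"SotW: {sotw_total} resources, Delta: {delta_total} resources")
```

Under these assumptions the SotW volume grows with mesh size times change frequency, while the Delta volume grows only with the amount of change.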
Pingora
Default for teams building a custom proxy product in Rust, where the lock-free cross-thread connection pool and memory safety by construction justify the absence of declarative config.
| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
|---|---|---|---|---|
| Library/framework over binary product | You compose the proxy you want from Rust callbacks; total control over request lifecycle | No drop-in nginx.conf equivalent; every deployment is a custom Rust application | Team wants a quick reverse proxy with a YAML config. Pingora is the wrong choice for that use case. | Pingora competes with libraries (Tower, hyper) more than with binary proxies. The right comparison for "deploy a proxy" is Nginx/Envoy/HAProxy. The right comparison for "build a custom proxy product" is Pingora. |
| Rust memory safety over C/C++ performance ceiling | Eliminates buffer overflow, use-after-free, dangling pointer CVE classes by construction | Steeper learning curve, smaller hiring pool versus C++ or Go | Team needs to extend the proxy and you discover Rust async (Tokio runtime, lifetime juggling) is a real cliff for new engineers. | The hiring constraint is real. If you don't have at least 2 senior Rust engineers, picking Pingora means paying ramp cost in addition to development cost. |
| Async multithreaded over per-process workers | Cloudflare reports about 70% memory and 60% CPU reduction versus their Nginx baseline at trillion-request-per-day scale | Tokio async-Rust has its own learning curve and debugging surface (stack traces are async-aware but uglier) | Race condition in shared state shows up only at high concurrency. Reproducing in test is hard. | The async-Rust ecosystem matured significantly 2024-2026 (tokio-console, tracing). It's no longer the "experimental" tier it was in 2022. |
| Lock-free connection pool shared across threads over per-worker pools | Massively higher connection reuse, fewer TLS handshakes to origin, lower tail latency on warm paths | Implementation complexity is real; lock-free data structures are a source of subtle correctness bugs | Custom transport layer extension introduces a race that triggers only under specific connection-recycling patterns. | This is the specific design choice that makes Pingora superior to Nginx at Cloudflare scale. Lower-scale orgs may not see the same gains; the architecture pays off most where origin TLS cost is dominant. |
| Programmable callbacks over declarative config | Total flexibility: every phase (request, upstream selection, response, logging) is a Rust function you own | No GUI, no YAML, no "ops team configures it" model. Every change is a code change and a deploy. | Non-engineer ops team needs to add a redirect rule. With Nginx they'd edit a file; with Pingora they file a ticket. | For the "platform team builds proxy, app teams consume" model this is a feature, not a bug. For the "self-service via config" model, it's wrong. |
| Cloudflare-led roadmap over CNCF or community governance | Battle-tested at Cloudflare's 40M+ HTTP requests per second; bugs found in real traffic, fixed fast | Pre-1.0 API stability not guaranteed; roadmap optimizes for Cloudflare's needs first | You bet on a specific Pingora API, then Cloudflare changes the trait signature in v0.8 and your downstream code breaks. | ISRG (Internet Security Research Group) is funding adoption and the River project to provide a higher-level abstraction. The ecosystem is maturing but it is not yet ready for "anyone can adopt" status. |
| HTTP/1, HTTP/2, gRPC, WebSocket support over universal protocol | Covers the proxy protocols 95% of services need | HTTP/3 / QUIC was on the 2025 roadmap, status varies by build | Team needs production HTTP/3 today; check current Pingora support before committing. | If HTTP/3 is mandatory, Envoy and Nginx Plus are ahead. If you can wait 6-12 months, Pingora is catching up fast. |
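The connection-pool row is easier to reason about with a toy model. The sketch below counts how many new origin connections (and therefore TLS handshakes) are needed when each worker keeps its own pool versus when all workers share one. It is a Python illustration of the idea, not Pingora's lock-free implementation, and it assumes sequential requests, round-robin worker assignment, and connections that never expire.

```python
def count_handshakes(origins, workers: int, shared: bool) -> int:
    """Count new origin connections needed to serve `origins` in order."""
    # One pool for everyone, or one pool per worker.
    pools = [set()] if shared else [set() for _ in range(workers)]
    handshakes = 0
    for i, origin in enumerate(origins):
        pool = pools[0] if shared else pools[i % workers]
        if origin not in pool:  # no idle connection to reuse, so open a new one
            handshakes += 1
            pool.add(origin)
    return handshakes


# Hypothetical traffic: 1,000 requests spread over 5 origins, served by 4 workers.
traffic = [f"origin-{i % 5}" for i in range(1000)]
per_worker = count_handshakes(traffic, workers=4, shared=False)
shared_pool = count_handshakes(traffic, workers=4, shared=True)
print(f"per-worker pools: {per_worker} handshakes, shared pool: {shared_pool}")
```

In this model the isolated pools need one connection per (worker, origin) pair, so handshake load scales with worker count; the shared pool needs one per origin.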
Kong
Default for REST API gateways where auth (OIDC, JWT), rate limiting, request transformations, and AI routing are needed from a plugin marketplace rather than built from scratch.
| Trade-Off | What You Gain | What You Give Up | When It Bites You | PE Nuance |
|---|---|---|---|---|
| API gateway abstraction over raw proxy | First-class concepts (Service, Route, Consumer, Plugin) match how API teams think about gateways | Heavier than Nginx alone; you carry the gateway abstraction even when you only need a reverse proxy | Single-service deploy uses Kong because "it's the standard," carries database + admin API + control plane overhead unnecessarily. | If the answer to "what does this proxy do?" is "route traffic," skip Kong. If it's "auth, rate-limit, transform, observe, then route," Kong starts to pay off. |
| OpenResty / Lua plugin model over native or WASM | Plugins are quick to write, hot-reload friendly, accessible to non-systems engineers | Lua plugins share Nginx's event-loop ceiling; CPU-bound plugins (regex, JWT verify, JSON transform) starve other requests | Heavy auth plugin under traffic spike, p99 latency spikes for the entire gateway, not just authed requests. | Plugin ordering matters; light filtering plugins (ACL, IP restriction) should run before heavy plugins. This is operator knowledge that doesn't show up in docs prominently enough. |
| DB-backed config (Postgres) over DB-less mode | Multi-node config sharing, admin API works across cluster, declarative GitOps via decK | Postgres becomes a critical dependency; outage of Postgres can degrade Kong control plane (data plane usually stays up) | Postgres connection pool exhausted under config-change storm during a deploy, admin API stops responding. | DB-less mode (config from YAML/JSON, no Postgres) is the right default for new deployments. DB-backed mode is legacy and adds operational surface most teams don't need. |
| Open-source core with Kong Enterprise paid features | Free tier is functional for most use cases (basic auth, rate limit, transformations) | The plugins most teams reach for at scale (OIDC, advanced rate limiting, plugin ordering, FIPS) are enterprise-only | Team builds on free tier, hits an OIDC requirement, has to either reimplement or upgrade. Upgrade cost is significant. | Kong's commercial model is more aggressive than HAProxy's. Plan for the enterprise upgrade decision as part of adoption, not as a surprise later. |
| L7 HTTP/REST/WebSocket focus over universal L4 + L7 | Optimized for the dominant use case (REST API gateway) | Native gRPC, TCP, UDP support is limited; for pure L4 work, HAProxy or Envoy is better | Team needs gRPC streaming with bidirectional flow and full HTTP/2 features. Kong gets you 70% there, the rest is a struggle. | For full gRPC service mesh, Envoy is the right tool. Kong is for REST APIs and is unapologetic about it. |
| Plugin marketplace over build-everything-yourself | 300+ plugins covering common needs (JWT, OAuth2, OIDC, Datadog, transformations, Kafka, AI plugins) | Plugins vary in quality; most-needed plugins (advanced rate limit, OIDC) are enterprise; some community plugins are unmaintained | Team picks a community plugin, it has a memory leak, no maintainer to fix it. | Evaluate plugins like dependencies: maintained, tested at your scale, security-reviewed. The "300+ plugins" headline number hides quality variance. |
| AI Gateway features added 2024-2026 over pure REST gateway | Multi-LLM routing, semantic caching, prompt guards, AI-specific rate limiting now in Kong | Envoy AI Gateway is also evolving fast, with deeper integration into the Envoy ext_proc model | Bet on Kong AI plugins, then Envoy AI Gateway ships features Kong doesn't have for 12 months. | In 2026, both products are racing for the AI gateway segment. Pick based on existing infrastructure (Envoy mesh → Envoy AI Gateway; Kong-centric API platform → Kong AI plugins). |
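The plugin-ordering nuance in the OpenResty/Lua row can be quantified with a simple expected-cost calculation: a plugin only costs CPU for requests that earlier plugins did not reject. The per-plugin costs and rejection rates below are hypothetical, and the function is a model of the idea rather than anything in Kong's API.

```python
def expected_cost_us(chain) -> float:
    """Expected CPU microseconds per request for an ordered plugin chain.

    chain: list of (cpu_us, reject_rate) tuples, evaluated in order.
    """
    cost = 0.0
    pass_probability = 1.0  # share of requests that reach the current plugin
    for cpu_us, reject_rate in chain:
        cost += pass_probability * cpu_us
        pass_probability *= 1.0 - reject_rate
    return cost


# Hypothetical numbers: a cheap IP filter that rejects 20% of traffic,
# and a heavy JWT verification that rejects 5%.
ip_restriction = (5.0, 0.20)
jwt_verify = (200.0, 0.05)

cheap_first = expected_cost_us([ip_restriction, jwt_verify])
heavy_first = expected_cost_us([jwt_verify, ip_restriction])
print(f"cheap first: {cheap_first:.2f} us, heavy first: {heavy_first:.2f} us")
```

With these assumed numbers, running the light filter first means a fifth of requests never pay for JWT verification, which matters most on a shared event loop.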
2. Use Cases (per technology)
Driving Property is the single attribute that ruled out alternatives. Why Not Alternative is the specific reason another reasonable choice was rejected.
Nginx
| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
|---|---|---|---|---|
| Static site + reverse proxy on a single VM | Mid-size SaaS shipping a marketing site plus API | One binary serves static, TLS-terminates, and proxies to backend | 5-20K RPS per node, single instance | HAProxy doesn't serve static; Envoy needs a control plane; Kong is overkill |
| Ingress controller for legacy Kubernetes deployments | Hundreds of teams on the soon-retiring ingress-nginx project (community version retires March 2026) | Familiar Ingress resource model, mature ecosystem | 10K-50K services across thousands of clusters globally | HAProxy ingress is faster but less ubiquitous; F5 NGINX Ingress is the migration target now that community version is retiring |
| TLS termination + caching layer in front of origin | Mid-size media site with CDN miss path | Built-in HTTP cache, microcaching for hot paths, low memory | 30K RPS sustained, 200 GB local cache | Varnish has a stronger cache model but a separate config language; Nginx wins on operational simplicity |
| Lua-extended platform layer | Internal platform team extending Nginx via OpenResty for custom routing | Lua scripting at every phase, runtime modification of behavior | 50K RPS with custom logic on every request | Envoy WASM has higher overhead per call; Kong adds gateway abstractions you don't need |
| Multi-tenant SaaS L7 routing by Host header | SaaS with thousands of customer subdomains routed to backends | map directive scales to millions of entries with predictable lookup cost | 20K+ tenant domains, single-digit ms p99 routing decisions | HAProxy's ACL-based routing is comparable but less ergonomic; Envoy needs runtime config sync |
HAProxy
| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
|---|---|---|---|---|
| L4 database load balancer (MySQL, PostgreSQL, Redis) | Fintech routing reads across replicas, writes to primary | TCP proxy at line rate with health-check awareness | 50K connections, sub-ms proxy overhead | Nginx stream module works but lacks the same health-check granularity; Envoy's L4 story is functional but secondary |
| High-throughput Kubernetes ingress | Workloads where ingress is a bottleneck | 42K RPS per benchmark, lowest p99 across competitors | 10K+ pods, hundreds of services per cluster | Nginx ingress retiring March 2026; Envoy uses more CPU at the same throughput |
| Active-passive load balancer for legacy enterprise apps | Banks, telcos with stateful TCP services in front of WebSphere or similar | Mature health checks, session persistence, predictable failover | Hundreds of backends per LB, multi-decade uptime expectations | F5 hardware LB costs more; Nginx requires Plus tier for advanced health checks |
| Edge LB for trading platforms | Low-latency financial venues (for example, Deutsche Bank-style deployments) | Lowest tail latency under heavy concurrency, predictable jitter | Sub-ms p99 at 50K+ concurrent connections | Envoy's filter chain adds latency; Nginx tail latency degrades faster under stress |
| SSL/TLS terminator front end with WAF integration | Compliance-heavy industry (PCI, HIPAA) needing rigorous TLS termination | FIPS-validated builds (HAProxy Enterprise), tight observability of TLS handshake metrics | 5-20K TLS handshakes/sec with full mTLS | Nginx Plus is comparable but more expensive; OpenResty-based Kong adds Lua overhead on the TLS hot path |
Envoy
| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
|---|---|---|---|---|
| Service mesh data plane | Istio, Consul Service Mesh, AWS App Mesh users | xDS dynamic config, mTLS, identity-based routing, traffic shifting | 10K+ sidecar instances per mesh, sub-second config propagation | Linkerd2-proxy is leaner but less feature-rich; Nginx mesh products are second-tier |
| API gateway with dynamic routing | Cloud-native companies (Lyft was the origin), platform teams building self-service APIs | Filter chain extensibility (WASM, ext_proc), HTTP/3, gRPC streaming | 100K RPS, hundreds of upstream clusters, route changes per minute | Kong is more abstraction-heavy but easier to onboard; HAProxy lacks dynamic config story |
| AI inference gateway | Teams routing across LLM providers (OpenAI, Anthropic, internal models) | Per-model rate limiting, token-aware quotas, ExtProc for prompt processing, semantic routing | Tens of thousands of LLM requests/sec, multi-second response streaming | Kong AI plugins are catching up; native ext_proc integration in Envoy AI Gateway is currently ahead |
| Multi-region edge gateway | Global SaaS routing to nearest region with health-aware failover | Locality-aware load balancing, outlier detection, circuit breakers built-in | 50+ regions, p99 routing decision under 1ms | HAProxy lacks built-in locality awareness; Nginx requires plugins/scripting |
| Observability-heavy enterprise gateway | SREs needing per-request tracing with OpenTelemetry to Tempo or Jaeger | Native OpenTelemetry exporters, distributed tracing built into the request lifecycle | 10K services with full trace propagation, 1B+ spans/day | HAProxy tracing is newer and less feature-complete; Nginx tracing requires plugins |
Pingora
| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
|---|---|---|---|---|
| Hyperscale edge proxy | Cloudflare's own edge network | Lock-free connection pool across threads, 70% memory and 60% CPU reduction vs Nginx | 40M+ HTTP req/sec globally, 1T+ requests/day | Nginx hit per-worker isolation ceiling; Envoy is more memory-heavy at the same throughput |
| Custom proxy product | CDN companies, security vendors building proxy-as-a-product | Rust library/framework model gives full control over request lifecycle | Varies by product; typically 10K-1M+ RPS per node | Building on Nginx or Envoy means fighting their config models; Pingora is designed to be embedded |
| Memory-safety-critical edge | Security vendors, government infrastructure, compliance-heavy industries | Rust eliminates entire classes of memory-safety CVEs by construction | Comparable to Nginx workloads with stronger CVE posture | C-based proxies carry residual memory-safety risk; auditing alternatives (BoringSSL, sandboxing) add complexity |
| Programmable per-request logic at scale | Internal platforms where every request hits custom routing, auth, transformation logic | Rust callbacks at every phase with zero overhead versus dynamic config | 50K-500K RPS with custom logic on every request | OpenResty/Lua hits event-loop ceiling; Envoy WASM has per-call overhead |
| Ultra-low-latency financial proxy | Trading platforms, market data distribution | No GC pauses, predictable Rust async runtime, lock-free hot paths | Sub-100us p99 proxy overhead at 100K+ msg/sec | Go-based proxies (Traefik) have GC pauses; C++ Envoy has unpredictable tail latency from filter chains |
Kong
| Use Case | Company / Scenario | Driving Property | Scale Dimension | Why Not Alternative |
|---|---|---|---|---|
| Multi-tenant REST API gateway | SaaS exposing APIs to external developers with per-tenant rate limits, OAuth, observability | Native Consumer, Service, Route, Plugin abstractions match the problem domain | Hundreds to thousands of consumers, hundreds of services | Envoy requires building the abstractions yourself; raw Nginx requires Lua scripting for everything |
| Internal developer platform gateway | Platform team offering self-service API publishing for internal teams | Admin API + decK GitOps + plugin marketplace covers common needs out of the box | 50-500 internal services, ops team of 2-5 engineers | Envoy AI Gateway is the modern alternative but newer; Nginx + custom tooling is more work |
| OIDC / OAuth2 termination point | Enterprise integrating with Okta, Auth0, Azure AD for API access control | Kong Enterprise OIDC plugin handles full OAuth2 flow including refresh, introspection | 10K+ tokens/sec verified, sub-10ms auth overhead | Envoy ext_authz needs an external auth service; Nginx + Lua requires custom OIDC implementation |
| Multi-LLM AI gateway (2024-2026 use case) | Teams routing across OpenAI, Anthropic, Bedrock, on-prem LLMs with cost controls | AI plugins handle prompt guards, semantic cache, multi-LLM routing, token-aware billing | 1K-10K LLM requests/sec, multi-provider failover | Envoy AI Gateway is the direct competitor; choice depends on existing infrastructure |
| Hybrid mode: control plane in central cluster, data plane at edge | Enterprise with global presence wanting central policy + edge data plane | Kong hybrid mode separates CP and DP, config flows via mTLS | Dozens of edge clusters, central control plane | Envoy + custom control plane is more work; Nginx lacks hybrid mode entirely |
3. Limitations (multi-tech matrix)
Each cell describes how that limitation manifests in that tech and opens with a severity tag (High or Medium). Scan the tags for blockers in your context.
| Limitation Category | Nginx | HAProxy | Envoy | Pingora | Kong |
|---|---|---|---|---|---|
| Dynamic config without restart | High OSS requires reload; Nginx Plus has API | Medium Runtime API socket + Data Plane API works but feels grafted | Medium xDS native but requires control plane | Medium Programmable but code-level changes need redeploy | Medium Admin API native, hybrid mode adds complexity |
| Memory safety | High C-based, structural CVE class risk | High C-based, same residual risk | High C++17, better than C but not memory-safe | Medium Rust eliminates the class; unsafe blocks still possible | High C (Nginx) + Lua; carries Nginx's memory risk |
| HTTP/3 / QUIC maturity | Medium Stable but late to land | Medium HAProxy 3.0+ has HTTP/3 | Medium Production-grade since 2023 | Medium On 2025 roadmap, status varies by build | Medium Inherits Nginx's HTTP/3 maturity |
| Service mesh integration | High Not designed for mesh; F5 mesh products exist but second-tier | High Not a mesh data plane | Medium Native mesh data plane (Istio, Consul) | Medium Possible but requires custom integration | Medium Kong Mesh exists (built on Kuma/Envoy) |
| Filter / plugin extensibility | Medium OpenResty/Lua mature but event-loop bound | Medium Lua + SPOE + native filters; less rich than Envoy | Medium WASM, ext_proc, ext_authz, Lua, native C++ | Medium Rust callbacks; full control but no plugin ecosystem | Medium 300+ plugins, mix of OSS and enterprise |
| HTTP/1.1 lenient parsing for legacy clients | Medium Reasonably lenient, decades of accumulated tolerance | Medium Strict by default, configurable | High Strict RFC compliance can reject legacy clients | Medium Designed to handle Cloudflare's "bizarre" HTTP traffic; relatively lenient | Medium Inherits Nginx's leniency |
| Operational governance | High F5 ownership, community fork pressure (Freenginx, Angie), ingress-nginx retiring March 2026 | Medium HAProxy Technologies stable, predictable | Medium CNCF graduated, broad governance | Medium Cloudflare-led, Apache 2.0; pre-1.0 API stability not guaranteed | Medium Kong Inc. led; open core with significant enterprise lock-in |
| Per-instance memory at 50K+ idle conns | Medium Around 2-4 KB per conn, low | Medium Comparable to Nginx | High Higher per-conn overhead, around 4-8 KB plus thread overhead | Medium Best-in-class for high-concurrency; Cloudflare's whole point | Medium Nginx base + Lua VM overhead per worker |
| Learning curve for new engineers | Medium Familiar to most ops engineers; the if directive's gotchas catch newcomers | Medium ACL syntax has a learning curve but well-documented | High xDS, YAML structure, filter chains, control plane: steep | High Rust async + Tokio + Pingora traits; very steep | Medium Plugin model is approachable; Lua development is niche |
4. Fault Tolerance (multi-tech matrix)
Most cells describe how the proxy behaves under partial failure. RTO and RPO conventions are used in HA pair / cluster context.
| Dimension | Nginx | HAProxy | Envoy | Pingora | Kong |
|---|---|---|---|---|---|
| Process / instance model | Master + workers; worker crash respawned by master | Single process, multi-threaded since 1.8 | Single process, multi-threaded with per-thread isolation | Single process, async multi-threaded Rust | Nginx master + workers + Lua VM per worker |
| Failure detection (upstream) | Passive on request error; active via ngx_http_upstream_check_module or Plus | Active health checks built-in, configurable check intervals, layer 4/7 checks | Active + passive (outlier detection), configurable failure thresholds, statistical eviction | Programmable via Rust callbacks; whatever you implement | Active + passive checks, configurable per upstream |
| Failover mechanism | Mark upstream down on N failures, retry next in pool | Backup servers, weighted retry, configurable retry-on conditions | Outlier detection ejects host; circuit breaker prevents cascading failures | Application-defined via load balancer callback | Healthchecks + ring balancer; failed nodes excluded automatically |
| RTO (upstream failure) | 1-3s with health checks tuned; 5-10s with defaults | Sub-second with aggressive health check tuning | Sub-second with outlier detection | Depends on implementation, can be sub-second | 1-3s typical |
| RPO (data loss on proxy failure) | In-flight requests lost; clients retry | In-flight requests lost; client retry expected | Same; ext_proc state requires external store | Same; design choice for connection-affinity state | In-flight requests lost; rate-limit counters may lose precision |
| HA pair / cluster topology | Typically deployed behind keepalived (VRRP) or cloud LB; no native clustering | keepalived VRRP common; HAProxy 2.4+ has limited peer sync | Stateless; rely on external LB or DNS for HA | Stateless; you build the HA topology | Multi-node cluster with shared Postgres or DB-less + config push |
| Split-brain behavior | N/A — single instance per host; cluster is external | VRRP-based; can split-brain if network partition isolates LBs | Control plane (Istio) handles consistency; data plane is stateless | Application-defined; same constraints as a custom service | Postgres-backed: standard DB consistency; DB-less: last-write-wins per node until sync |
| Blast radius of single-instance failure | All requests on that instance fail; cloud LB or VRRP shifts traffic | Same; classic VRRP failover (1-3s) | Same; control plane reroutes via xDS | Same; depends on deployment topology | Same; cluster reroutes via shared config |
| Cross-region failover | External: GeoDNS, anycast, or upstream LB handles it | External: same as Nginx; no native multi-region | Native locality-weighted LB, priority-failover within a single cluster | Application-defined; can implement any policy | Hybrid mode supports multi-region control + local data planes |
| Data loss scenarios | Stats/access logs lost on crash before flush | Same; configurable buffer flush intervals | Same; admin endpoint stats lost on crash | Same; depends on logger implementation | Postgres-backed config persists; rate-limit counters in-memory may diverge briefly |
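The passive failover behavior in the matrix ("mark upstream down on N failures, retry next in pool") can be sketched in a few lines. This is a simplified model rather than any proxy's actual implementation: the class name `PassiveHealth` and its methods are hypothetical, the parameter names `max_fails` and `fail_timeout` are borrowed from Nginx's upstream settings for familiarity, and the clock is passed in explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PassiveHealth:
    """Hypothetical model: mark an upstream down after max_fails failures within fail_timeout seconds."""
    max_fails: int = 3
    fail_timeout: float = 10.0
    _failures: Dict[str, List[float]] = field(default_factory=dict)
    _down_until: Dict[str, float] = field(default_factory=dict)

    def record_failure(self, upstream: str, now: float) -> None:
        # Keep only the failures that are still inside the sliding window.
        window_start = now - self.fail_timeout
        recent = [t for t in self._failures.get(upstream, []) if t > window_start]
        recent.append(now)
        self._failures[upstream] = recent
        if len(recent) >= self.max_fails:
            # Threshold reached: take the upstream out of rotation for fail_timeout.
            self._down_until[upstream] = now + self.fail_timeout
            self._failures[upstream] = []

    def is_available(self, upstream: str, now: float) -> bool:
        return now >= self._down_until.get(upstream, 0.0)

    def pick(self, pool: List[str], now: float) -> Optional[str]:
        # Return the first upstream in pool order that is not marked down.
        for upstream in pool:
            if self.is_available(upstream, now):
                return upstream
        return None
```

With `max_fails=2`, two failures on the first upstream inside the window shift traffic to the next one until `fail_timeout` elapses. Detection is only as fast as real traffic produces errors, which is why the RTO row distinguishes passive defaults from tuned active checks.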
6. State & Config Replication (multi-tech matrix)
For proxies, "replication" maps to two things: (1) configuration replication across data plane instances, (2) runtime state replication (stick tables, rate limit counters, session data).
| Dimension | Nginx | HAProxy | Envoy | Pingora | Kong |
|---|---|---|---|---|---|
| Config replication topology | File-based; GitOps (Ansible, ConfigMap) pushes to all instances | Same file-based; orchestration handles distribution | xDS control plane fans out to data planes; gRPC streaming | Application choice; you build the distribution mechanism | Postgres (DB mode) or hybrid mode (CP pushes to DPs via mTLS) |
| Sync vs async config propagation | Async via reload; some seconds between push and effect | Same; or socket-based runtime API for sync updates | Near-real-time async via xDS streaming; State-of-the-World (SotW) or Delta xDS | Async via redeploy or hot reload | DB-backed: 5-10s polling; hybrid: gRPC streaming, near-real-time |
| Replication factor (config copies) | One per instance; no shared source of truth in OSS Nginx | Same as Nginx | One control plane state, fanned out to N data planes | Application-defined | Postgres = one source of truth; replicated to all DPs |
| Consistency level (config across instances) | Eventually consistent (depends on rollout speed) | Same | Eventually consistent within seconds; configurable via ADS for strong ordering | Application choice | Eventually consistent; DB mode has stronger ordering than hybrid push |
| Config propagation lag (typical) | Seconds to minutes (deployment pipeline-bound) | Same | Sub-second via xDS streaming on healthy mesh | Depends on implementation | 5-10s DB-backed; sub-second hybrid mode |
| Runtime state replication (rate limit, sessions) | External: Redis, memcached | peers / stick-tables for limited sync; external for serious workloads | External: ratelimit service, Redis | External: you build it | Redis-backed plugins; per-node counters in DB-less mode |
| Conflict resolution on config divergence | N/A — single source via GitOps; last-deploy-wins | Same | Control plane is source of truth; data plane has no conflict | Application-defined | Postgres-backed: DB version wins; DB-less: file version wins |
| Multi-region config replication | External: Git mirroring, multi-region ConfigMaps | Same | Cross-region Istio control planes (multi-cluster mesh) supported | Application-defined | Hybrid mode is purpose-built for this (central CP, regional DPs) |
| Behavior during control plane partition | N/A — file-based, no live control plane | N/A — same | Data plane continues with last-good config; new config blocked | Application-defined; common: last-good behavior | Data plane continues; admin API unavailable; new policy changes blocked |
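The "data plane continues with last-good config" behavior in the last row is worth seeing in miniature. The sketch below is a toy model, not the xDS or Kong hybrid-mode protocol: `ConfigSnapshot`, `DataPlane`, and the monotonically increasing `version` field are all hypothetical names chosen for illustration, and a push of `None` stands in for an unreachable control plane.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfigSnapshot:
    """Hypothetical versioned config as pushed by a control plane."""
    version: int
    routes: Dict[str, str]


class DataPlane:
    """Hypothetical data plane that always serves from the last accepted snapshot."""

    def __init__(self) -> None:
        self.active: Optional[ConfigSnapshot] = None

    def apply(self, snapshot: Optional[ConfigSnapshot]) -> bool:
        """Try to apply a pushed snapshot; return True if it became active."""
        if snapshot is None:
            # Control plane unreachable: keep serving the last-good config.
            return False
        if self.active is not None and snapshot.version <= self.active.version:
            # Stale or duplicate push: ignore it so config never moves backwards.
            return False
        self.active = snapshot
        return True

    def route(self, path: str) -> Optional[str]:
        if self.active is None:
            return None
        return self.active.routes.get(path)
```

The design choice to notice is that a partition blocks new policy but never blanks existing routing, which matches the behavior the table describes for Envoy and Kong data planes.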
7. Better Usage Patterns (per technology)
PE-grade patterns most teams miss until on-call. This is where the artifact earns its keep.
Nginx
| Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters |
|---|---|---|---|
| Avoid if inside location blocks | Use if for conditional redirects and rewrites, then hit Nginx's broken-by-design behavior | Use map, error_page, or restructure to multiple location blocks | Silent traffic loss when the if evaluates differently than expected; "if is evil" is official guidance for a reason |
| Microcaching for hot paths | Disable caching entirely because backend is "dynamic" | Enable 1-5 second microcache for backend responses; even tiny TTLs absorb traffic storms | Order-of-magnitude reduction in backend load during traffic spikes with negligible staleness |
| Tune worker_connections per ulimit -n | Leave default 1024 and wonder why connections drop at scale | Set worker_connections to match system ulimit -n minus headroom, set worker_rlimit_nofile explicitly | Without this, you cap throughput far below what the box could handle |
| Separate stats endpoint from public traffic | Expose /nginx_status or Prometheus exporter on the same listener as public traffic | Internal listener on a different port, locked to internal IPs; never on the public listener | Avoids information leak and stats endpoint becoming a DDoS amplification surface |
| Reload-safe upstream changes | Hot-reload during traffic spikes assuming graceful drain | Use Nginx Plus dynamic API or consul-template with quiet windows; avoid reload storms | Reload spawns new workers and doubles memory transiently; on undersized boxes this is OOM territory |
| Use proxy_next_upstream conservatively | Set to retry on all errors and timeouts | Retry only on specific conditions (connection refused, timeout); avoid retrying 5xx by default | Aggressive retries amplify backend outages instead of mitigating them; classic retry storm anti-pattern |
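The worker_connections sizing rule in the table comes down to simple arithmetic. The sketch below is an illustration, not an official Nginx formula: the function name and the 256-descriptor headroom are assumptions. The halving step reflects that worker_connections counts connections to proxied servers as well as client connections, so a reverse-proxied request consumes two slots.

```python
from typing import Dict


def size_worker_connections(nofile_limit: int, headroom: int = 256) -> Dict[str, int]:
    """Derive per-worker settings from the file-descriptor limit (ulimit -n).

    headroom is an assumed reserve for log files, listen sockets, and similar.
    """
    if nofile_limit <= headroom:
        raise ValueError("file-descriptor limit must exceed the headroom")
    worker_connections = nofile_limit - headroom
    return {
        "worker_rlimit_nofile": nofile_limit,
        "worker_connections": worker_connections,
        # Each proxied client uses one downstream and one upstream connection.
        "max_proxied_clients_per_worker": worker_connections // 2,
    }


print(size_worker_connections(65536))
```

For a limit of 65536 this yields worker_connections of 65280 and roughly 32640 concurrently proxied clients per worker, far above what the packaged default of 1024 allows.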
HAProxy
Default for pure load balancing workloads where throughput ceiling and sub-ms tail latency matter more than dynamic routing, plugin extensibility, or service mesh integration.
| Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters |
|---|---|---|---|
| Use Runtime API instead of reload for dynamic changes | Reload on every config change, lose in-flight connections | Use stats socket or Data Plane API to add/drain servers without reload | Reload is graceful but not free; for high-churn topologies, Runtime API is the right tool |
| Aggressive health-check tuning for failover | Leave default 2s intervals, accept slow failover | Sub-second intervals (200-500ms) with fall/rise thresholds tuned per backend characteristics | Failover time is dominated by detection time; tuning this is the highest-ROI HAProxy optimization |
| Stick-tables only for what truly needs them | Use stick-tables for analytics, abuse detection, everything | Reserve stick-tables for session persistence; push analytics to external systems | Stick-tables are in-memory and don't replicate well; misuse leads to memory pressure and data loss on restart |
| Use ACLs for explicit, scannable routing | Long lists of use_backend with embedded conditions, hard to audit | Define named ACLs, then route by ACL combinations; treat ACLs as building blocks | Routing logic becomes auditable, testable, and maintainable; reduces "what does this config actually do" cognitive load |
| Thread pinning or dedicated instances for TLS offload at scale | One unpinned process handling both TLS and L7 routing | Multi-thread (since 1.8) with thread pinning; or separate HAProxy instances for TLS termination | TLS offload is CPU-bound; isolating it from L7 routing prevents one workload from starving the other |
| Use observe + on-error for connection-level circuit breaking | Rely only on health checks for failure detection | Add observe layer7 error-limit N on-error mark-down per server | Active health checks have detection lag; passive observation catches failures the second they happen |
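The claim that failover time is dominated by detection time can be checked with back-of-the-envelope arithmetic. The sketch below is a rough estimate under stated assumptions: the function name is hypothetical, check timeouts and network delay are ignored, and a server is assumed to be marked down after `fall` consecutive failed checks spaced `inter` apart. The first call uses the 2s interval from the table together with a fall threshold of 3.

```python
def detection_time_ms(inter_ms: int, fall: int) -> int:
    """Approximate worst-case time to mark a server down via active checks.

    Assumes the server fails right after a successful check, so `fall`
    consecutive failed checks are needed, each `inter_ms` apart.
    """
    if inter_ms <= 0 or fall <= 0:
        raise ValueError("inter_ms and fall must be positive")
    return inter_ms * fall


default_ms = detection_time_ms(inter_ms=2000, fall=3)
tuned_ms = detection_time_ms(inter_ms=250, fall=2)
print(f"2s interval, fall 3: ~{default_ms} ms; 250ms interval, fall 2: ~{tuned_ms} ms")
```

Under these assumptions the untuned case takes about 6 seconds to detect a dead server, while the tuned case lands around half a second. Passive observation (the last row of the table) closes the remaining gap because it reacts to real traffic instead of waiting for the next probe.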
Envoy
Default for service meshes and AI inference gateways where xDS dynamic config, filter chain extensibility, and native OpenTelemetry justify the operational cost of running a control plane.
| Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters |
|---|---|---|---|
| Use Incremental (Delta) xDS, not SotW | Stay on default SotW config push for compatibility | Migrate to Delta xDS (default in Istio 1.22+); only changed resources are pushed | Config push storms on large meshes are dramatically reduced; control plane stays responsive under churn |
| Limit stats cardinality explicitly | Accept default stats sinks with all clusters and routes emitting metrics | Configure stats_config with inclusion list of match patterns | Stats subsystem can consume 4-6 GB in large deployments; explicit inclusion lists keep it bounded |
| Pick the right extension model per use case | Use WASM for everything because it's "the modern way" | Native filter for perf-critical logic, ext_proc for stateful and frequently-changing logic, WASM for stable logic that is not on the hot path | WASM has 20-40% latency overhead per call; using it for hot-path logic is needless cost |
| Set circuit breakers explicitly per cluster | Leave defaults, which are surprisingly permissive (1024 max connections) | Tune max_connections, max_pending_requests, max_retries per upstream's actual capacity | Circuit breakers prevent cascading failures; defaults will not protect you in a real incident |
| Outlier detection for fast eviction | Rely on active health checks alone | Configure outlier_detection with consecutive_5xx and success_rate thresholds | Active checks have intervals; outlier detection ejects in real-time on observed failures |
| Lock down the admin endpoint | Expose admin on 0.0.0.0:9901 because tutorials say so | Bind admin to 127.0.0.1 only; expose specific endpoints via secure proxy if needed | Admin endpoint exposes config_dump, stats, and full Envoy state; public exposure is a security incident waiting to happen |
| Use locality-weighted load balancing | Round-robin across all upstreams regardless of locality | Configure locality_weighted_lb_config with priority-based failover | Cross-AZ or cross-region traffic costs money and adds latency; locality awareness reduces both |
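To make the circuit-breaker row concrete, here is a toy admission model. It is a deliberately simplified sketch, not Envoy's implementation: the classes are hypothetical, only the two limits named in the table (`max_connections` and `max_pending_requests`) are modeled, and each request is assumed to need its own connection.

```python
from dataclasses import dataclass


@dataclass
class ClusterLimits:
    """Per-cluster thresholds, named after the settings in the table above."""
    max_connections: int
    max_pending_requests: int


@dataclass
class ClusterState:
    """Hypothetical bookkeeping for one upstream cluster."""
    limits: ClusterLimits
    active_connections: int = 0
    pending_requests: int = 0

    def admit(self) -> str:
        # Prefer a connection slot, then the pending queue, else fail fast.
        if self.active_connections < self.limits.max_connections:
            self.active_connections += 1
            return "connected"
        if self.pending_requests < self.limits.max_pending_requests:
            self.pending_requests += 1
            return "pending"
        return "overflow"


cluster = ClusterState(ClusterLimits(max_connections=2, max_pending_requests=1))
print([cluster.admit() for _ in range(4)])
```

With limits of 2 and 1, the fourth request overflows immediately instead of piling onto a struggling upstream. A permissive limit such as 1024 connections behaves the same way in principle, but on a backend that saturates at a few dozen connections it would never trip in time, which is the point of tuning per cluster.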
Pingora
Default for teams building a custom proxy product in Rust, where the lock-free cross-thread connection pool and memory safety by construction justify the absence of declarative config.
| Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters |
|---|---|---|---|
| Treat the framework as a library, not a server | Try to find pingora.conf and configure it like Nginx | Embrace the Rust callback model; structure your proxy as a service crate with explicit phases | Pingora's value is in the programmability; fighting it for declarative config defeats the purpose |
| Leverage shared connection pool deliberately | Treat connection pooling as a black box | Tune pool sizing per upstream; monitor pool stats to verify reuse is happening | The lock-free hot pool is Pingora's headline feature; without monitoring, you can't tell if you're getting the value |
| Use the Service / Server model for multi-tenant proxies | One monolithic ProxyHttp per use case | Multiple Service instances within one Server, each isolated; share crypto and conn pool | Resource efficiency improves; failure isolation between tenants becomes natural |
| Async-aware error handling and timeouts | Bubble errors up via ? without phase-aware context | Capture timeouts and errors per phase (upstream_peer, response_filter, etc.) and return structured errors with status codes | Debugging async chains is hard; structured per-phase errors are the difference between solvable incidents and 4am pages |
| Cache upstream selection decisions | Recompute upstream selection on every request | Use upstream_peer with internal state to cache routing decisions per session | Reduces per-request CPU; matters most for high-RPS workloads with stable routing |
| Monitor Tokio runtime metrics, not just request metrics | Watch only request-level p99 latency | Also watch tokio-console / runtime metrics: worker utilization, task wakeups, blocking time | Async runtime issues (blocking on a sync call, task starvation) appear in runtime metrics before they surface as user-visible latency |
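The per-phase error handling pattern is language-independent, so it can be illustrated without Pingora itself. The sketch below uses Python's asyncio as an analogue and is not Pingora's API: `PhaseError` and `run_phase` are hypothetical names, the phase label `upstream_peer` is taken from the table, and the 504/502 status mapping is an assumption.

```python
import asyncio
from typing import Any, Awaitable


class PhaseError(Exception):
    """Hypothetical structured error that records which phase failed and with what status."""

    def __init__(self, phase: str, status: int, detail: str) -> None:
        super().__init__(f"{phase}: {detail} (status {status})")
        self.phase = phase
        self.status = status
        self.detail = detail


async def run_phase(phase: str, work: Awaitable[Any], timeout_s: float) -> Any:
    """Run one phase with its own timeout and wrap any failure with phase context."""
    try:
        return await asyncio.wait_for(work, timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        # Assumed mapping: a phase timeout becomes a gateway timeout.
        raise PhaseError(phase, 504, f"timed out after {timeout_s}s") from exc
    except PhaseError:
        raise
    except Exception as exc:
        # Assumed mapping: any other failure becomes a bad gateway.
        raise PhaseError(phase, 502, repr(exc)) from exc


async def slow_peer_lookup() -> str:
    await asyncio.sleep(1.0)
    return "peer-a"


try:
    asyncio.run(run_phase("upstream_peer", slow_peer_lookup(), timeout_s=0.01))
except PhaseError as err:
    print(err.phase, err.status)
```

The error that reaches the operator names the phase and carries a status code, instead of an anonymous timeout bubbling out of a chain of awaits.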
Kong
Default for REST API gateways where auth (OIDC, JWT), rate limiting, request transformations, and AI routing are needed from a plugin marketplace rather than built from scratch.
| Pattern | What Most Teams Do Wrong | The Better Way | Why It Matters |
|---|---|---|---|
| Use DB-less mode for new deployments | Default to Postgres-backed Kong because tutorials use it | Use DB-less with declarative config (decK + Git); reserve DB mode for legacy multi-node needs | Postgres is a critical dependency that adds operational surface; DB-less is simpler and well-supported in 2026 |
| Order plugins from light to heavy | Keep the default plugin order even when heavy plugins (transformations) run before light filters (ACL, IP restriction) | Place early-termination plugins (ACL, IP restriction, rate limiting) first; heavy plugins last | Failed requests should reject fast and cheap; running heavy plugins on requests that get blocked downstream wastes CPU |
| Profile and limit Lua plugin CPU | Enable any plugin that looks useful; assume Lua is "fast enough" | Profile plugins under representative load; cap concurrent Lua execution per worker | One CPU-bound Lua plugin can saturate a worker and degrade p99 for unrelated traffic on the same worker |
| Use hybrid mode for multi-region deployments | Run separate Kong clusters per region with manual config sync | Central control plane + regional data planes via Kong hybrid mode (mTLS, gRPC config push) | Single source of truth, faster config propagation, lower operational burden than independent clusters |
| Reserve workspaces for true tenant isolation | Use workspaces for arbitrary organizational grouping | Use workspaces when teams need RBAC + config isolation; use tags for non-isolated grouping | Workspaces add admin API complexity; using them for cosmetic grouping makes everyday admin tasks harder |
| Plan for the enterprise upgrade decision upfront | Build on OSS Kong, discover OIDC / advanced rate limit / FIPS requirements later, scramble to upgrade | Evaluate the enterprise feature list during initial architecture; budget for it if requirements are likely | Reactive enterprise upgrades are more expensive than planned ones; pricing pressure increases under deadline |
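The "reject fast and cheap" argument behind light-to-heavy plugin ordering can be quantified with an expected-cost calculation. The sketch below is plain arithmetic, not a Kong API: the function name, the plugin costs in microseconds, and the rejection probability are all made-up numbers for illustration.

```python
from typing import List, Tuple

# (plugin name, CPU cost in microseconds, probability the plugin rejects the request)
Plugin = Tuple[str, float, float]


def expected_cost_us(chain: List[Plugin]) -> float:
    """Expected CPU cost per request when plugins run in order and a rejection stops the chain."""
    expected = 0.0
    reach_probability = 1.0  # chance a request gets as far as the current plugin
    for _name, cost_us, reject_probability in chain:
        expected += reach_probability * cost_us
        reach_probability *= 1.0 - reject_probability
    return expected


acl: Plugin = ("acl", 10.0, 0.25)                        # cheap, rejects 25% of requests
transform: Plugin = ("request-transformer", 200.0, 0.0)  # expensive, never rejects

print(expected_cost_us([acl, transform]))  # light first
print(expected_cost_us([transform, acl]))  # heavy first
```

With these illustrative numbers, running the cheap rejecting plugin first costs 160 microseconds per request on average versus 210 the other way round. The gap widens as the rejection rate or the cost of the heavy plugin grows.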
8. Advanced / Next-Gen Alternatives
Where the field is moving. Maturity badges: Production = ready for adoption, Emerging = watch and pilot, Early = bet only if you're the case study.
Nginx
| Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider |
|---|---|---|---|---|
| Pingora | Memory safety, connection pool reuse across threads, 70% memory + 60% CPU savings at scale | Emerging | High — Rust rewrite of all custom logic, no config-file path | Edge workloads at planetary scale, especially TLS-heavy; teams with Rust capacity |
| Angie (fork) | Drop-in Nginx replacement with active development from former Nginx core team; faster cadence on features (HTTP/3, ACME, dynamic upstreams) | Production | Low — wire-compatible, drop-in replacement for nginx.conf | When you want Nginx-the-tool without F5 governance; Russian-company ownership is a factor to evaluate |
| Freenginx (fork) | Community-led, lighter feature drift from upstream Nginx; security-policy alignment with original developers | Early | Low — drop-in | When you want minimal drift but escape F5 governance; smaller community than Angie |
| F5 NGINX Ingress Controller 5.x | Direct migration path from community ingress-nginx (which retires March 2026) | Production | Medium — Ingress API mostly compatible but feature differences exist | Kubernetes clusters running ingress-nginx today; this is the official migration path |
HAProxy
| Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider |
|---|---|---|---|---|
| Envoy | Dynamic xDS config, service mesh integration, native filter chain, observability | Production | High — different config model, control plane required | When dynamic backends and service mesh become primary requirements |
| Pingora | Memory safety, programmability for custom proxy products | Emerging | High — Rust rewrite | Building a proxy product rather than deploying a proxy |
| Kernel-bypass options (DPDK-based, XDP-based) | Sub-microsecond L4 forwarding for ultra-low-latency trading and edge | Emerging | Very High — requires DPDK or eBPF expertise, NIC and kernel coupling | Trading systems and CDN edge where every microsecond counts; specialized hardware and team |
Envoy
| Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider |
|---|---|---|---|---|
| linkerd2-proxy (Rust) | Memory safety, simpler operational model, lighter resource footprint than Envoy sidecar | Production | High — Linkerd mesh is a full alternative, not drop-in | When the Envoy + Istio operational complexity exceeds team capacity; Linkerd is the simpler mesh |
| Envoy Gateway / Envoy AI Gateway | Higher-level abstractions over raw Envoy, Kubernetes Gateway API support, AI-specific features (ext_proc for prompt processing, token-aware rate limiting) | Emerging | Low — built on Envoy, easier onboarding | Greenfield gateways, especially AI inference gateways; this is where the Envoy ecosystem is investing in 2026 |
| Pingora | Memory safety, programmability, lower per-instance resource cost | Emerging | Very High — Rust rewrite, no mesh integration story | Custom proxy products at scale; not a service mesh replacement |
| Cilium Service Mesh (eBPF) | Sidecar-less mesh via eBPF, lower overhead per pod, kernel-level enforcement | Emerging | Medium — Cilium CNI replacement, mesh features layered on top | Kubernetes clusters where Cilium is already the CNI; mesh-without-sidecar is an attractive simplification |
Pingora
| Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider |
|---|---|---|---|---|
| ISRG River (built on Pingora) | Higher-level reverse proxy abstraction over Pingora; binary product rather than framework | Early | Low — when ready, intended as a Pingora-powered Nginx alternative | When you want Pingora's memory safety without writing Rust application code |
| Rama (Tower-based Rust proxy) | Layered middleware model (Tower / tower-http ecosystem); more familiar to Rust web devs | Early | Medium — different API model from Pingora; same Rust foundation | When you prefer Tower's layered middleware over Pingora's callback model |
| hyper + custom Rust proxy | Maximum flexibility, lowest abstraction; raw HTTP library | Production | Very High — you build everything Pingora gives you for free | Niche custom requirements where Pingora's framework doesn't fit |
Kong
| Successor / Alternative | What It Improves | Maturity | Migration Cost | When To Consider |
|---|---|---|---|---|
| Envoy AI Gateway | Native ext_proc integration, per-model rate limiting, deeper Envoy ecosystem integration, no Lua event-loop ceiling | Emerging | Medium — different abstractions; Gateway API for routing | AI inference gateway use cases where Envoy is the architectural fit |
| Apache APISIX | etcd-based config (no Postgres), claims 2x Kong throughput in benchmarks, fewer enterprise-locked features | Production | Medium — similar plugin model, conceptually similar | When Kong's Postgres dependency or enterprise pricing is a problem; APISIX is the closest direct alternative |
| Tyk | Go-based, smaller resource footprint, simpler architecture | Production | Medium — different config model | When you want a lighter API gateway with fewer plugin dependencies |
| Cloud-managed API gateways (AWS API Gateway, Apigee, Azure APIM) | No infrastructure to operate; pay-per-request pricing | Production | Medium-High — vendor lock-in, different config and observability | When operational simplicity trumps cost predictability and feature flexibility; works well for low-to-mid traffic gateways |