NGINX — The Front Door
An app server that talks directly to the internet is an app server with too many jobs. NGINX takes the dirty work — TLS, static files, gzip, slow clients, rate limits, websocket upgrades, request buffering — and lets your app focus on its actual logic. Master server blocks, location matching, upstream pools, proxy_pass, caching, and the small set of tuning knobs that turn 'works on my machine' into 'serves 50k requests per second on a t3.small.'
What you will learn
You can technically point DNS at a Node app on port 80 and call it production. People do this every day, and they spend the next month learning why nobody does it on purpose. The app server is good at running your code; it is bad — and not its fault — at TLS termination, slow client buffering, gzipping responses, serving 80 MB of static images, holding 10,000 idle keepalive connections, and rate-limiting a botnet. NGINX is the small, fast process, written in C, whose entire job is being good at those things, so your app can do the one thing only it can do: run your code.
Why a Reverse Proxy
A reverse proxy is just a server that takes inbound requests and forwards them to other servers, returning their responses. From the client's point of view it is the service. From the backend's point of view, the proxy is the only client it ever speaks to. Putting NGINX in front of your app — even for a single-box deployment — buys you a stack of features for almost no operational cost.
What you get for free by adding NGINX
- TLS termination in a battle-tested implementation, with sane defaults you don't have to maintain in your app.
- Slow-client buffering. A user on a 2G connection takes seconds to send a POST body. NGINX buffers the request and hands the complete one to your app — your app's worker is busy for milliseconds, not seconds.
- Static file serving via sendfile straight from disk, ~10× cheaper than going through your app stack.
- Compression. One gzip on; turns 300 KB of JSON into 30 KB of bytes-on-the-wire.
- Caching for endpoints that can be cached, with stale-while-revalidate so a slow upstream doesn't break the page.
- Rate limiting and connection limiting at the edge — the cheapest, dumbest brake on a hostile flood.
- HTTP/2 and HTTP/3 upgrades on your behalf; the backend talks plain HTTP/1.1.
- Health-checked load balancing across multiple backends or workers.
- Graceful reloads — change config, send SIGHUP, no dropped connections.
The Configuration Model
NGINX configuration is a tree of nested contexts. Each context introduces directives that apply within it; many directives inherit downward unless overridden. Reading a config means knowing which context you're in.
- main — process-level. Worker count, run-as user, master pid file.
- events — per-worker connection settings.
- http — global HTTP behaviour: log format, gzip, timeouts, MIME types.
- server — one virtual host. Bind address, server_name, TLS cert paths.
- location — matches the request URI within a server. The path routing layer.
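The nesting is easier to see as a skeleton. The sketch below is a minimal, illustrative layout rather than a production config; the user name, server_name, and backend address are placeholder assumptions.

```nginx
# main context: process-level settings
user www-data;                  # assumption: Debian/Ubuntu-style user name
worker_processes auto;
pid /run/nginx.pid;

events {
    # events context: per-worker connection settings
    worker_connections 1024;
}

http {
    # http context: defaults inherited by every server below
    include mime.types;
    gzip on;

    server {
        # server context: one virtual host
        listen 80;
        server_name example.com;                # placeholder host

        location / {
            # location context: routing within this host
            proxy_pass http://127.0.0.1:3000;   # placeholder backend
        }
    }
}
```

A directive such as gzip set in http applies to every server and location beneath it unless one of them overrides it.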
A working starter config
upstream app_pool {
    least_conn;
    server 127.0.0.1:3000 max_fails=3 fail_timeout=10s;
    server 127.0.0.1:3001 max_fails=3 fail_timeout=10s backup;
    keepalive 32;
}

server {
    listen 80;
    server_name acme.com www.acme.com;
    return 301 https://acme.com$request_uri;  # canonical, force HTTPS
}

server {
    listen 443 ssl http2;  # on NGINX 1.25.1+ prefer "listen 443 ssl;" plus "http2 on;"
    server_name acme.com;
    ssl_certificate /etc/letsencrypt/live/acme.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/acme.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    access_log /var/log/nginx/acme.access.log;
    error_log /var/log/nginx/acme.error.log warn;
    gzip on;
    gzip_types text/plain text/css application/json application/javascript image/svg+xml;

    # Static assets: served straight off disk
    location /static/ {
        alias /var/www/acme/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
        try_files $uri =404;
    }

    # Everything else proxies to the app pool
    location / {
        proxy_pass http://app_pool;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # needed for the upstream keepalive to take effect
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 5s;
        proxy_send_timeout 30s;
        proxy_read_timeout 30s;
        proxy_buffering on;
    }
}
Read it once and you'll see the whole pattern: an upstream pool, a redirect-to-HTTPS server, a TLS server with a static block and a proxy block. Ninety percent of production NGINX configs are variations on this.
Server Blocks — Multiple Sites on One Box
NGINX picks which server block handles a request by matching listen + server_name against the inbound Host header. You can host many domains on one IP because the HTTP Host header (and TLS SNI) carries which one the client wants.
- Exact match wins first: server_name acme.com;.
- Leading wildcard: server_name *.acme.com;.
- Regex: server_name ~^(?<tenant>.+)\.acme\.com$;, with the capture exposed as $tenant.
- Default server: the one with default_server on its listen handles unmatched Host headers — typically returns 444 (close connection) to deny noise.
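As a hedged illustration of the regex form, the sketch below passes the captured subdomain to the backend. The X-Tenant header name and the backend address are hypothetical, chosen only for the example.

```nginx
server {
    listen 80;
    # Named capture: a request for "blue.acme.com" sets $tenant to "blue"
    server_name ~^(?<tenant>.+)\.acme\.com$;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Tenant $tenant;   # hypothetical header name
        proxy_pass http://127.0.0.1:3000;    # placeholder backend
    }
}
```

If an exact or wildcard server_name elsewhere also matches the host, it takes priority over this regex block.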
Drop unmatched hosts
server {
    listen 80 default_server;
    listen 443 ssl default_server;
    ssl_certificate /etc/nginx/ssl/snakeoil.pem;
    ssl_certificate_key /etc/nginx/ssl/snakeoil.key;
    return 444;  # nginx-specific: close without a response
}
Location Matching — The Routing Layer
Inside a server, locations match against the request URI. The matching rules are a small DSL:
| Modifier | Meaning | Example |
|---|---|---|
| = | Exact match (fastest) | location = /healthz |
| ^~ | Prefix; if it is the longest matching prefix, skip regex | location ^~ /static/ |
| ~ | Case-sensitive regex | location ~ \.php$ |
| ~* | Case-insensitive regex | location ~* \.(jpg\|png)$ |
| (none) | Plain prefix | location /api/ |
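Order of evaluation: an exact = match wins immediately. Otherwise NGINX finds the longest matching prefix, plain or ^~, and remembers it. If that longest prefix is marked ^~, it is used and regexes are skipped. If not, regexes are checked in file order and the first match wins; if no regex matches, the remembered longest prefix is used. Regex is order-dependent, so list the most specific first.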
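A small worked example may make the order concrete. The paths and backend address below are illustrative assumptions; the comments trace which block each request would land in under the rules just described.

```nginx
server {
    listen 80;

    location = /healthz      { return 200 "ok\n"; }                 # A: exact
    location ^~ /static/     { root /var/www/acme; }                # B: prefix that skips regex
    location ~* \.(jpg|png)$ { root /var/www/acme; expires 7d; }    # C: regex
    location /api/           { proxy_pass http://127.0.0.1:3000; }  # D: plain prefix
    location /               { proxy_pass http://127.0.0.1:3000; }  # E: catch-all prefix
}

# GET /healthz          -> A (exact match ends the search)
# GET /static/logo.png  -> B (longest prefix is ^~, so regex C is never tried)
# GET /api/avatar.png   -> C (longest prefix D is plain, so regexes run and C matches)
# GET /api/users        -> D (no regex matches; the longest prefix wins)
# GET /about            -> E
```

The third case is the one that tends to surprise people: a regex can take a request away from a plain prefix unless that prefix is marked ^~.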
proxy_pass changes everything. location /api/ { proxy_pass http://app/; } rewrites the URI — a request for /api/users arrives at the backend as /users. Without the trailing slash on proxy_pass, the URI is preserved and the backend sees /api/users. Pick deliberately and write a comment for the next reader.
Upstreams and Load Balancing
An upstream block names a pool of backends. proxy_pass in a location can target the pool name. Three load-balancing strategies cover the common cases:
- round-robin (the default; there is no directive to write) — even distribution.
- least_conn — send the next request to whoever has fewest in-flight. Good when request times vary.
- ip_hash — same client IP always goes to the same backend. The classical "sticky session" without cookie state. Use sparingly; it complicates scale-down.
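A minimal sketch of the sticky variant, assuming a hypothetical pool name and three local backends. Note that for IPv4 clients ip_hash keys on the first three octets of the address, and that a server taken out of rotation is better marked down than deleted so the mapping of the remaining clients is preserved.

```nginx
upstream app_sticky {             # hypothetical pool name
    ip_hash;
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002 down;   # temporarily removed; keeps the existing hash mapping stable
}
```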
Health checks (open-source vs commercial)
Open-source NGINX does passive health checks: it marks a backend as unavailable after max_fails failed attempts within fail_timeout, then stops sending it traffic for fail_timeout. NGINX Plus and the third-party nginx_upstream_check_module add active probes that hit a health endpoint on a schedule. For most teams, passive checks plus a separate health-monitoring system (Prometheus, the cloud LB, etc.) is fine.
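What counts as a failed attempt is governed by proxy_next_upstream. The sketch below is one plausible setup, not a recommendation: connection errors and timeouts always count, while treating 502 and 503 responses as failures is an opt-in assumption made for this example.

```nginx
upstream app_pool {
    # 3 failed attempts within 10s mark the server unavailable for 10s
    server 127.0.0.1:3000 max_fails=3 fail_timeout=10s;
    server 127.0.0.1:3001 max_fails=3 fail_timeout=10s;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_pool;
        # Outcomes that count as a failed attempt and trigger a retry on the next server.
        # "error timeout" is the default; http_502 and http_503 are added here by choice.
        proxy_next_upstream error timeout http_502 http_503;
        proxy_next_upstream_tries 2;   # cap the number of servers tried per request
    }
}
```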
Backend keepalive
By default, NGINX opens a fresh TCP connection to the backend for every request — wasteful when both sides live on the same box. keepalive 32; in the upstream tells NGINX to keep up to 32 idle connections per worker, reusing them across requests. Combine it with proxy_http_version 1.1; and proxy_set_header Connection ""; (which clears the Connection: close header NGINX would otherwise send) to make it work.
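Put together, the three pieces look roughly like this. It is a minimal sketch with a single placeholder backend.

```nginx
upstream app_pool {
    server 127.0.0.1:3000;
    keepalive 32;                        # idle connections kept open per worker
}

server {
    listen 80;

    location / {
        proxy_pass http://app_pool;
        proxy_http_version 1.1;          # keepalive needs HTTP/1.1 to the backend
        proxy_set_header Connection "";  # clear the default "Connection: close"
    }
}
```

Leaving out either of the two proxy directives makes the keepalive setting in the upstream silently ineffective.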
Performance Knobs That Matter
NGINX defaults are sane but not aggressive. The handful of settings that consistently move the needle:
worker_processes auto;          # match CPU count
worker_rlimit_nofile 65535;     # raise the FD ceiling

events {
    worker_connections 4096;
    multi_accept on;            # accept many new conns per loop iteration
    use epoll;                  # default on linux, but state it
}

http {
    sendfile on;                # zero-copy static files
    tcp_nopush on;              # group packets
    tcp_nodelay on;             # send small packets immediately
    keepalive_timeout 65;
    keepalive_requests 1000;    # the default since 1.19.10; older versions default to 100
    client_max_body_size 25m;   # match your largest legit upload
    client_body_buffer_size 1m;

    gzip on;
    gzip_min_length 1024;
    gzip_comp_level 5;          # 1=fast, 9=best ratio; 5 is the sweet spot
    gzip_types text/css application/javascript application/json image/svg+xml;

    open_file_cache max=10000 inactive=60s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;
}
The biggest wins for most sites: gzip on, sendfile on, open_file_cache, and keepalive_requests 1000. Before NGINX 1.19.10 the default keepalive_requests was 100, which closes the TCP connection after 100 requests; for an asset-heavy SPA that's connection churn nobody wants. Newer versions already default to 1000.
Caching — The Honest Free Lunch
NGINX's proxy_cache stores responses on local disk so identical subsequent requests don't hit the backend at all. For endpoints that are cacheable, this is the most impactful single config you can write.
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app_cache:50m
                 inactive=24h max_size=2g use_temp_path=off;

location / {
    proxy_pass http://app_pool;
    proxy_cache app_cache;
    proxy_cache_key $scheme$proxy_host$request_uri;
    proxy_cache_methods GET HEAD;
    proxy_cache_valid 200 302 10m;
    proxy_cache_valid 404 1m;
    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    proxy_cache_lock on;  # collapse cache stampedes
    add_header X-Cache-Status $upstream_cache_status;
}
Two ideas worth dwelling on:
- proxy_cache_use_stale serves stale content if the backend returns 5xx. Critical for keeping a site up during a flaky deploy.
- proxy_cache_lock ensures only one request actually goes to the backend on a cache miss; concurrent misses wait and reuse the result. Without this, a thundering-herd event can amplify a single miss into thousands of backend hits.
For per-user responses, include $cookie_session or $http_authorization in the cache key, or simply don't cache authenticated routes. Watch X-Cache-Status in dev to see what's hitting and what's bypassing.
Rate Limiting and Connection Limiting
Two lines stop a lot of damage. NGINX implements a leaky-bucket rate limiter keyed on any variable you choose, typically the client IP or a session identifier.
limit_req_zone $binary_remote_addr zone=basic:10m rate=20r/s;
limit_conn_zone $binary_remote_addr zone=conns:10m;

server {
    location /login {
        limit_req zone=basic burst=10 nodelay;   # 20 r/s + small burst, then 503
        proxy_pass http://app_pool;
    }
    location /api/ {
        limit_req zone=basic burst=40 delay=20;  # smooth, not strict
        limit_conn conns 10;                     # max 10 simultaneous per IP
        proxy_pass http://app_pool;
    }
}
- burst sets the bucket size; requests above the rate accumulate up to burst, beyond that NGINX returns 503.
- nodelay processes the burst immediately; delay=N lets the first N excess requests through at once and paces the rest of the burst at the configured rate, which is gentler on legitimate clients.
- Behind a real LB or CDN, $binary_remote_addr is the LB's IP — useless. Use $http_x_forwarded_for only after sanitising it (see below).
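The 503 is only a default. As a hedged sketch, the fragment below (intended to sit inside http { }, with a placeholder backend) returns 429 Too Many Requests instead and lowers the log level for rejections.

```nginx
# rate=20r/s means one request is admitted every 50 ms per key
limit_req_zone $binary_remote_addr zone=basic:10m rate=20r/s;

server {
    listen 80;

    location /api/ {
        limit_req zone=basic burst=40 delay=20;
        limit_req_status 429;               # reply 429 instead of the default 503
        limit_req_log_level warn;           # log rejections at warn instead of the default error
        proxy_pass http://127.0.0.1:3000;   # placeholder backend
    }
}
```

A distinct status code makes throttled requests easy to tell apart from genuine backend outages in the access log.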
Trusting (or Not) the X-Forwarded-For Header
When NGINX sits behind another proxy (CloudFront, ALB, Cloudflare), the client's real IP is in X-Forwarded-For. To use it for rate limiting, logging, and access control, you need to tell NGINX which proxies to trust — otherwise a malicious client just sends X-Forwarded-For: 1.2.3.4 and lies about who they are.
set_real_ip_from 10.0.0.0/8;        # the ALB/VPC range
set_real_ip_from 173.245.48.0/20;   # cloudflare ranges (they publish their list)
real_ip_header X-Forwarded-For;
real_ip_recursive on;               # walk the chain skipping our trusted IPs
Now $remote_addr is the real client IP and your rate-limit zone keyed on $binary_remote_addr works correctly.
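One way to confirm the rewrite is doing what you expect is to log the resolved address next to the peer that actually connected. This is a small sketch for inside http { }; the format name and log path are hypothetical.

```nginx
# $remote_addr is the address after realip processing;
# $realip_remote_addr keeps the original peer (the LB or CDN node).
log_format realip_check '$remote_addr via $realip_remote_addr '
                        '"$http_x_forwarded_for" "$request" $status';

access_log /var/log/nginx/realip.access.log realip_check;   # hypothetical path
```

If both columns show the same load-balancer address, the trusted ranges in set_real_ip_from probably do not cover the proxy in front of you.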
WebSockets, SSE, and Streaming
Long-lived connections need a different default. Three changes from the standard proxy block:
location /ws/ {
    proxy_pass http://app_pool;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 1h;   # don't time out a long-lived stream
    proxy_buffering off;     # critical for SSE: no waiting for buffer fill
}
- Upgrade headers are how NGINX forwards the WebSocket protocol switch.
- proxy_read_timeout defaults to 60s; for long-lived streams that's a guaranteed disconnect.
- proxy_buffering off matters for SSE (Server-Sent Events) — buffering adds delay between when the backend writes a token and when the browser sees it.
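When the same location serves both WebSocket and ordinary requests, a commonly used variant derives the Connection header from the client's Upgrade header instead of hard-coding it. A sketch, with a placeholder backend address:

```nginx
# Inside http { }
map $http_upgrade $connection_upgrade {
    default upgrade;   # client asked to upgrade: pass that on
    ''      close;     # no Upgrade header: behave like a plain request
}

server {
    listen 80;

    location /ws/ {
        proxy_pass http://127.0.0.1:3000;   # placeholder backend
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 1h;
    }
}
```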
Reload, Test, Don't Restart
Three commands manage NGINX in production:
sudo nginx -t # syntax check the config; ALWAYS run before reload
sudo systemctl reload nginx # graceful: workers finish in-flight, new conns hit new config
sudo systemctl restart nginx # hard restart; drops in-flight connections — last resort
# Tail the live access log
sudo tail -F /var/log/nginx/access.log | awk '{print $9, $7}' # status code + path
# Find recent 5xx
sudo awk '$9 ~ /^5[0-9][0-9]$/' /var/log/nginx/access.log | tail   # match on the status field only
nginx -t first
A reload with a broken config is rejected and the old config keeps running. But running systemctl restart with a broken config will fail and leave the server down. Burn it in: sudo nginx -t && sudo systemctl reload nginx. The && short-circuits the reload if the test fails.
Reading the Access Log
The default combined log format, extended with a few timing fields, gives you everything you need to diagnose 80% of problems:
1.2.3.4 - - [02/May/2026:10:14:55 +0000] "GET /api/users/42 HTTP/2.0" 200 4321 "https://acme.com/dashboard" "Mozilla/5.0 …" rt=0.034 uct="0.001" uht="0.030" urt="0.030"
# 200 = status, 4321 = response size in bytes, rt = total request time, uct = upstream connect time, uht = upstream header time, urt = upstream response time
Add $request_time and $upstream_response_time to your log_format and 90% of perf debugging is grep + awk. "Are 5xx coming from us or from upstream?" Compare the status against upstream_response_time: if it is logged as - (no upstream was contacted), NGINX returned the error itself (rate limit, oversized body); if a time is present, NGINX did go to the backend, and a 504 from an upstream timeout lands in this group too.
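A log_format that would produce a line like the sample above could look as follows. The format name timed is an arbitrary choice; the variables are standard NGINX ones.

```nginx
# Inside http { }
log_format timed '$remote_addr - $remote_user [$time_local] '
                 '"$request" $status $body_bytes_sent '
                 '"$http_referer" "$http_user_agent" '
                 'rt=$request_time uct="$upstream_connect_time" '
                 'uht="$upstream_header_time" urt="$upstream_response_time"';

access_log /var/log/nginx/acme.access.log timed;
```

The first three lines reproduce the built-in combined format, so existing tooling that counts fields from the left keeps working.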
Common Pitfalls
- 413 Request Entity Too Large. File upload bigger than client_max_body_size (default 1m). Raise it to your real maximum.
- 504 Gateway Timeout. The backend took longer than proxy_read_timeout. Diagnose: is the backend genuinely slow, or did NGINX miss a long-poll/websocket need?
- 502 Bad Gateway. Backend refused the connection or returned malformed data. Check the upstream is up; check that the backend's bind matches the upstream URL (localhost vs 0.0.0.0 trips many).
- Mixed content after enabling HTTPS. Your app generates absolute http:// URLs. Set X-Forwarded-Proto and have the app honour it; emit relative URLs where possible.
- The 'works in dev' static path. alias and root work differently — alias replaces the matched location, root appends the URI. Mixing them up gives 404s for files that exist on disk.
- Logging IPs of the LB. See the X-Forwarded-For section. set_real_ip_from + real_ip_header are not optional behind a public CDN.
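The alias versus root difference is easiest to see side by side. In this sketch both locations serve the same directory; the /assets/ prefix is a hypothetical second URL chosen only to show the contrast.

```nginx
server {
    listen 80;

    # root: the full URI is appended to the path.
    # /static/app.css -> /var/www/acme/static/app.css
    location /static/ {
        root /var/www/acme;
    }

    # alias: the matched prefix is replaced by the path.
    # /assets/app.css -> /var/www/acme/static/app.css
    location /assets/ {
        alias /var/www/acme/static/;
    }
}
```

Swapping the two directives without adjusting the paths would make NGINX look for /var/www/acme/static/static/app.css or /var/www/acme/app.css, which is the usual source of the 404.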
1) Raise client_max_body_size from the default 1m to whatever the real max is — say 50m for image uploads. It can sit at the http, server, or location level; the most defensive choice is server (or per-location for the upload endpoint). 2) Raise proxy_read_timeout for the AI endpoint specifically — default 60s is fine elsewhere, but a 30-second-and-occasionally-longer endpoint should have its own location with proxy_read_timeout 120s;. Don't raise it globally; you'd hide real backend problems on the rest of the site. Always run nginx -t before reloading.
- main — the process.
- events — the workers.
- http — global HTTP behaviour.
- server — one virtual host.
- location — the route within the host.
proxy_pass + backend keepalive turn one box into a small load balancer for free. 4) Rate limits and caches are the two highest-leverage features most teams under-use. 5) Always nginx -t before reload; never systemctl restart if reload will do.