
NGINX — The Front Door

13 min · Intermediate · 2,935 words
An app server that talks directly to the internet is an app server with too many jobs. NGINX takes the dirty work — TLS, static files, gzip, slow clients, rate limits, websocket upgrades, request buffering — and lets your app focus on its actual logic. Master server blocks, location matching, upstream pools, proxy_pass, caching, and the small set of tuning knobs that turn 'works on my machine' into 'serves 50k requests per second on a t3.small.'

What you will learn

01 Why a Reverse Proxy
02 The Configuration Model
03 Server Blocks — Multiple Sites on One Box
04 Location Matching — The Routing Layer
05 Upstreams and Load Balancing
06 Performance Knobs That Matter

You can technically point DNS at a Node app on port 80 and call it production. People do this every day, and they spend the next month learning why nobody does it on purpose. The app server is good at running your code; it is bad — and not its fault — at TLS termination, slow client buffering, gzipping responses, serving 80 MB of static images, holding 10,000 idle keepalive connections, and rate-limiting a botnet. NGINX is the small, fast process, written in C, whose entire job is being good at those things, so your app can do the one thing only it can do: run your code.

🔑
Today's plan
1) Why a reverse proxy belongs in front of every production app. 2) The configuration model — events, http, server, location — and how directives inherit. 3) Upstream pools and proxy_pass — load-balancing your backend. 4) Performance knobs — gzip, caching, keepalive, sendfile. 5) Defences — rate limiting, connection limiting, request buffering. 6) WebSockets and SSE — what changes for streaming. 7) Common pitfalls and how to read access logs to find them.

Why a Reverse Proxy

A reverse proxy is just a server that takes inbound requests and forwards them to other servers, returning their responses. From the client's point of view it is the service. From the backend's point of view, the proxy is the only client it ever speaks to. Putting NGINX in front of your app — even for a single-box deployment — buys you a stack of features for almost no operational cost.

[Diagram: Internet (browsers, bots) → NGINX (TLS · gzip · cache · rate-limit · static · proxy_pass) → App :3000 (node / python), App :3001 (backup worker), /var/www/static (images, css). One door from the internet, three things behind it. NGINX is the polite host that handles the entry hall.]
A typical NGINX deployment: terminate TLS, route by path, serve static assets directly, proxy dynamic to the app.

What you get for free by adding NGINX

  • TLS termination in a battle-tested implementation, with sane defaults you don't have to maintain in your app.
  • Slow-client buffering. A user on a 2G connection takes seconds to send a POST body. NGINX buffers the request and hands the complete one to your app — your app's worker is busy for milliseconds, not seconds.
  • Static file serving via sendfile straight from disk, ~10× cheaper than going through your app stack.
  • Compression. A single gzip on; can turn 300 KB of JSON into roughly 30 KB on the wire.
  • Caching for endpoints that can be cached, with stale-while-revalidate so a slow upstream doesn't break the page.
  • Rate limiting and connection limiting at the edge — the cheapest, dumbest brake on a hostile flood.
  • HTTP/2 and HTTP/3 upgrades on your behalf; the backend talks plain HTTP/1.1.
  • Health-checked load balancing across multiple backends or workers.
  • Graceful reloads — change config, send SIGHUP, no dropped connections.
💡
NGINX or Caddy or Traefik or HAProxy or Envoy?
All of these are good. NGINX is the workhorse default and runs a large share of production deployments worldwide. Caddy is opinionated and gets TLS automation right by default; great for small deployments. Traefik auto-configures from Docker/K8s labels; great inside container orchestrators. HAProxy is the king of pure load balancing and very high RPS. Envoy is the modern data plane for service meshes such as Istio (Linkerd ships its own Rust proxy instead). Pick the one whose ergonomics you'll actually maintain. We learn NGINX because the model transfers and the documentation is enormous.

The Configuration Model

NGINX configuration is a tree of nested contexts. Each context introduces directives that apply within it; many directives inherit downward unless overridden. Reading a config means knowing which context you're in.

[Diagram: nested contexts]
main { worker_processes auto; user www-data; … }
  events { worker_connections 4096; }
  http { gzip on; log_format … ; … }
    server { listen 443 ssl; server_name acme.com; … }
      location /api/ { proxy_pass http://app_pool; }
Five contexts of NGINX. Inheritance flows top-down; an override at a lower scope wins for that scope.
  • main — process-level, the top of the file outside any block. Worker count, run-as user, master pid file.
  • events — per-worker connection settings.
  • http — global HTTP behaviour: log format, gzip, timeouts, MIME types.
  • server — one virtual host. Bind address, server_name, TLS cert paths.
  • location — match within a server. The path routing layer; it matches the request URI, not the method.
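
To make the inheritance rule concrete, here is a minimal sketch of the http context only. The port, the server name and the response bodies are hypothetical, chosen just to make the behaviour visible. It also shows one exception worth knowing: add_header values are inherited only when the current level declares no add_header of its own.

```nginx
http {
    gzip on;                            # inherited by every server and location below
    add_header X-Frame-Options DENY;    # inherited only where no add_header is redeclared

    server {
        listen 8080;                    # hypothetical port for the demo
        server_name example.test;       # hypothetical name

        location / {
            # Inherits gzip on and the X-Frame-Options header from http.
            return 200 "inherited\n";
        }

        location /raw/ {
            gzip off;                   # override applies inside this location only
            # Declaring any add_header here replaces the inherited set,
            # so X-Frame-Options must be repeated to keep it.
            add_header X-Frame-Options DENY;
            add_header Cache-Control no-store;
            return 200 "overridden\n";
        }
    }
}
```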

A working starter config

nginx — /etc/nginx/sites-enabled/acme
upstream app_pool {
    least_conn;
    server 127.0.0.1:3000 max_fails=3 fail_timeout=10s;
    server 127.0.0.1:3001 max_fails=3 fail_timeout=10s backup;
    keepalive 32;
}

server {
    listen 80;
    server_name acme.com www.acme.com;
    return 301 https://acme.com$request_uri;     # canonical, force HTTPS
}

server {
    listen 443 ssl http2;                        # on nginx 1.25.1+, prefer "listen 443 ssl;" plus "http2 on;"
    server_name acme.com;

    ssl_certificate     /etc/letsencrypt/live/acme.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/acme.com/privkey.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;
    ssl_ciphers         HIGH:!aNULL:!MD5;

    access_log  /var/log/nginx/acme.access.log;
    error_log   /var/log/nginx/acme.error.log warn;

    gzip on;
    gzip_types text/plain text/css application/json application/javascript image/svg+xml;

    # Static assets — served straight off disk
    location /static/ {
        alias /var/www/acme/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
        try_files $uri =404;
    }

    # Everything else proxies to the app pool
    location / {
        proxy_pass         http://app_pool;
        proxy_http_version 1.1;
        proxy_set_header   Connection        "";     # required for upstream keepalive to take effect
        proxy_set_header   Host              $host;
        proxy_set_header   X-Real-IP         $remote_addr;
        proxy_set_header   X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_connect_timeout  5s;
        proxy_send_timeout     30s;
        proxy_read_timeout     30s;
        proxy_buffering        on;
    }
}

Read it once and you'll see the whole pattern: an upstream pool, a redirect-to-HTTPS server, a TLS server with a static block and a proxy block. Ninety percent of production NGINX configs are variations on this.

Server Blocks — Multiple Sites on One Box

NGINX picks which server block handles a request by matching listen + server_name against the inbound Host header. You can host many domains on one IP because the HTTP Host header (and TLS SNI) carries which one the client wants.

  • Exact match wins first: server_name acme.com;.
  • Leading wildcard: server_name *.acme.com;.
  • Regex: server_name ~^(?<tenant>.+)\.acme\.com$;, with capture exposed as $tenant.
  • Default server: the one with default_server on its listen handles unmatched Host headers — typically returns 444 (close connection) to deny noise.
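
As an illustration of the regex form, the sketch below forwards the captured subdomain to the backend. It assumes the app_pool upstream from the starter config; the X-Tenant header name is hypothetical, and your app may expect the tenant elsewhere.

```nginx
server {
    listen 80;
    # Named capture: "blue.acme.com" sets $tenant to "blue".
    server_name ~^(?<tenant>.+)\.acme\.com$;

    location / {
        proxy_set_header Host     $host;
        proxy_set_header X-Tenant $tenant;   # hypothetical header name
        proxy_pass http://app_pool;
    }
}
```

Because exact names are checked before regexes, a separate server block with server_name www.acme.com; would still win for that one host.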

Drop unmatched hosts

nginx — refuse connections that lie about the Host header
server {
    listen 80 default_server;
    listen 443 ssl default_server;
    ssl_certificate     /etc/nginx/ssl/snakeoil.pem;
    ssl_certificate_key /etc/nginx/ssl/snakeoil.key;
    return 444;                  # nginx-specific: close without a response
}

Location Matching — The Routing Layer

Inside a server, locations match against the request URI. The matching rules are a small DSL:

Modifier | Meaning                            | Example
=        | Exact match (fastest)              | location = /healthz
^~       | Prefix; if it matches, skip regex  | location ^~ /static/
~        | Case-sensitive regex               | location ~ \.php$
~*       | Case-insensitive regex             | location ~* \.(jpg|png)$
(none)   | Plain prefix                       | location /api/

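Order of evaluation: an = exact match wins outright. Otherwise NGINX finds the longest matching prefix, whether plain or ^~. If that longest prefix carries ^~, it is used and regexes are skipped. If not, regexes are tried in file order and the first match wins; only when no regex matches does the longest plain prefix handle the request. Regex is order-dependent, so list the most specific first.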
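
A small sketch makes the matching order easier to check by hand. The port and the response bodies are hypothetical; each location simply returns a label naming the rule that matched.

```nginx
server {
    listen 8080;                         # hypothetical port for the demo
    default_type text/plain;

    location = /healthz      { return 200 "exact\n"; }
    location ^~ /static/     { return 200 "prefix, regex skipped\n"; }
    location ~* \.(jpg|png)$ { return 200 "regex\n"; }
    location /api/           { return 200 "plain prefix\n"; }
    location /               { return 200 "fallback prefix\n"; }
}

# GET /healthz         -> exact
# GET /static/logo.png -> prefix, regex skipped   (^~ stops the regex pass)
# GET /api/avatar.png  -> regex                   (a matching regex beats a plain prefix)
# GET /api/users       -> plain prefix
# GET /about           -> fallback prefix
```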

⚠️
The trailing slash on proxy_pass
A trailing slash on proxy_pass changes everything. location /api/ { proxy_pass http://app/; } rewrites the URI — a request for /api/users arrives at the backend as /users. Without the trailing slash on proxy_pass, the URI is preserved and the backend sees /api/users. Pick deliberately and write a comment for the next reader.
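
The two behaviours side by side, as a sketch. It assumes an upstream named app_pool as in the starter config; the /admin/ prefix is hypothetical and is used only so both variants can sit in one server block.

```nginx
# URI part present ("/"): the matched location prefix is replaced by it.
location /api/ {
    proxy_pass http://app_pool/;    # /api/users reaches the backend as /users
}

# No URI part: the original request URI is passed through unchanged.
location /admin/ {
    proxy_pass http://app_pool;     # /admin/users reaches the backend as /admin/users
}
```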

Upstreams and Load Balancing

An upstream block names a pool of backends. proxy_pass in a location can target the pool name. Three load-balancing strategies cover the common cases:

  • round-robin (default, no directive to write) — even distribution, adjustable with a per-server weight.
  • least_conn — send the next request to whoever has fewest in-flight. Good when request times vary.
  • ip_hash — same client IP always goes to the same backend. The classical "sticky session" without cookie state. Use sparingly; it complicates scale-down.
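
For comparison, the three strategies might be written like this. The pool names are hypothetical and the backend addresses reuse the ports from the starter config.

```nginx
# Round-robin is the default, so there is no method directive.
upstream app_rr {
    server 127.0.0.1:3000 weight=2;   # receives roughly twice the share
    server 127.0.0.1:3001;
}

# Fewest in-flight requests wins the next one.
upstream app_least {
    least_conn;
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}

# The same client address maps to the same backend.
upstream app_sticky {
    ip_hash;
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}
```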

Health checks (open-source vs commercial)

Open-source NGINX does passive health checks: it marks a backend as down after max_fails failed attempts within the fail_timeout window, then stops sending it traffic for fail_timeout. NGINX Plus and the third-party nginx_upstream_check_module add active probes that hit a health endpoint on a schedule. For most teams, passive checks plus a separate health-monitoring system (Prometheus, the cloud LB, etc.) is fine.

Backend keepalive

By default, NGINX opens a fresh TCP connection to the backend for every request — wasteful when both sides live on the same box. keepalive 32; in the upstream tells NGINX to keep up to 32 idle connections per worker, reusing them across requests. It only takes effect together with proxy_http_version 1.1; and an empty Connection header (proxy_set_header Connection "";), because NGINX otherwise sends Connection: close to the backend.

Performance Knobs That Matter

NGINX defaults are sane but not aggressive. The handful of settings that consistently move the needle:

nginx — performance baseline
worker_processes auto;                # match CPU count
worker_rlimit_nofile 65535;           # raise the FD ceiling

events {
    worker_connections 4096;
    multi_accept on;                  # accept many new conns per loop iteration
    use epoll;                        # default on linux, but state it
}

http {
    sendfile      on;                 # zero-copy static files
    tcp_nopush    on;                 # group packets
    tcp_nodelay   on;                 # send small packets immediately
    keepalive_timeout 65;
    keepalive_requests 1000;          # the default since nginx 1.19.10; older builds default to 100

    client_max_body_size 25m;         # match your largest legit upload
    client_body_buffer_size 1m;

    gzip on;
    gzip_min_length 1024;
    gzip_comp_level 5;                # 1=fast, 9=best ratio; 5 is the sweet spot
    gzip_types text/css application/javascript application/json image/svg+xml;

    open_file_cache max=10000 inactive=60s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;
}

The biggest wins for most sites: gzip on, sendfile on, open_file_cache, and keepalive_requests 1000. Before nginx 1.19.10 the default was 100, which closes the TCP connection after 100 requests; for an asset-heavy SPA that's connection churn nobody wants. Newer versions already default to 1000.

Caching — The Honest Free Lunch

NGINX's proxy_cache stores responses on local disk so identical subsequent requests don't hit the backend at all. For endpoints that are cacheable, this is the most impactful single config you can write.

nginx — cache GET/HEAD responses; the backend's Cache-Control overrides these defaults
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app_cache:50m
                 inactive=24h max_size=2g use_temp_path=off;

location / {
    proxy_pass         http://app_pool;
    proxy_cache        app_cache;
    proxy_cache_key    $scheme$proxy_host$request_uri;
    proxy_cache_methods GET HEAD;
    proxy_cache_valid  200 302 10m;
    proxy_cache_valid  404      1m;
    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    proxy_cache_lock   on;                       # collapse cache stampedes
    add_header X-Cache-Status $upstream_cache_status;
}

Two ideas worth dwelling on:

  • proxy_cache_use_stale serves stale content if the backend returns 5xx. Critical for keeping a site up during a flaky deploy.
  • proxy_cache_lock ensures only one request actually goes to the backend on a cache miss; concurrent misses wait and reuse the result. Without this, a thundering-herd event can amplify a single miss into thousands of backend hits.
💡
Cache keys are everything
If your endpoint varies by user (logged-in flag, session, locale) and your cache key doesn't, you'll serve user A's response to user B. Standard fix: scope the key by adding $cookie_session or $http_authorization, or simply don't cache authenticated routes. Watch X-Cache-Status in dev to see what's hitting and what's bypassing.
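
One way to apply the "don't cache authenticated routes" option is sketched below. It assumes the app_cache zone and app_pool upstream defined earlier, and a session cookie literally named session, as implied by $cookie_session.

```nginx
location /api/ {
    proxy_pass  http://app_pool;
    proxy_cache app_cache;

    # If any listed value is non-empty and not "0", the cache is not
    # consulted (bypass) and the response is not stored (no_cache).
    proxy_cache_bypass $http_authorization $cookie_session;
    proxy_no_cache     $http_authorization $cookie_session;

    add_header X-Cache-Status $upstream_cache_status;   # expect BYPASS for logged-in users
}
```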

Rate Limiting and Connection Limiting

Two lines stop a lot of damage. NGINX's limit_req implements a leaky-bucket rate limiter keyed on any variable you choose, most often the client IP.

nginx — slow them down at the door
limit_req_zone  $binary_remote_addr zone=basic:10m  rate=20r/s;
limit_conn_zone $binary_remote_addr zone=conns:10m;

server {
    location /login {
        limit_req  zone=basic burst=10 nodelay;     # 20 r/s + small burst, then 503
        proxy_pass http://app_pool;                  # without this, NGINX would look for a file named /login
    }
    location /api/ {
        limit_req  zone=basic burst=40 delay=20;    # smooth, not strict
        limit_conn conns 10;                         # max 10 simultaneous per IP
        proxy_pass http://app_pool;
    }
}
  • burst sets the bucket size; requests above the rate accumulate up to burst, beyond that NGINX returns 503.
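  • nodelay processes the burst immediately; delay=N serves the first N excess requests immediately and paces the rest (here the latter half of the burst) at the configured rate — gentler on legitimate clients.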
  • Behind a real LB or CDN, $binary_remote_addr is the LB's IP — useless. Use $http_x_forwarded_for only after sanitising it (see below).
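
If clients should see a more conventional status than 503, the rejection code can be changed. A minimal sketch, assuming the basic and conns zones and the app_pool upstream from the listing above:

```nginx
location /api/ {
    limit_req   zone=basic burst=40 delay=20;
    limit_conn  conns 10;

    limit_req_status    429;    # Too Many Requests instead of the default 503
    limit_conn_status   429;
    limit_req_log_level warn;   # rejections log at warn, delays one level lower

    proxy_pass http://app_pool;
}
```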

Trusting (or Not) the X-Forwarded-For Header

When NGINX sits behind another proxy (CloudFront, ALB, Cloudflare), the client's real IP is in X-Forwarded-For. To use it for rate limiting, logging, and access control, you need to tell NGINX which proxies to trust — otherwise a malicious client just sends X-Forwarded-For: 1.2.3.4 and lies about who they are.

nginx — trust upstream proxy headers
set_real_ip_from 10.0.0.0/8;          # the ALB/VPC range
set_real_ip_from 173.245.48.0/20;     # one Cloudflare range; Cloudflare publishes the full list
real_ip_header   X-Forwarded-For;
real_ip_recursive on;                  # walk the chain skipping our trusted IPs

Now $remote_addr is the real client IP and your rate-limit zone keyed on $binary_remote_addr works correctly.

WebSockets, SSE, and Streaming

Long-lived connections need a different default. Three changes from the standard proxy block:

nginx — websocket-ready
location /ws/ {
    proxy_pass         http://app_pool;
    proxy_http_version 1.1;
    proxy_set_header   Upgrade    $http_upgrade;
    proxy_set_header   Connection "upgrade";
    proxy_read_timeout 1h;             # don't time out a long-lived stream
    proxy_buffering    off;            # critical for SSE — no waiting for buffer fill
}
  • Upgrade headers are how NGINX forwards the WebSocket protocol switch.
  • proxy_read_timeout defaults to 60s; for long-lived streams that's a guaranteed disconnect.
  • proxy_buffering off matters for SSE (Server-Sent Events) — buffering adds delay between when the backend writes a token and when the browser sees it.
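
The listing above hardcodes Connection "upgrade". A common refinement, sketched here, derives the value from the request so that ordinary requests to the same location do not claim an upgrade. The variable name $connection_upgrade is a convention, not a built-in, and the map must live in the http context.

```nginx
# http context
map $http_upgrade $connection_upgrade {
    default upgrade;    # client sent an Upgrade header
    ''      close;      # plain request, no protocol switch
}

# server context
location /ws/ {
    proxy_pass         http://app_pool;
    proxy_http_version 1.1;
    proxy_set_header   Upgrade    $http_upgrade;
    proxy_set_header   Connection $connection_upgrade;
    proxy_read_timeout 1h;
}
```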

Reload, Test, Don't Restart

Three commands manage NGINX in production:

bash — the only nginx commands you need most days
sudo nginx -t                              # syntax check the config; ALWAYS run before reload
sudo systemctl reload nginx                # graceful: workers finish in-flight, new conns hit new config
sudo systemctl restart nginx               # hard restart; drops in-flight connections — last resort

# Tail the live access log
sudo tail -F /var/log/nginx/access.log | awk '{print $9, $7}'   # status code + path

# Find recent 5xx
sudo grep -E ' 5[0-9]{2} ' /var/log/nginx/access.log | tail
🚨
Always nginx -t first
A config that's already loaded keeps running even if the file on disk is now broken; an invalid config issued via reload is rejected and the old config keeps running. But running systemctl restart with a broken config will fail and leave the server down. Burn it in: sudo nginx -t && sudo systemctl reload nginx. The && short-circuits the reload if the test fails.

Reading the Access Log

The combined log format, extended with a few timing fields, gives you everything you need to diagnose 80% of problems:

log — anatomy of a line
1.2.3.4 - - [02/May/2026:10:14:55 +0000] "GET /api/users/42 HTTP/2.0" 200 4321 "https://acme.com/dashboard" "Mozilla/5.0 …" rt=0.034 uct="0.001" uht="0.030" urt="0.030"
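# 200 = status · 4321 = response size in bytes
# rt = total request time · uct = upstream connect time · uht = upstream header time · urt = upstream response time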

Add $request_time and $upstream_response_time to your log_format and 90% of perf debugging is grep + awk. "Are 5xx coming from us or from upstream?" — compare status against upstream_response_time: if it is "-", NGINX answered without reaching a backend (rate limit, oversized body); if a time is present, NGINX at least tried the backend, and a value sitting at proxy_read_timeout points to a timeout.
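
A log_format that would produce the sample line above could look like the sketch below. The format name upstream_time is arbitrary, and the log path is taken from the starter config; log_format belongs in the http context.

```nginx
log_format upstream_time '$remote_addr - $remote_user [$time_local] '
                         '"$request" $status $body_bytes_sent '
                         '"$http_referer" "$http_user_agent" '
                         'rt=$request_time uct="$upstream_connect_time" '
                         'uht="$upstream_header_time" urt="$upstream_response_time"';

access_log /var/log/nginx/acme.access.log upstream_time;
```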

Common Pitfalls

  1. 413 Request Entity Too Large. File upload bigger than client_max_body_size (default 1m). Raise it to your real maximum.
  2. 504 Gateway Timeout. The backend took longer than proxy_read_timeout. Diagnose: is the backend genuinely slow, or did NGINX miss a long-poll/websocket need?
  3. 502 Bad Gateway. Backend refused the connection or returned malformed data. Check the upstream is up; check that the backend's bind matches the upstream URL (localhost vs 0.0.0.0 trips many).
  4. Mixed content after enabling HTTPS. Your app generates absolute http:// URLs. Set X-Forwarded-Proto and have the app honour it; emit relative URLs where possible.
  5. The 'works in dev' static path. alias and root work differently — alias replaces the matched location, root appends the URI. Mixing them up gives 404s for files that exist on disk.
  6. Logging IPs of the LB. See the X-Forwarded-For section. set_real_ip_from + real_ip_header are not optional behind a public CDN.
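
The alias/root difference from pitfall 5, as a sketch. The directory comes from the starter config; the /assets/ prefix is hypothetical and is there only to show a location whose name differs from the directory on disk.

```nginx
# root appends the full URI to the path:
#   /static/app.css -> /var/www/acme/static/app.css
location /static/ {
    root /var/www/acme;
}

# alias replaces the matched prefix with the path:
#   /assets/app.css -> /var/www/acme/static/app.css
location /assets/ {
    alias /var/www/acme/static/;   # keep the trailing slash on both location and alias
}
```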
Quick check
After putting NGINX in front of an existing Node app, image uploads bigger than 1 MB suddenly fail with 413, and concurrent users start hitting timeouts on a 30-second AI generation endpoint. What two config changes fix these, and where do they belong in the file?
Answer
1) Raise client_max_body_size from the default 1m to whatever the real max is — say 50m for image uploads. It can sit at the http, server, or location level; the most defensive choice is server (or per-location for the upload endpoint). 2) Raise proxy_read_timeout for the AI endpoint specifically — default 60s is fine elsewhere, but a 30-second-and-occasionally-longer endpoint should have its own location with proxy_read_timeout 120s;. Don't raise it globally; you'd hide real backend problems on the rest of the site. Always run nginx -t before reloading.
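
Put together, the answer might look like this sketch. The port and the /api/generate path are hypothetical stand-ins for the real listener and AI endpoint; app_pool is the upstream from the starter config.

```nginx
server {
    listen 8080;                      # hypothetical
    client_max_body_size 50m;         # server-wide ceiling for uploads

    location /api/generate {          # hypothetical path for the slow endpoint
        proxy_pass         http://app_pool;
        proxy_read_timeout 120s;      # only this route waits longer
    }

    location / {
        proxy_pass http://app_pool;   # everything else keeps the 60s default
    }
}
```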
Mnemonic — the NGINX config tree
"main · events · http · server · location."
  • main — the process.
  • events — the workers.
  • http — global HTTP behaviour.
  • server — one virtual host.
  • location — the route within the host.
Flashcard
Why does putting NGINX in front of a Node.js app make the app capable of handling more concurrent slow clients, even though Node already uses an event loop?
Answer
Slow-client buffering. A user on a 2G connection can take seconds to dribble in a POST body or pull a large response. Without NGINX, your Node process holds a socket open for every such client — fine for a few hundred, costly for a few thousand. NGINX, written in C with epoll/kqueue, can hold tens of thousands of slow connections per worker for almost no memory; it accepts the full request before forwarding to Node, then accepts the full response before drip-feeding it back to the client. Node only sees fast LAN-speed traffic from NGINX, freeing it to do the actual work the only way it can: one event-loop iteration at a time. The pattern: let the proxy take the slow part, let the app take the smart part.
🔑
Key takeaways
1) NGINX's job is the boring, fast, defensive work the app shouldn't do — TLS, gzip, static, slow-client buffering, rate limits. 2) The config is a nested tree: main / events / http / server / location, with directives inheriting downward. 3) Upstreams + proxy_pass + backend keepalive turn one box into a small load balancer for free. 4) Rate limits and caches are the two highest-leverage features most teams under-use. 5) Always nginx -t before reload; never systemctl restart if reload will do.
