Linux Servers & EC2 — The Box Your Code Runs On
Underneath every container, every Lambda, every PaaS abstraction is a Linux machine someone provisioned. Learn the parts of an EC2 instance you'll actually touch — AMIs, instance types, VPCs, security groups, key pairs, IAM roles — then the Linux survival kit: SSH, processes, services, file permissions, package management, and the hardening you do before the box is allowed near the public internet.
What you will learn
Cloud abstractions stack high. Underneath the Lambda is a Firecracker microVM, underneath the microVM is a server, underneath the server is the same kernel you'd boot on your laptop. Every once in a while (when a deploy goes sideways, when a process won't start, when a permission is wrong) you have to drop down to that machine and figure out what's actually happening. Today covers the layer where that happens. We'll provision an EC2 instance from zero, walk through the AWS networking primitives that decide whether anyone can reach it, and learn enough Linux to keep a production server alive.
journalctl and basic shell tools.
5) Harden the box: SSH key-only auth, automatic security updates, and IAM role over static keys.
The Anatomy of an EC2 Instance
EC2 — Elastic Compute Cloud — is AWS's name for "a virtual machine you rent by the second." Conceptually, an instance is six decisions wrapped together.
The six decisions
| Decision | What it controls | Beginner default |
|---|---|---|
| AMI | The boot image — OS, kernel, pre-installed software | Ubuntu 24.04 LTS or Amazon Linux 2023 |
| Instance type | vCPU, RAM, network, disk class | t3.small or t4g.small (2 vCPU, 2 GB) |
| VPC + subnet | Which network the box lives on, IP range, public vs private | Default VPC, public subnet |
| Security group | Stateful firewall rules — which ports are open, from where | 22 from your IP, 80 + 443 from anywhere |
| Key pair | The SSH keypair you'll use to log in initially | Generate one; store the .pem somewhere you'll find it |
| IAM role | Permissions the instance has to call other AWS services | One scoped to S3 read/write + CloudWatch logs, no AdministratorAccess |
AMIs — the boot image
An AMI (Amazon Machine Image) is a pre-built disk snapshot you boot from. Two flavours matter for production:
- Stock OS images: Ubuntu LTS, Amazon Linux, Debian. You install your stack post-boot via cloud-init, Ansible, or a config script. Easy to start, but every instance pays the install cost.
- Custom ("baked") AMIs: you snapshot a fully-configured machine and boot from that. Tools: Packer, EC2 Image Builder. Faster boots, immutable deploys, harder to update — but the right pattern for autoscaling groups.
Instance types — the alphabet soup
EC2 type names encode purpose and generation. t3.small reads as: t family (burstable general-purpose), 3rd gen, small size (2 vCPU, 2 GB). The families:
| Family | For | Notes |
|---|---|---|
| t (t3, t4g) | Burstable general-purpose | Cheap, but throttles after CPU credits run out — not for steady-state CPU loads |
| m (m6i, m7g) | Steady general-purpose | The default for production app servers |
| c (c7i, c7g) | Compute-optimized | More CPU per dollar; for CPU-bound workloads (encoding, ML inference) |
| r (r7i, r7g) | Memory-optimized | For databases, caches, in-memory workloads |
| i / d | Storage-optimized | NVMe SSD or HDD per VM; for big-data and self-hosted DBs |
| g / p | GPU | Inference (g) or training (p) |
| The g suffix | Graviton (ARM) | ~20% cheaper than the x86 equivalent for the same perf — use when your stack supports ARM |
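To make the naming scheme concrete, here is a minimal sketch that splits a type name into family, generation, suffix, and size. It assumes the common `<letters><digits><letters>.<size>` shape; the function name `parse_instance_type` is hypothetical, and unusual type names may not fit this pattern.

```shell
# Split an EC2 type name such as "m7g.large" into its parts.
# Assumes the shape <family letters><generation digits><suffix letters>.<size>.
parse_instance_type() (
    name=$1
    prefix=${name%%.*}   # part before the dot, e.g. "m7g"
    size=${name#*.}      # part after the dot, e.g. "large"
    family=$(printf '%s' "$prefix" | sed 's/[0-9].*$//')
    generation=$(printf '%s' "$prefix" | sed 's/^[a-z]*\([0-9]*\).*$/\1/')
    suffix=$(printf '%s' "$prefix" | sed 's/^[a-z]*[0-9]*//')   # "g" means Graviton
    printf 'family=%s generation=%s suffix=%s size=%s\n' \
        "$family" "$generation" "$suffix" "$size"
)

parse_instance_type t3.small    # family=t generation=3 suffix= size=small
parse_instance_type m7g.large   # family=m generation=7 suffix=g size=large
```

An empty suffix means the plain x86 variant of that family and generation.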
t3.small earns CPU credits at a base rate (20% per vCPU, in this case) and spends them when busy. Run hot for too long and the credits run out. In standard credit mode the instance is then throttled to the base rate; in unlimited mode, which is the launch default for T3 and T4g, it keeps bursting and you pay for the surplus credits. Fine for a low-traffic site or a CI worker; lethal for a busy app server. Watch CPUCreditBalance in CloudWatch: if it trends down, upgrade family.
Networking — VPCs, Subnets, and the Firewall
Every EC2 instance lives in a VPC (Virtual Private Cloud), which is a private network you own a slice of. AWS gives you a default VPC in every region; for serious work you'll create your own.
The pieces in order
- VPC. A CIDR block like 10.0.0.0/16 — 65k addresses you control.
- Subnets. Slices of the VPC bound to a single Availability Zone. Public subnets have a route to an Internet Gateway and instances there can have public IPs. Private subnets don't, so instances there can only reach the internet via a NAT Gateway (outbound) and never receive inbound public traffic.
- Internet Gateway (IGW). The wire from the VPC to the public internet. Attached at the VPC level; subnet-level routing decides who uses it.
- NAT Gateway. Lets private-subnet instances reach the internet outbound (for package downloads, API calls) without being reachable inbound.
- Route tables. The rules that send traffic the right way. Public subnets route 0.0.0.0/0 to the IGW; private subnets route 0.0.0.0/0 to the NAT.
- Security groups. Per-instance stateful firewalls. "Allow TCP 22 from 1.2.3.4/32; allow 443 from 0.0.0.0/0; allow all outbound."
- Network ACLs. Per-subnet stateless firewalls. Most teams leave them open and rely on security groups; they exist for compliance overlays.
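The "65k addresses" figure follows from the prefix length: a /N block holds 2^(32 - N) addresses. A small sketch of that arithmetic is below; the function names are hypothetical. The second helper reflects that AWS reserves five addresses in every subnet.

```shell
# Number of addresses in an IPv4 CIDR block: 2^(32 - prefix length).
cidr_size() (
    prefix_len=${1#*/}
    echo $(( 1 << (32 - prefix_len) ))
)

# Addresses you can actually assign in an AWS subnet (five are reserved).
subnet_usable() (
    echo $(( $(cidr_size "$1") - 5 ))
)

cidr_size 10.0.0.0/16       # 65536
cidr_size 10.0.1.0/24       # 256
cidr_size 1.2.3.4/32        # 1
subnet_usable 10.0.1.0/24   # 251
```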
The mental model that prevents 80% of mistakes
For a normal web app, the layout is:
- Public subnet in 2+ AZs holds your load balancer (ALB) and optionally bastion hosts.
- Private app subnet in those same AZs holds your EC2 app instances. They have no public IPs; the ALB forwards traffic to them.
- Private data subnet holds RDS, ElastiCache, etc. Only reachable from the app subnet.
For a learning deployment with one box, you put the EC2 directly in a public subnet with a public IP. That's fine for day one — just remember it means the box is on the public internet, and the security group is the only thing between it and the world's port scanners.
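For the one-box setup, the security group rules can be sketched with the AWS CLI roughly as follows. The `run` wrapper and `open_web_ports` are hypothetical helpers, and the group ID and IP address are placeholders you supply. Setting `DRY_RUN=1` prints the commands instead of executing them, which is a sensible first pass.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Open SSH to a single address and HTTP/HTTPS to everyone.
# $1 = security group ID, $2 = your public IPv4 address (placeholders).
open_web_ports() {
    sg_id=$1
    my_ip=$2
    run aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
        --protocol tcp --port 22 --cidr "$my_ip/32"
    for port in 80 443; do
        run aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
            --protocol tcp --port "$port" --cidr 0.0.0.0/0
    done
}

# Example (prints three commands, changes nothing):
#   DRY_RUN=1; open_web_ports <your-sg-id> 203.0.113.7
```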
"Allow 5432 from app-sg" is far better than "Allow 5432 from 10.0.1.0/24": the former tracks membership automatically as instances come and go, the latter goes stale silently. Always reference the security group itself rather than a CIDR range when both sides are in your VPC.
SSH — Your First Login
SSH (Secure Shell) is a remote login protocol that uses public-key authentication. You generate a keypair locally; the public key goes on the server; you log in by proving you hold the private key. Two key facts:
- Private key never leaves your machine. If it does, treat it as compromised.
- Permissions matter. SSH refuses to use a private key whose file mode is too permissive. chmod 600 ~/.ssh/id_ed25519.
    # 1) Generate a modern key (ed25519 — small, fast, secure)
    ssh-keygen -t ed25519 -C "sumit@laptop" -f ~/.ssh/id_ed25519

    # 2) Provision the EC2 with the corresponding public key (paste into AWS console,
    #    or upload via aws cli):
    aws ec2 import-key-pair --key-name sumit-laptop \
      --public-key-material fileb://~/.ssh/id_ed25519.pub

    # 3) Log in (Ubuntu uses the user 'ubuntu'; Amazon Linux uses 'ec2-user')
    ssh -i ~/.ssh/id_ed25519 ubuntu@203.0.113.42

    # Tidier: configure ~/.ssh/config so you can `ssh acme-prod`
    cat >> ~/.ssh/config <<'EOF'
    Host acme-prod
      HostName 203.0.113.42
      User ubuntu
      IdentityFile ~/.ssh/id_ed25519
      IdentitiesOnly yes
    EOF
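The permissions rule can be checked before SSH complains. The sketch below treats a key as acceptable when group and others have no access at all, which approximates the check ssh applies. The function name `key_mode_ok` is hypothetical.

```shell
# Return 0 if the key file grants nothing to group or others (e.g. mode 600 or 400).
key_mode_ok() (
    [ -f "$1" ] || { echo "no such file: $1" >&2; exit 2; }
    perms=$(ls -ld "$1" | cut -c1-10)   # e.g. "-rw-------"
    case $perms in
        -???------) exit 0 ;;
        *) echo "$1 is $perms; fix with: chmod 600 $1" >&2; exit 1 ;;
    esac
)

# Example:
#   key_mode_ok ~/.ssh/id_ed25519 && echo "key permissions look fine"
```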
SSH agents and forwarding
Type ssh-add ~/.ssh/id_ed25519 once and your local SSH agent holds the unlocked key for the session — no more passphrase prompts. Avoid agent forwarding (-A) unless you trust the destination box completely; a compromised server with your forwarded agent can sign auth requests as you. Prefer ProxyJump (-J bastion) — your private key never leaves your laptop.
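A ProxyJump setup might look like the stanza pair below. This is a sketch with placeholder host aliases (`bastion`, `app-private`) and addresses. The helper only prints the config so it can be reviewed before being appended to ~/.ssh/config.

```shell
# Print an ssh_config fragment: a bastion, plus a private host reached through it.
# $1 = bastion public IP, $2 = private instance IP (both placeholders).
proxyjump_config() {
    bastion_ip=$1
    private_ip=$2
    cat <<EOF
Host bastion
    HostName $bastion_ip
    User ubuntu
    IdentityFile ~/.ssh/id_ed25519

Host app-private
    HostName $private_ip
    User ubuntu
    IdentityFile ~/.ssh/id_ed25519
    ProxyJump bastion
EOF
}

proxyjump_config 203.0.113.42 10.0.1.15
```

With that in place, `ssh app-private` connects through the bastion while authentication for both hops happens on your laptop.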
fail2ban helps but isn't enough. Edit /etc/ssh/sshd_config.d/00-hardening.conf: PasswordAuthentication no, PermitRootLogin no, ChallengeResponseAuthentication no (current OpenSSH calls this KbdInteractiveAuthentication and keeps the old name as a deprecated alias). Validate the file with sudo sshd -t, then sudo systemctl reload ssh. From now on, only key-holders can log in. Verify by logging in from a fresh terminal before closing the existing session; recovery from a misconfigured sshd is painful.
The Linux Survival Kit
Once you're in, you're staring at a Bourne-Again Shell prompt and a few decades of accumulated UNIX. Five concept clusters cover most of what a deploying engineer needs.
1. The filesystem hierarchy
| Path | What's there |
|---|---|
| /etc | System configuration. Edit with care; restart services after. |
| /var/log | Logs — application, system, kernel. |
| /var/lib | Persistent state — package data, database files. |
| /usr/local | Software you install outside the package manager. |
| /opt | Vendor-provided large applications. |
| /home/<user> | Per-user files. Your ~. |
| /tmp | Scratch — wiped on reboot. Don't put anything important here. |
| /proc · /sys | Kernel-exposed virtual filesystems. Where top reads from. |
2. Users, groups, permissions
Every file has an owner, a group, and three permission triplets (read/write/execute) for owner / group / others. Read it as rwxr-xr-x = 755:
    # Inspect
    ls -l /etc/nginx/nginx.conf
    # -rw-r--r-- 1 root root 1234 ...
    # owner = root, group = root, mode 644

    # Change
    sudo chown deploy:deploy /var/www/app   # owner deploy, group deploy
    sudo chmod 750 /var/www/app             # rwxr-x---

    # Octal cheat sheet
    # r=4 w=2 x=1 → 7=rwx, 6=rw-, 5=r-x, 4=r--, 0=---
Two patterns to know: files used by a service should be owned by that service's user (e.g., www-data for nginx-served files), and secrets should be mode 600 (rw-------) so only the owner can read them. sudo elevates to root for one command; never su - into a long root shell unless you have to.
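The octal arithmetic is easy to mechanise. Here is a small sketch that converts a nine-character symbolic string to its octal mode. The function name `perm_to_octal` is hypothetical, and special bits (setuid, setgid, sticky) are deliberately ignored.

```shell
# Convert a symbolic permission string such as "rwxr-xr-x" to octal ("755").
perm_to_octal() (
    perms=$1
    result=""
    for start in 1 4 7; do
        triplet=$(printf '%s' "$perms" | cut -c"$start"-$((start + 2)))
        value=0
        case $triplet in r??) value=$((value + 4)) ;; esac   # read
        case $triplet in ?w?) value=$((value + 2)) ;; esac   # write
        case $triplet in ??x) value=$((value + 1)) ;; esac   # execute
        result="$result$value"
    done
    echo "$result"
)

perm_to_octal rwxr-xr-x   # 755
perm_to_octal rwxr-x---   # 750
perm_to_octal rw-------   # 600
```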
3. Processes and services
Every running program is a process with a PID. Tools for inspecting:
    ps aux | grep nginx    # all processes matching nginx
    top                    # live view, sorted by CPU/RAM (or htop, if installed)
    ss -tlnp               # listening TCP sockets + which process owns them (sudo shows other users' processes)
    lsof -i :80            # what's bound to port 80
    kill -TERM <pid>       # polite shutdown
    kill -9 <pid>          # last resort; bypasses cleanup
For services that must outlive your shell, you don't run them with ./myapp & — you write a systemd unit (Day 5). For now: systemctl status nginx, systemctl restart nginx, systemctl enable nginx (start at boot).
4. Logs
Two main log surfaces:
- journald — systemd's binary log database. Query with journalctl -u nginx -f (follow nginx logs) or journalctl -p err --since "1 hour ago".
- Plaintext under /var/log — /var/log/syslog, /var/log/nginx/access.log, etc. Read with tail -f, less +F, grep.
    # When a service won't start
    journalctl -u myapp.service -n 200 --no-pager

    # When you suspect disk pressure
    df -h                         # disk usage by mount
    du -sh /var/log/* | sort -h   # biggest log files

    # When something's slow but you don't know what
    top -o %CPU                   # who's burning CPU
    iostat 1                      # disk activity
    ss -s                         # connection counts

    # When the service is hot but the box looks idle
    strace -p <pid> -c -f         # what syscalls is it making?
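The disk-pressure check can be turned into a one-line alert. The sketch below reads `df -P` output (the POSIX-format listing) from stdin and prints any mount at or above a threshold. The function name `disk_alert` is hypothetical, and mount points containing spaces are not handled.

```shell
# Read `df -P` output on stdin; print mounts whose usage is >= the given percent.
disk_alert() {
    awk -v limit="$1" '
        NR > 1 {                     # skip the header row
            use = $5                 # Capacity column, e.g. "91%"
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6, use "%"
        }'
}

# Example:
#   df -P | disk_alert 80
```

Reading from stdin keeps the function easy to try against saved output from another machine.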
5. Package management
Ubuntu/Debian use apt; Amazon Linux 2023 and Fedora use dnf. Pin versions, prefer the OS package over curl | sh when you can.
    sudo apt update                       # refresh package index
    sudo apt install -y nginx             # install nginx
    apt list --installed | grep nginx     # check what's there
    sudo apt upgrade                      # upgrade everything
    sudo unattended-upgrades --dry-run    # what auto-updates would do
IAM Roles — The End of Long-Lived AWS Keys
The single biggest day-2 security upgrade for an EC2 box: never put AWS access keys on the instance. Attach an IAM role to the instance instead. The role grants permissions to call AWS APIs (read this S3 bucket, write to that DynamoDB table, fetch this Secrets Manager secret), and the EC2 instance metadata service rotates short-lived credentials automatically.
Always require IMDSv2
The instance metadata service has two versions. v1 is the original: plain GET requests, no authentication. v2 requires a session token, which defends against SSRF attacks (where an app vulnerable to URL-fetch could be tricked into hitting the metadata endpoint and exfiltrating credentials). New instances should always require IMDSv2:
    aws ec2 modify-instance-metadata-options \
      --instance-id i-0a3f… \
      --http-tokens required \
      --http-endpoint enabled
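From inside the instance, the IMDSv2 flow is two requests: a PUT that obtains a session token, then a GET that presents it. The sketch below uses the documented link-local endpoint and header names. The function name `imds_get` is hypothetical, and it only works when run on an EC2 instance.

```shell
# IMDSv2 in two steps: PUT for a session token, then GET with the token header.
IMDS_BASE="http://169.254.169.254/latest"

imds_get() (
    token=$(curl -sSf -X PUT "$IMDS_BASE/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") || exit 1
    curl -sSf -H "X-aws-ec2-metadata-token: $token" "$IMDS_BASE/meta-data/$1"
)

# On an instance with a role attached, this prints the role name:
#   imds_get iam/security-credentials/
```

Once `--http-tokens required` is set, a plain GET without the token header is rejected, which is the behaviour that blunts SSRF.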
Hardening Checklist for the First 30 Minutes
Before the box runs anything important, do this once. The boring list separates a hobby project from a production posture.
- SSH keys only. PasswordAuthentication no, PermitRootLogin no.
- SSH from your IP, not the world. Restrict the security group's port 22 to your office/VPN CIDR. For one-off access from anywhere, use SSM Session Manager (no port 22 needed at all).
- Automatic security updates. sudo apt install unattended-upgrades; sudo dpkg-reconfigure --priority=low unattended-upgrades.
- IAM role over keys. Attach an instance profile; remove any ~/.aws/credentials file you might have left behind.
- IMDSv2 required. One CLI call, infinitely worth it.
- Sensible time sync. chrony on by default; verify with chronyc tracking. TLS, logs, auth tokens all go wrong with skewed clocks.
- Limited inbound. SG opens only what's needed: 80, 443, sometimes 22. Outbound is usually wide-open; restrict if you have a compliance reason.
- Known-good base AMI. Prefer the official Ubuntu / Amazon Linux AMIs from the AWS Marketplace; community AMIs are unverified.
- EBS encryption. Default-on at the account level (the setting applies per region): aws ec2 enable-ebs-encryption-by-default. Costs nothing.
- CloudWatch agent. Sends metrics + logs to CloudWatch out of the box; you'll need it on Day 7 anyway.
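The SSH items on this checklist can be spot-checked with a script. The sketch below scans a config file for the two directives named above. The function names are hypothetical. The first matching line wins, which mirrors how sshd uses the first value it obtains for a keyword.

```shell
# Succeed if the file sets directive $2 to value $3 (keywords are case-insensitive).
sshd_directive_is() (
    file=$1; key=$2; want=$3
    got=$(awk -v k="$key" 'tolower($1) == tolower(k) { print tolower($2); exit }' "$file")
    [ "$got" = "$want" ]
)

# Report on the key-only-login directives; exit non-zero if any is missing or wrong.
check_hardening() (
    file=$1
    status=0
    for pair in PasswordAuthentication:no PermitRootLogin:no; do
        key=${pair%%:*}
        want=${pair#*:}
        if sshd_directive_is "$file" "$key" "$want"; then
            echo "ok:   $key $want"
        else
            echo "FAIL: $key is not '$want' in $file"
            status=1
        fi
    done
    exit $status
)

# Example:
#   check_hardening /etc/ssh/sshd_config.d/00-hardening.conf
```

For the effective configuration across all included files, `sudo sshd -T` prints every setting in the same keyword-value form, so its output saved to a file can be passed to the same check.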
From Empty Box to Live Service — End-to-End
Putting it all together, the path from "nothing" to "a service answering on port 80" looks like this:
- Provision a t3.small Ubuntu 24.04 instance in a public subnet with an SG allowing 22 from your IP and 80/443 from anywhere. Attach an instance profile.
- SSH in as ubuntu. Apply hardening: disable password auth, enable unattended upgrades, require IMDSv2.
- Create a deploy user: sudo adduser deploy --disabled-password, copy your SSH key into /home/deploy/.ssh/authorized_keys with mode 600, and add the user to the sudo group only if it needs sudo.
- Install the runtime: apt install nginx, language runtimes, etc. Day 3 covers nginx in depth; Day 5 covers running your app as a systemd service; Day 6 covers doing all of this in a container instead.
- Open the right port in the SG. Verify ss -tlnp shows your service listening, then curl from your laptop.
- Point DNS at the public IP (or assign an Elastic IP first so it survives a stop/start).
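The deploy-user step is the one most often done by hand and gotten slightly wrong (ownership or mode on the .ssh directory). A minimal sketch under stated assumptions: the `run` wrapper and `create_deploy_user` are hypothetical helpers, the target is Debian/Ubuntu, and `adduser` will still prompt interactively for the user's details.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a password-less user and authorize an existing set of public keys for it.
# $1 = user name, $2 = file containing the public key(s) to authorize.
create_deploy_user() {
    user=$1
    pubkey_file=$2
    run sudo adduser "$user" --disabled-password
    # install sets owner, group, and mode in one step: 700 for the directory...
    run sudo install -d -m 700 -o "$user" -g "$user" "/home/$user/.ssh"
    # ...and 600 for authorized_keys, matching the mode named in the steps above.
    run sudo install -m 600 -o "$user" -g "$user" "$pubkey_file" \
        "/home/$user/.ssh/authorized_keys"
}

# Example (prints the three commands without running them):
#   DRY_RUN=1; create_deploy_user deploy /home/ubuntu/.ssh/authorized_keys
```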
EC2 Alternatives Worth Knowing
Pure EC2 is the most flexible option — and the most labour-intensive. Several adjacent AWS products handle parts of the work for you:
| Service | What you trade | Best for |
|---|---|---|
| Lightsail | Less control, simpler pricing | Personal sites, demos, dev boxes |
| Elastic Beanstalk | Less control, opinionated env | Quick deploy of a single app |
| ECS / Fargate | Containers, no servers to patch | Production microservices once you've done Day 6 |
| EKS | Full Kubernetes, full complexity | Large fleets with platform teams |
| Lambda | Event-driven, no long-running processes | Glue code, event handlers, sporadic workloads |
| App Runner | Container PaaS | Single-service deployments, autoscaling for free |
The skill of running an EC2 box transfers to all of them — the Linux underneath ECS Fargate is the same Linux. Master the box, then move up the abstraction ladder when the cost-of-ops outweighs the cost-of-control.
22 only from your specific public IP, not the office CIDR. Test: aws ec2 describe-security-groups --group-ids sg-… and inspect IpPermissions. Fix: change the source to the office CIDR or set up SSM Session Manager.
2) SSH key. Your colleague's public key isn't in ~/.ssh/authorized_keys on the box. Test: ssh -v <user>@<ip> from their laptop — verbose output shows whether the SG dropped the connection (timeout) or sshd rejected the auth ("Permission denied (publickey)"). The two cases look different on the wire and the fix is correspondingly different.
- AMI — the boot image.
- Type — vCPU, RAM, network shape.
- VPC + subnet — what network it lives on.
- SG — who can talk to it.
- Key — how you log in the first time.
- Role — what it can talk to.
s3:GetObject using credentials in ~/.aws/credentials. The on-call engineer wants to remove the credentials file because the instance is now compromised. What change must precede that, and why is the IAM-role pattern strictly better?