Slow systems lose users; broken-under-load systems lose customers. The fundamentals haven't changed: measure first, scale the right axis, cache as close to the user as possible, profile before you optimize, and protect your back-end with rate limits and backpressure when traffic spikes.
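Protecting the back-end under a spike usually starts with a rate limiter. A minimal token-bucket sketch (class name and parameters are illustrative, not from any particular library):

```python
import time


class TokenBucket:
    """Token-bucket rate limiter: refills `rate` tokens/sec up to `capacity`.

    Requests that find no token are rejected immediately, shedding load
    before it reaches the back-end instead of queueing it forever.
    """

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The capacity absorbs short bursts; the refill rate caps sustained throughput. Rejected requests should get a fast 429-style response so the caller can back off.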
| Axis | How | When it fits |
|---|---|---|
| Vertical (scale up) | Bigger box — more CPU, RAM, faster disk | Stateful systems (DBs); single-process workloads; quick wins. |
| Horizontal (scale out) | More boxes behind a load balancer | Stateless services; web tiers; queue workers. Needs idempotency. |
| Read replicas | Scale reads only; writes to primary | Read-heavy DB workloads; analytics queries. |
| Sharding | Partition data across nodes by a key | Datasets that outgrow a single primary; high write throughput. |
| Auto-scaling | Add/remove instances on metrics (CPU, queue depth, RPS) | Spiky workloads; cost-sensitive cloud deployments. |
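The sharding row above hinges on a stable mapping from partition key to node. A minimal sketch (function name is illustrative; note Python's built-in `hash()` is seeded per process and unsuitable here):

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a partition key (e.g. a user ID) to a shard index.

    A stable cryptographic hash keeps the mapping consistent across
    processes and restarts, and spreads keys evenly across shards.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

One caveat: plain modulo resharding remaps most keys when `num_shards` changes; consistent hashing limits that movement to roughly `1/num_shards` of the keyspace.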
Profile database queries with EXPLAIN, slow-query logs, and pg_stat_statements.

| Tool | Strength |
|---|---|
| k6 | JS scripts, Grafana-friendly, dev-ergonomic. |
| JMeter | GUI, plugin-rich, mature; XML-heavy. |
| Gatling | Scala/Java DSL; great reports. |
| Locust | Python; distributed; readable scenarios. |
| Artillery / Vegeta / wrk | Lightweight CLIs for quick HTTP load. |
Run load tests (sustained), stress tests (find the breakpoint), soak tests (memory leaks under hours of traffic), and spike tests (sudden 10×). Test against prod-like data, not empty DBs.
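Whatever tool you pick, the core loop is the same: fire concurrent requests and report latency percentiles, not averages. A minimal sketch of that loop (names are illustrative; real tools add ramp-up, pacing, and distributed workers):

```python
import concurrent.futures
import statistics
import time


def run_load(target, total_requests: int, concurrency: int) -> dict:
    """Call `target` `total_requests` times with `concurrency` workers
    and report latency percentiles from the measured samples."""
    def one_call(_):
        start = time.perf_counter()
        target()
        return time.perf_counter() - start

    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(one_call, range(total_requests)))

    return {
        "p50": latencies[len(latencies) // 2],
        "p95": latencies[int(len(latencies) * 0.95)],
        "max": latencies[-1],
    }
```

The p95/max gap is the interesting number: a fat tail under sustained load is often the first visible symptom of queueing or GC pressure that an average would hide.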