Scalability is the property of doing more without things breaking — or breaking gracefully when they finally do. Get the basics right (statelessness, the right caching layer, a database that can keep up) and most apps scale further than their builders expect.
← Back to Architecture| Vertical | Horizontal | |
|---|---|---|
| How | Bigger CPU/RAM/SSD | More instances behind a load balancer |
| Code changes | None | Stateless app, externalized session, distributed cache |
| Ceiling | What the cloud sells | Effectively unlimited (within reason) |
| Failure mode | One big machine fails | Many small failures, easier to absorb |
| Cost curve | Linear, then exponential | Roughly linear with operations overhead |
Default approach: scale vertically until it's awkward, scale horizontally for headroom and resilience. "Scale horizontally first" is fashionable advice that's wrong for most apps.
A stateless app scales by adding instances. A stateful app scales by suffering.
App tier scaling is the easy part. The database is where most apps run into a real ceiling.
EXPLAIN before scaling out.Saturation is inevitable. The question is what happens next.
Rough capacity walk for 10× and 100× growth.
| Stage | Bottleneck | Fix |
|---|---|---|
| 1× — small VM, single Postgres | None | Ship it. |
| 10× reads | Postgres CPU on resolves | Add Redis cache (cache-aside, jittered TTL). Now > 95% of reads never hit Postgres. |
| 100× reads | App-tier CPU on TLS & framework | Run multiple stateless instances behind a load balancer; sessions in Redis; CDN caches 302s for popular codes. |
| 10× writes (link creation bursts) | Postgres write IOPS, DLQ filling | Tune indexes; partition clicks by month; offload analytics to a queue + worker fleet. |
| Multi-region | Cross-region DB latency | Read replicas in each region; route reads locally; writes go to primary region. Accept eventual consistency for stats. |
| 1000× clicks | Single Postgres for clicks | Sample writes (don't store every click); aggregate in a stream processor (Kafka + Flink) into rollup tables; warm tier in OLAP store. |
Notice: each step adds one piece of complexity to remove one bottleneck. Don't ship multi-region sharded analytics on day one.