Observability & Performance Deep Dive · 7 of 7

Performance & Scalability — The Toolbox for Load

Slow systems lose users; broken-under-load systems lose customers. The fundamentals haven't changed: measure first, scale the right axis, cache as close to the user as possible, profile before you optimize, and protect your backend with rate limits and backpressure when traffic spikes.

Vertical / Horizontal · Caching · Profiling · Load testing · Rate limiting · Backpressure
Scaling

Vertical vs Horizontal

Axis | How | When it fits
Vertical (scale up) | Bigger box: more CPU, RAM, faster disk | Stateful systems (DBs); single-process workloads; quick wins.
Horizontal (scale out) | More boxes behind a load balancer | Stateless services; web tiers; queue workers. Needs idempotency.
Read replicas | Scale reads only; writes to primary | Read-heavy DB workloads; analytics queries.
Sharding | Partition data across nodes by a key | Datasets that outgrow a single primary; high write throughput.
Auto-scaling | Add/remove instances on metrics (CPU, queue depth, RPS) | Spiky workloads; cost-sensitive cloud deployments.
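
To make "partition data across nodes by a key" concrete, here is a minimal sketch of hash-based shard routing. The node names in SHARDS and the helper names shard_for and node_for are hypothetical, chosen only for illustration; real systems often layer consistent hashing or a lookup table on top of this idea.

```python
import hashlib

# Hypothetical shard nodes; any list of connection targets would do.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(key: str, num_shards: int) -> int:
    """Map a partition key to a shard index using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # hashlib digests are identical across processes and machines.
    # The built-in hash() is salted per process for strings, so it
    # would route the same key differently after a restart.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def node_for(user_id: str) -> str:
    """Pick the node that owns this user's rows."""
    return SHARDS[shard_for(user_id, len(SHARDS))]
```

One design caveat: with plain modulo routing, changing the number of shards remaps most keys, which is why resharding is usually planned rather than improvised.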
Caching

Closer Is Cheaper

Browser cache → CDN edge → Reverse proxy → App-level cache (Redis / Memcached) → Database cache → Origin DB
  • Cache-aside (lazy load) — most common; check cache, miss → load DB → fill cache.
  • Write-through — write hits cache and DB synchronously; consistency at write cost.
  • Write-behind — write hits cache, async to DB; fast but risk on crash.
  • TTLs — short for personalized data; long for catalogs; never 0.
  • Cache invalidation — the hard part. Tag-based or event-driven beats hoping TTL is fast enough.
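
The cache-aside flow described in the list (check cache, miss, load from DB, fill cache) can be sketched as follows. This is an illustration under stated assumptions: an in-process dict stands in for Redis or Memcached, and the CacheAside class, its loader callback, and the injectable clock are hypothetical names, not any library's API.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Lazy-loading cache with a per-entry TTL (in-memory stand-in)."""

    def __init__(self, loader: Callable[[str], object], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader      # fetches the value from the origin (e.g. the DB)
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> object:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # hit: entry exists and has not expired
        value = self._loader(key)  # miss or expired: go to the origin
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Event-driven invalidation: drop the entry when the source changes."""
        self._store.pop(key, None)
```

Calling invalidate from the write path is the event-driven alternative to waiting for the TTL to lapse.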
Profiling

Find Hot Spots Before Optimizing

  • CPU profilers — Linux perf, async-profiler (JVM), py-spy, Go pprof, .NET dotnet-trace.
  • Memory profilers — heap dumps + analyzers; track allocations not just usage.
  • SQL profiling — EXPLAIN, slow-query logs, pg_stat_statements.
  • Continuous profiling — Pyroscope, Parca, Datadog/Granulate run profilers always-on; tie samples to traces.
  • Flame graphs — vertical = stack depth, horizontal = sample share. Wide bars = where to look.
  • "Make it work, make it right, make it fast." Profile first; intuition lies.
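
As a small illustration of "profile first", the sketch below uses Python's standard-library cProfile and pstats. Note that cProfile is a deterministic (tracing) profiler, unlike sampling tools such as py-spy named above, so it adds more overhead. The functions slow_sum, handler, and profile_report are hypothetical examples.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Deliberately loop-heavy so it dominates the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handler() -> int:
    """Stand-in for a request handler with one hot spot."""
    return slow_sum(200_000) + sum(range(1000))


def profile_report(func, top: int = 5) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(handler))
```

In the report, slow_sum should account for nearly all of the cumulative time, which is the signal to look there before touching anything else.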
Load Testing

Find the Cliff Before Users Do

Tool | Strength
k6 | JS scripts, Grafana-friendly, dev-ergonomic.
JMeter | GUI, plugin-rich, mature; XML-heavy.
Gatling | Scala/Java DSL; great reports.
Locust | Python; distributed; readable scenarios.
Artillery / Vegeta / wrk | Lightweight CLIs for quick HTTP load.

Run load tests (sustained), stress tests (find the breakpoint), soak tests (memory leaks under hours of traffic), and spike tests (sudden 10×). Test against prod-like data, not empty DBs.

Protect the Backend

Rate Limiting, Backpressure, Circuit Breakers

  • Rate limiting — cap requests per client / IP / API key. Token bucket and leaky bucket are the classics.
  • Backpressure — when a downstream is slow, slow down the upstream too; bounded queues, not infinite.
  • Circuit breakers — stop calling a failing dependency for a cooling period; recover progressively.
  • Bulkheads — partition resources so one bad dependency can't drown the whole process (separate thread pools / clients).
  • Timeouts everywhere. No timeout = waiting forever on a hung peer.
  • Retries with jitter. Naive retries cause synchronized thundering herds.
  • Load shedding. Drop low-priority traffic to keep critical paths alive.
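
Two of the items above lend themselves to a short sketch: a token bucket for rate limiting, and retry delays with jitter. The class and function names (TokenBucket, backoff_delay) and the default parameters are assumptions made for illustration. The jitter variant shown is "full jitter", where each delay is drawn uniformly between zero and an exponentially growing cap.

```python
import random
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable for deterministic tests
        self._tokens = capacity      # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that passed, never beyond capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                 # caller should reject, e.g. with HTTP 429


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 10.0) -> float:
    """Full-jitter backoff: uniform delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

In practice one bucket is kept per client, IP, or API key, and the random delay keeps clients that failed at the same moment from retrying at the same moment.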
Common Pitfalls

Things That Bite

  • p99, not average. Averages hide the slow tail; p99 is the latency that the slowest 1% of requests meet or exceed.
  • The N+1 query. One outer + N inner DB calls per request — the most common silent killer.
  • Synchronous fan-out. 10 services called in series = sum of 10 latencies. Call independent ones in parallel and the total drops toward the slowest single call.
  • Cache stampedes. A hot key expiring → 1000 concurrent reloads. Use single-flight / probabilistic early refresh.
  • Premature horizontal scaling. A 4× bigger box is often cheaper and simpler than a fleet of small ones.
  • Optimizing without measurement. "I have a hunch" is how you spend a week speeding up code that runs once a day.
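
To see why the average hides the tail, here is a small worked example. The latency samples are synthetic, made up for illustration, and the percentile helper uses the nearest-rank definition, which is one of several conventions in use.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic data: 97 fast requests at 20 ms and 3 slow ones at 2000 ms.
latencies_ms = [20] * 97 + [2000] * 3

mean_ms = statistics.mean(latencies_ms)   # 79.4
p50_ms = percentile(latencies_ms, 50)     # 20
p99_ms = percentile(latencies_ms, 99)     # 2000

if __name__ == "__main__":
    print(f"mean={mean_ms:.1f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

The mean of 79.4 ms describes no request that actually happened: most took 20 ms, and the ones that hurt took 2000 ms.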