DevOps & CI/CD Deep Dive · 12 of 18

Service Mesh — Networking as a Platform Concern

A service mesh intercepts pod-to-pod traffic — historically with a sidecar proxy, increasingly with eBPF — to enforce mTLS, route by header, retry on failure, shed load, and emit golden-signal metrics. Apps stay simple; the mesh handles the rest.
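As a concrete sketch of what "enforce mTLS" plus identity-based authorization looks like in Istio's API (the `prod` namespace and the `frontend`/`payments` workload names are hypothetical, chosen for illustration):

```yaml
# Require mTLS for every workload in the namespace.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: prod
spec:
  mtls:
    mode: STRICT
---
# Only the frontend's service account may call the payments workload.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payments-allow-frontend
  namespace: prod
spec:
  selector:
    matchLabels:
      app: payments
  action: ALLOW
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/prod/sa/frontend"]
```

The application containers are untouched; the proxies terminate and originate the TLS and evaluate the policy.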

What It Does

The Three Capabilities


  • Security — automatic mTLS between services, identity-based authz policies (workload X may call workload Y).
  • Traffic management — header-based routing, weighted splits (canary), retries, timeouts, circuit breakers, fault injection.
  • Observability — uniform request metrics (latency, error rate, RPS) and distributed-trace headers, all without app instrumentation.
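The traffic-management bullet maps directly onto mesh routing resources. A hedged, Istio-flavored sketch of a 90/10 canary split with retries and a timeout (the `payments` service and its `stable`/`canary` subsets are made-up names; the subsets would be defined in a companion DestinationRule):

```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: payments
spec:
  hosts:
    - payments
  http:
    - route:
        - destination:
            host: payments
            subset: stable
          weight: 90
        - destination:
            host: payments
            subset: canary
          weight: 10
      retries:
        attempts: 2
        perTryTimeout: 2s
      timeout: 10s
```

Promoting the canary is just editing the weights; no application redeploy is involved.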
Players

Pick by Operational Cost

| Mesh | Approach | Notes |
| --- | --- | --- |
| Istio | Envoy sidecars (or Ambient mode without sidecars) | Most features; biggest footprint; Ambient mode dropped per-pod sidecars in 2024. |
| Linkerd | Tiny Rust micro-proxy sidecar | Simpler, lighter, faster to learn; smaller feature surface. |
| Cilium Service Mesh | eBPF in the kernel, no sidecar at all | Lower overhead, fewer containers, but ties you to the Cilium CNI. |
| Consul Connect | Envoy sidecars driven by the Consul service catalog | Strong outside K8s: VMs, multi-DC. |
| AWS App Mesh / GCP Anthos | Cloud-managed Envoy | Less independent operation, more vendor surface. |
Tradeoffs

Don't Adopt Reflexively

  • Cost is real. Sidecars add a proxy container to every pod, add per-hop latency, and burn CPU/memory. Ambient/eBPF modes help but bring their own complexity.
  • Debugging gets harder. A 503 might come from your app, the sidecar, the destination's sidecar, or a policy.
  • Upgrades are scary — mesh CRDs evolve; Envoy/Istio rev quickly.
  • Often unneeded. If you run fewer than ~20 services, a couple of NetworkPolicies plus OpenTelemetry SDKs deliver most of the benefit.
  • Gateway API is now the standard for north-south traffic; some mesh features are migrating there.
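For comparison, the plain-Kubernetes alternative mentioned above: a NetworkPolicy that restricts who may reach a workload, with no mesh installed (the namespace and `app` labels are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-ingress
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: payments
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
```

Note the tradeoff: NetworkPolicy filters at the IP/port level with no workload identity or mTLS, which is exactly the gap a mesh fills if you later need it.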