Concurrency is structuring a program so multiple things make progress at the same time. Parallelism is actually running them on multiple cores. They sound the same and aren't. Get the model right and you ship a fast, correct service. Get it wrong and you ship Heisenbugs that vanish under a debugger.
| Model | Languages | How It Works |
|---|---|---|
| OS threads + locks | C, C++, Java, C#, Rust | Kernel-scheduled threads sharing memory. Synchronize with mutexes, condition variables, atomics. |
| Event loop / single-threaded async | JavaScript (Node, browser) | One thread, callback queue. Non-blocking I/O wakes callbacks. async/await hides the queue. |
| Coroutines / async runtime | Python (asyncio), Rust (tokio), C# (Task), Kotlin | Cooperative tasks scheduled on a small thread pool. Awaiting yields control. |
| Goroutines + channels | Go | Cheap green threads (M:N scheduler). Communicate by sending values through channels, not by sharing memory. |
| Actors | Erlang, Elixir, Akka | Isolated processes that exchange messages. No shared memory. Failure isolation built in. |
| Software transactional memory | Clojure, Haskell | Read/write inside a transaction; runtime retries on conflict. Composable, but rare in practice. |
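The goroutines-and-channels row is the easiest to show concretely. Here is a minimal sketch (all names are illustrative, standard library only): each worker owns the values it receives through the channel, so there is nothing to lock.

```go
package main

import "fmt"

// worker squares each job it receives and sends the result back.
// Ownership of each value transfers through the channels: no locks,
// no shared mutable state.
func worker(jobs <-chan int, results chan<- int) {
	for j := range jobs {
		results <- j * j
	}
}

func sumOfSquares(n int) int {
	jobs := make(chan int)
	results := make(chan int)
	for w := 0; w < 4; w++ { // a small pool of goroutines
		go worker(jobs, results)
	}
	go func() {
		for i := 1; i <= n; i++ {
			jobs <- i
		}
		close(jobs) // workers' range loops end when jobs is drained
	}()
	total := 0
	for i := 0; i < n; i++ {
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares(10)) // 385
}
```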
I/O-bound work: waiting on the network, the disk, a database. The CPU is idle nearly all the time. Async/event-loop models win here: one thread can juggle thousands of in-flight requests because each request's "work" is mostly waiting. Threads work too, but you pay per-thread memory (1–8MB stacks) just to wait.
CPU-bound work: crunching numbers, image processing, ML inference. Here you want actual parallelism: one thread per core. Async doesn't help; it just shuffles work on the same core. In Python, async is useless here and the GIL prevents threads from helping: use multiprocessing, native extensions, or a different language.
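What one-thread-per-core looks like in a language without a GIL: a Go sketch (parallelSum is an illustrative name) that splits a range across NumCPU goroutines, each writing only its own output slot.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum splits [0, n) into one chunk per core and sums the
// chunks in parallel. Each goroutine is CPU-bound on its own slice
// of the range and owns one slot of partial, so there is no race.
func parallelSum(n int) int64 {
	workers := runtime.NumCPU()
	chunk := (n + workers - 1) / workers
	partial := make([]int64, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			lo, hi := w*chunk, (w+1)*chunk
			if hi > n {
				hi = n
			}
			var s int64
			for i := lo; i < hi; i++ {
				s += int64(i)
			}
			partial[w] = s
		}(w)
	}
	wg.Wait()
	var total int64
	for _, s := range partial {
		total += s
	}
	return total
}

func main() {
	fmt.Println(parallelSum(1_000_000)) // 499999500000
}
```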
Mixed workloads: most real services. Use async for the I/O hot path; offload CPU spikes to a thread pool, worker process, or external queue. Never run a CPU-heavy task inside the event loop: it blocks every request queued behind it.
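One way to keep CPU spikes from starving the hot path, sketched in Go with a buffered channel as a counting semaphore (cpuSlots and heavy are hypothetical stand-ins): at most NumCPU compute bursts run at once, while I/O-handling goroutines stay schedulable.

```go
package main

import (
	"fmt"
	"runtime"
)

// cpuSlots is a counting semaphore: at most NumCPU CPU-heavy tasks
// run at once, so request-handling goroutines (the I/O path) are
// never crowded out by a spike of compute work.
var cpuSlots = make(chan struct{}, runtime.NumCPU())

// heavy is a stand-in for a CPU burst (hashing, image resize, ...).
func heavy(n int) int {
	cpuSlots <- struct{}{}        // acquire a slot (blocks when full)
	defer func() { <-cpuSlots }() // release it
	sum := 0
	for i := 0; i < n; i++ {
		sum += i
	}
	return sum
}

func main() {
	done := make(chan int)
	for i := 0; i < 100; i++ { // 100 "requests", bounded compute
		go func() { done <- heavy(1_000_000) }()
	}
	for i := 0; i < 100; i++ {
		<-done
	}
	fmt.Println("all requests served")
}
```

The same shape works with any semaphore primitive; the buffered channel is just the idiomatic Go spelling.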
Race conditions: two threads read-modify-write the same memory and the result depends on who got there first. counter++ is not atomic; it's load, add, store. With two threads and no lock you lose updates. Fix: a mutex, an atomic, or a single-owner pattern (channels, actors).
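The counter++ failure and its atomic fix, as a runnable Go sketch (function names are mine). Run the racy version under -race to watch it get flagged.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// racyCount increments a plain int from many goroutines. counter++
// is load, add, store, so concurrent increments overwrite each other
// and updates get lost (go run -race reports the race).
func racyCount(workers, perWorker int) int {
	var counter int
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				counter++ // data race
			}
		}()
	}
	wg.Wait()
	return counter // often < workers*perWorker
}

// atomicCount fixes it with a single atomic word: each increment is
// one indivisible hardware operation.
func atomicCount(workers, perWorker int) int64 {
	var counter atomic.Int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				counter.Add(1)
			}
		}()
	}
	wg.Wait()
	return counter.Load()
}

func main() {
	fmt.Println(racyCount(8, 100_000))   // nondeterministic
	fmt.Println(atomicCount(8, 100_000)) // always 800000
}
```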
Deadlock: thread A holds lock 1 and waits for lock 2; thread B holds lock 2 and waits for lock 1. Nobody moves. Prevent it by acquiring locks in a fixed global order, using try-lock with timeouts, or eliminating shared locks with message passing.
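The fixed-global-order fix, sketched in Go with a hypothetical account type: every transfer locks the lower id first, so no cycle of waiters can ever form, even when two transfers run in opposite directions.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical balance guarded by its own mutex.
type account struct {
	id      int
	mu      sync.Mutex
	balance int
}

// transfer always locks the lower id first. Because every caller
// follows the same global order, two transfers can never each hold
// one lock while waiting on the other's: no deadlock cycle.
func transfer(from, to *account, amount int) {
	first, second := from, to
	if to.id < from.id {
		first, second = to, from
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()
	from.balance -= amount
	to.balance += amount
}

func main() {
	a := &account{id: 1, balance: 100}
	b := &account{id: 2, balance: 100}
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ { // opposite directions, no deadlock
		wg.Add(2)
		go func() { defer wg.Done(); transfer(a, b, 1) }()
		go func() { defer wg.Done(); transfer(b, a, 1) }()
	}
	wg.Wait()
	fmt.Println(a.balance, b.balance) // 100 100
}
```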
Livelock: threads keep changing state in response to each other but make no progress (two people stepping the same way in a hallway). Starvation: a thread never gets the resource because higher-priority ones keep cutting in. Fairness in lock implementations matters here.
Memory visibility: without a memory barrier, one thread's write may never become visible to another; the CPU and compiler reorder freely. Each language defines a memory model (Java's JMM, C++11, Go's) telling you what's guaranteed. Volatile/atomic/locked operations issue the right barriers; plain reads and writes don't.
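In Go's memory model, for example, an atomic store observed by a later atomic load creates a happens-before edge: every plain write made before the store is visible after the load. A minimal publish/consume sketch (names are illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// A sync/atomic store that a later atomic load observes establishes
// happens-before: everything written before the store is visible
// after the load. A plain bool flag gives no such guarantee; the
// compiler and CPU are free to reorder or cache it.
var (
	msg   string
	ready atomic.Bool
)

func publish() {
	msg = "hello"     // plain write...
	ready.Store(true) // ...made visible by the atomic store
}

func consume() string {
	for !ready.Load() { // atomic load pairs with the store
	}
	return msg // guaranteed to see "hello"
}

func main() {
	go publish()
	fmt.Println(consume())
}
```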
Blocking the event loop: a time.sleep or synchronous DB call freezes every other coroutine on that loop.

| Primitive | What It Does | When to Use |
|---|---|---|
| Mutex / Lock | One holder at a time. | Protect a critical section. Default choice. |
| Read-write lock | Many readers OR one writer. | Read-heavy workloads with infrequent writes. |
| Semaphore | Lock with N permits. | Bound concurrency (e.g., "max 50 in-flight requests"). |
| Condition variable | Wait for a predicate, signal others. | Producer/consumer queues, custom sync. |
| Atomic | Lock-free single-word op (CAS, increment). | Counters, flags, lock-free structures. |
| Channel | Typed queue between goroutines/tasks. | Hand off ownership; "share by communicating." |
| Future / Promise / Task | Handle for an async result. | Composing async work; await consumes it. |
| Barrier / Latch | Wait until N parties arrive. | Bulk-synchronous parallel phases, fan-in. |
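The condition-variable row is the least obvious in practice, so here is a bounded producer/consumer queue sketched with Go's sync.Cond (the queue type is illustrative). Note the for loop around Wait: the predicate must be re-checked after every wakeup.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a bounded FIFO built on condition variables: consumers
// wait while it's empty, producers wait while it's full, and each
// side signals the other after changing the predicate.
type queue struct {
	mu       sync.Mutex
	notEmpty *sync.Cond
	notFull  *sync.Cond
	items    []int
	cap      int
}

func newQueue(capacity int) *queue {
	q := &queue{cap: capacity}
	q.notEmpty = sync.NewCond(&q.mu)
	q.notFull = sync.NewCond(&q.mu)
	return q
}

func (q *queue) put(v int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == q.cap { // always re-check the predicate
		q.notFull.Wait()
	}
	q.items = append(q.items, v)
	q.notEmpty.Signal()
}

func (q *queue) get() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		q.notEmpty.Wait()
	}
	v := q.items[0]
	q.items = q.items[1:]
	q.notFull.Signal()
	return v
}

func main() {
	q := newQueue(2)
	go func() {
		for i := 1; i <= 5; i++ {
			q.put(i)
		}
	}()
	sum := 0
	for i := 0; i < 5; i++ {
		sum += q.get()
	}
	fmt.Println(sum) // 15
}
```

In real Go you'd just use a buffered channel; the sketch shows roughly what the channel is doing for you.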
Prefer immutability: read-only data is automatically thread-safe, and most concurrency bugs evaporate when you stop sharing mutable state. Pass copies, use persistent data structures, treat messages as values.
Keep critical sections tiny: a lock held during a network call is a queue with one server. The pattern is compute, then lock, then update. If a critical section is more than ~10 lines, it's probably wrong.
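The compute-then-lock-then-update shape, sketched in Go (store, cache, and the ToUpper stand-in for slow work are all illustrative): the expensive part happens outside the lock, and the critical section shrinks to one map write.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

var (
	mu    sync.Mutex
	cache = map[string]string{}
)

// The wrong shape locks first and does the slow work while holding
// the mutex, serializing every caller behind one slow computation.
// store does the compute outside the lock and holds it only for the
// map update: the critical section is two lines.
func store(key, raw string) {
	processed := strings.ToUpper(raw) // stand-in for slow compute

	mu.Lock()
	cache[key] = processed
	mu.Unlock()
}

func main() {
	var wg sync.WaitGroup
	for _, k := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(k string) { defer wg.Done(); store(k, k+"-value") }(k)
	}
	wg.Wait()
	fmt.Println(cache["a"]) // A-VALUE
}
```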
Use battle-tested libraries: every language has proven concurrent collections (ConcurrentHashMap, sync.Map, Arc<Mutex<T>>, channel libraries). Custom lock-free code looks fast in microbenchmarks and ships subtle bugs that show up at 3am.
Test under contention: use Go's -race, ThreadSanitizer (C/C++/Rust), Java Flight Recorder. Single-threaded tests will never find a race condition; you have to deliberately create contention. Property-based and chaos tests are worth the setup for anything concurrent.
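A contention harness is only a few lines. A Go sketch (Counter and hammer are hypothetical names): hammer one shared structure from many goroutines, then run it under go run -race or go test -race so the detector has interleavings to observe.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a stand-in for any shared structure under test. The
// point is the harness: many goroutines hitting it at once, which
// is the kind of contention the race detector needs to see a bug.
type Counter struct {
	mu sync.Mutex
	n  int
}

func (c *Counter) Inc() { c.mu.Lock(); c.n++; c.mu.Unlock() }

func (c *Counter) Get() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer runs workers goroutines, each incrementing perWorker times.
func hammer(c *Counter, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				c.Inc()
			}
		}()
	}
	wg.Wait()
}

func main() {
	var c Counter
	hammer(&c, 8, 10_000) // run with: go run -race .
	fmt.Println(c.Get())  // 80000, and no race report
}
```

Delete the mutex from Inc and the same harness makes -race light up immediately, which is exactly what a single-threaded test never would.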