A queue is the simplest form of asynchronous messaging. A producer drops a job onto the queue and walks away; one of N workers picks it up, does the work, acknowledges. If the worker dies before acking, the broker re-delivers. Add more workers to go faster. The plumbing behind every "send the email later," every image-processing pipeline, every long-running job your web request shouldn't wait for.
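That delivery contract can be sketched with a toy in-memory broker. `SimpleBroker` and its methods are invented for illustration and mirror no real broker's API; the point is the shape: deliver, ack, and re-deliver anything that wasn't acked.

```python
import queue

class SimpleBroker:
    """Toy in-memory broker illustrating the queue contract:
    deliver a message, and re-deliver it if the worker never acks."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, msg):
        self._q.put(msg)

    def receive(self):
        return self._q.get_nowait()  # raises queue.Empty when drained

    def nack(self, msg):
        self._q.put(msg)  # unacked work goes back on the queue

broker = SimpleBroker()
broker.send({"job": "welcome_email", "user_id": 42})

msg = broker.receive()
broker.nack(msg)                  # worker "crashed" before acking
assert broker.receive() == msg    # another worker picks the same job up
```

Real brokers do this with acknowledgements and visibility timeouts rather than an explicit `nack` call, but the guarantee is the same: work that was never acked comes back.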
"Sign up" should return in 100ms. Sending the welcome email, generating the avatar, kicking off the analytics event, and warming the recommendations cache should not block that response. Enqueue and return; workers do the slow stuff. The user gets a fast page; you keep the work.
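Enqueue-and-return looks roughly like this sketch, with the stdlib `queue.Queue` standing in for a real broker; `create_user` and the job names are hypothetical placeholders:

```python
import queue

jobs = queue.Queue()  # stand-in for a real broker (SQS, RabbitMQ, ...)

def create_user(email: str) -> int:
    """Placeholder for the real DB insert."""
    return abs(hash(email)) % 10_000

def signup(email: str) -> dict:
    """Handle the request fast; park everything slow on the queue."""
    user_id = create_user(email)  # the only work on the hot path
    for job in ("send_welcome_email", "generate_avatar", "track_signup"):
        jobs.put({"job": job, "user_id": user_id})  # enqueue and move on
    return {"status": "ok", "user_id": user_id}     # respond immediately

resp = signup("ada@example.com")
assert resp["status"] == "ok"
assert jobs.qsize() == 3  # the slow work is waiting for the workers
```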
Black Friday traffic is 10× normal for two hours. Your DB and web tier can't auto-scale that fast. A queue is a buffer — requests pile up, workers drain at their normal rate, the system degrades into latency rather than collapsing into errors. You trade "instant" for "alive."
Worker died mid-job? Broker re-delivers. Email service was down? Worker raises an exception, doesn't ack, the message comes back. Combined with idempotent handlers (see below), you get free resilience.
Producer doesn't know how many workers exist or how long they take. Workers can be deployed and scaled independently. The team that owns the producer can ship without coordinating with the team that owns the worker.
Queues with acknowledgements give you at-least-once delivery. Sometimes the worker finishes the job, then crashes before acking — the message comes back, and a second worker runs it again. Make every handler idempotent: store the job's unique ID and skip if already processed, or design the operation so running it twice is harmless (`UPDATE orders SET status='shipped' WHERE id=42` is naturally idempotent).
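The store-the-ID scheme can be sketched like this, with an in-memory set standing in for the durable store of processed job IDs (in production that would be a DB table or Redis set):

```python
processed = set()  # stand-in for a durable store of handled job IDs

sent = []  # records side effects so duplicates are observable
def send_email(to: str) -> None:
    sent.append(to)

def handle(msg: dict) -> bool:
    """Run the job once; treat re-deliveries of the same ID as no-ops."""
    if msg["id"] in processed:
        return False           # duplicate delivery: ack and drop
    send_email(msg["to"])      # the side effect we must not repeat
    processed.add(msg["id"])
    return True

msg = {"id": "job-123", "to": "ada@example.com"}
assert handle(msg) is True       # first delivery does the work
assert handle(msg) is False      # re-delivery is harmless
assert sent == ["ada@example.com"]
```

Note the remaining window: a crash after `send_email` but before recording the ID still duplicates the send. Closing it requires recording and side effect in one transaction, or a side effect that is itself safe to repeat.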
A message that fails repeatedly — bug in the handler, malformed payload, missing dependency — will block the queue forever if you keep retrying. After N failures, route it to a dead-letter queue for human inspection. Wire alerts on DLQ depth.
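The retry-then-quarantine flow might look like this sketch; `MAX_ATTEMPTS` and the list-backed DLQ are illustrative, and a real setup would use the broker's redrive policy instead:

```python
MAX_ATTEMPTS = 3
dead_letter = []  # stand-in for a real DLQ with alerting on depth

def deliver(msg: dict, handler) -> None:
    """Try the handler up to MAX_ATTEMPTS times, then quarantine."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            handler(msg)
            return                     # success: ack and we're done
        except Exception as err:
            last_error = err           # remember why it failed
    dead_letter.append({"msg": msg, "error": str(last_error)})

def broken_handler(msg):
    raise ValueError("malformed payload")

deliver({"id": "job-9"}, broken_handler)
assert len(dead_letter) == 1                    # poison message quarantined,
assert dead_letter[0]["msg"]["id"] == "job-9"   # not retried forever
```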
Set the visibility timeout (how long the broker waits for an ack before re-delivering) longer than your worst-case job duration plus a margin. Too short, and a slow job triggers a duplicate while it's still running. Too long, and a crashed worker holds messages hostage for ages. Many brokers let workers extend the timeout dynamically as work progresses.
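The sizing rule is just arithmetic; a sketch with illustrative numbers:

```python
def visibility_timeout(worst_case_s: float, margin: float = 1.5) -> float:
    """Slowest job you've ever observed, times a safety margin."""
    return worst_case_s * margin

def will_duplicate(job_s: float, timeout_s: float) -> bool:
    """A job that outlives the timeout gets re-delivered mid-run."""
    return job_s > timeout_s

timeout = visibility_timeout(90.0)        # slowest job ever seen: 90s
assert timeout == 135.0
assert not will_duplicate(90.0, timeout)  # sized correctly: no duplicate
assert will_duplicate(90.0, 60.0)         # too short: duplicate mid-run
```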
If a job fails because a downstream service is down, retrying immediately doesn't help. Use exponential backoff between retries. After a retry budget, send to DLQ. If 1 message in 10,000 always fails — a malformed customer record from 2009 — let it land in DLQ and don't pretend it's transient.
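Exponential backoff with full jitter, as a sketch (base, cap, and the retry budget are illustrative knobs):

```python
import random

def backoff_delays(base: float = 1.0, cap: float = 60.0,
                   attempts: int = 5) -> list[float]:
    """Delay before each retry. The ceiling doubles each attempt
    (capped), and full jitter spreads retries so every worker doesn't
    hammer a recovering downstream service at the same instant."""
    return [random.uniform(0, min(cap, base * 2 ** n))
            for n in range(attempts)]

delays = backoff_delays()
assert len(delays) == 5
# each delay stays under its growing ceiling: 1, 2, 4, 8, 16 seconds
assert all(d <= min(60.0, 2 ** n) for n, d in enumerate(delays))
```

Once the attempts are exhausted, the message goes to the DLQ instead of looping forever.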
It's tempting — SELECT FOR UPDATE SKIP LOCKED on a jobs table works, kind of. It's fine at low scale. At higher throughput it bottlenecks the DB, locks become contention hotspots, and you reinvent every queue feature poorly. Use a real broker once you're past "few jobs per second."
The exception: the outbox pattern, where an outbox table inside the same DB transaction as a state change feeds a real broker. That's a queue-as-a-table done right.
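A sketch of the outbox pattern using SQLite as the stand-in DB; the table names and the polling relay are illustrative, and a real relay would publish each row to the broker before marking it:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         topic TEXT, payload TEXT,
                         published INTEGER DEFAULT 0);
""")

def ship_order(order_id: int) -> None:
    """State change and outbox row commit atomically: either both
    exist or neither, so no event is lost or phantom-published."""
    with db:  # one transaction
        db.execute("INSERT INTO orders (id, status) VALUES (?, 'shipped')",
                   (order_id,))
        db.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                   ("order.shipped", str(order_id)))

def relay() -> list:
    """A separate process polls the outbox and forwards rows to the
    real broker, then marks them published."""
    rows = db.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0"
    ).fetchall()
    db.executemany("UPDATE outbox SET published = 1 WHERE id = ?",
                   [(r[0],) for r in rows])
    return rows  # in production: publish each to the broker here

ship_order(42)
events = relay()
assert len(events) == 1 and events[0][1] == "order.shipped"
assert relay() == []  # already forwarded; nothing left to relay
```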
| Broker | Style | Where It Shines |
|---|---|---|
| RabbitMQ | AMQP — queues + flexible routing exchanges | Task queues, work distribution, complex routing rules. |
| Amazon SQS | Managed FIFO or standard queue | "Just give me a queue" on AWS. Pairs naturally with Lambda. |
| Azure Service Bus | Managed queue + topics | The default on Azure; sessions, transactions, scheduled delivery. |
| Google Cloud Tasks | HTTP-target task queue | Async invocations of HTTP endpoints; fits Cloud Run / App Engine. |
| ActiveMQ Artemis | Self-hosted broker | Enterprise on-prem; supports multiple protocols. |
| Beanstalkd, Sidekiq, Resque | Lightweight task queues (often Redis-backed) | Web app background jobs in Ruby/Python/Node. |
| Redis Streams / Lists | In-memory, lightweight | Quick wins alongside an existing Redis. Plan for persistence carefully. |
Pick a queue when one producer hands off work to one consumer (or one of N workers competing for it), and you don't need fan-out, replay, or ordering across the whole stream. Background jobs, image processing, email, integrations, retries against external APIs — all classic queue territory.
Pick Pub/Sub when one event needs to reach many independent consumers. Pick event streaming (Kafka) when you need replay, long retention, and an ordered append-only log as the source of truth.