Kafka's big idea: model the world as an append-only, partitioned, durable log. Producers append; consumers read at their own pace and remember their own offset. The same stream is consumed by ten different services for ten different purposes — today and next year. The log is the truth; everything else is a projection.
Bug fix in the order-totals projection? Rewind that consumer to the start of the topic and replay. New analytics service launching? Read the topic from the beginning to backfill its model. With a queue, the message is gone the moment it's consumed; with a log, every consumer can rewind independently.
The same OrderPlaced stream feeds the warehouse, billing, fraud detection, recommendations training, the analytics warehouse, and the customer-email service. Each consumer keeps its own offset; the producer never knows. Add a new consumer next quarter without touching anything that exists today.
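The replay and multiple-consumer properties fall out of one design decision: the log keeps everything, and the *consumer* owns its position. A minimal sketch in pure Python (no broker; `Log`, `Consumer`, and the event strings are all illustrative) makes the mechanics concrete:

```python
# Minimal sketch of a durable log with consumer-owned offsets.
# Records are never removed, so any consumer can rewind and replay.

class Log:
    def __init__(self):
        self.records = []            # append-only

    def append(self, event):
        self.records.append(event)

class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0              # each consumer remembers its own position

    def poll(self):
        batch = self.log.records[self.offset:]
        self.offset = len(self.log.records)
        return batch

    def rewind(self):
        self.offset = 0              # replay from the start, e.g. after a bug fix

log = Log()
log.append("OrderPlaced:1")
log.append("OrderPlaced:2")

billing = Consumer(log)
analytics = Consumer(log)            # added later; the producer is unchanged

print(billing.poll())                # ['OrderPlaced:1', 'OrderPlaced:2']
log.append("OrderPlaced:3")
print(billing.poll())                # ['OrderPlaced:3']
print(analytics.poll())              # all three, at its own pace
billing.rewind()
print(len(billing.poll()))           # 3: a full replay
```

Note that adding `analytics` required no change to the log or to `billing`; that independence is what the queue model cannot give you.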
Append-only writes are sequential — disks love that. Kafka regularly handles millions of messages per second per cluster. Partitioning fans the load across brokers, and read throughput scales with partition count: a consumer group can run at most one consumer per partition, so the practical question is whether you gave the group enough partitions.
Once events are on a stream, you can transform them in flight: filter, join, aggregate over windows, materialize into a search index or cache. Kafka Streams, Apache Flink, ksqlDB, Spark Structured Streaming. This is how real-time fraud detection, leaderboards, and ML feature pipelines get built.
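A windowed aggregation is the canonical in-flight transform. Here is a toy tumbling-window count in plain Python standing in for what Kafka Streams or Flink would do continuously (the 60-second window size and the event tuples are illustrative):

```python
# Tumbling-window count: bucket events by (window_start, user) where each
# window covers WINDOW seconds. A stream processor does this incrementally;
# a single pass over a finished list models the result.
from collections import Counter

WINDOW = 60  # seconds

def window_counts(events):
    counts = Counter()
    for ts, user in events:
        window_start = ts - (ts % WINDOW)   # align to the window boundary
        counts[(window_start, user)] += 1
    return counts

events = [(5, "u1"), (30, "u1"), (65, "u1"), (70, "u2")]
print(dict(window_counts(events)))
# {(0, 'u1'): 2, (60, 'u1'): 1, (60, 'u2'): 1}
```

The real engines add what this sketch omits: incremental state, late-event handling, and emitting results as windows close.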
Order is only guaranteed within a partition. Pick a partition key that groups events that must stay in order — usually the entity ID (user_id, order_id, account_id). Same key → same partition → same consumer → same order.
Pick badly and you get hot partitions: one consumer pinned at 100% while the others idle. Picking too coarse-grained a key (customer_country) is the classic trap.
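Both behaviors are easy to see with a toy partitioner. This sketch uses CRC32 mod partition count, which is illustrative, not Kafka's actual murmur2-based default partitioner; the key names and partition count are made up:

```python
# Key-based partitioning: same key -> same partition, so per-key order holds.
# A coarse key (country) concentrates load; a fine key (order id) spreads it.
import zlib
from collections import Counter

def partition_for(key: str, num_partitions: int) -> int:
    return zlib.crc32(key.encode()) % num_partitions

NUM_PARTITIONS = 6

# Determinism: every OrderPlaced for order-42 lands on one partition.
assert partition_for("order-42", NUM_PARTITIONS) == partition_for("order-42", NUM_PARTITIONS)

# Coarse key: 90% of traffic comes from one country -> one hot partition.
coarse = Counter(partition_for(c, NUM_PARTITIONS)
                 for c in ["US"] * 90 + ["DE"] * 5 + ["JP"] * 5)

# Fine key: per-order keys spread the same 100 messages across partitions.
fine = Counter(partition_for(f"order-{i}", NUM_PARTITIONS) for i in range(100))

print("coarse:", dict(coarse))   # one partition takes ~90 messages
print("fine:  ", dict(fine))     # roughly balanced
```

With the coarse key, the consumer assigned the "US" partition does almost all the work while its peers idle — exactly the hot-partition failure mode described above.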
Idempotency is your friend. Process by a unique event_id and ignore duplicates — cheaper than chasing exactly-once.
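A sketch of that idea, with a plain `set` standing in for the dedupe store (in production the seen-IDs would live in the same datastore as the side effect, so the two commit together; the event shape here is invented):

```python
# Idempotent consumer: at-least-once delivery means duplicates will arrive.
# Dedupe on event_id so reprocessing a redelivered event is a no-op.

processed_ids = set()
balance = 0

def handle(event: dict):
    global balance
    if event["event_id"] in processed_ids:
        return                           # duplicate delivery: skip the side effect
    processed_ids.add(event["event_id"])
    balance += event["amount"]

# Simulate the broker redelivering evt-1 after a consumer crash:
e1 = {"event_id": "evt-1", "amount": 100}
for delivery in [e1, e1, {"event_id": "evt-2", "amount": 50}]:
    handle(delivery)

print(balance)  # 150: the duplicate of evt-1 was ignored
```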
Events outlive the code that wrote them. Use a schema (Avro, Protobuf, JSON Schema) and a registry (Confluent Schema Registry, Apicurio, Buf Schema Registry) so producers and consumers can evolve independently.
Rule of thumb: only add optional fields, never repurpose one. Enforce backward and forward compatibility in CI.
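The consumer-side half of that rule is the tolerant reader: ignore fields you don't know, default fields that aren't there. A sketch with plain JSON (no registry; the field names and the v1/v2 split are invented for illustration):

```python
# Tolerant reader: forward-compatible (unknown fields from newer producers
# are ignored) and backward-compatible (optional fields added later get a
# default when reading old events).
import json

def read_order(raw: str) -> dict:
    data = json.loads(raw)
    return {
        "order_id": data["order_id"],               # required since v1
        "gift_wrap": data.get("gift_wrap", False),  # optional, added in v2
        # anything else a newer producer sent is simply not looked at
    }

v1_event = '{"order_id": "o-1"}'
v2_event = '{"order_id": "o-2", "gift_wrap": true, "loyalty_tier": "gold"}'

print(read_order(v1_event))  # old event: optional field defaulted
print(read_order(v2_event))  # newer event: unknown field ignored
```

A schema registry automates the other half: it rejects a producer's new schema at publish time if it would break readers, which is the check you then mirror in CI.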
How do you write to your DB and publish an event without one of them silently failing? Write the event to an outbox table in the same DB transaction; a separate process (or Debezium-style CDC) reads the table and publishes to Kafka. Atomic on the DB side, eventually consistent on the bus side. Essential whenever a state change must produce an event.
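The pattern fits in a few lines. In this sketch sqlite3 stands in for the application database and a list stands in for the Kafka producer; table and column names are illustrative:

```python
# Transactional outbox: the state change and the outbox row commit (or roll
# back) together. A separate relay publishes pending rows to the bus.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, published INTEGER DEFAULT 0)")

def place_order(order_id: str, total: float):
    with db:  # one transaction: both inserts succeed or neither does
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"type": "OrderPlaced", "order_id": order_id}),))

kafka_topic = []  # stand-in for the real producer

def relay_outbox():
    # In real life this is a separate poller process or Debezium-style CDC.
    rows = db.execute("SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        kafka_topic.append(json.loads(payload))                  # publish
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

place_order("o-1", 99.0)
relay_outbox()
print(kafka_topic)  # [{'type': 'OrderPlaced', 'order_id': 'o-1'}]
```

Note the relay can crash between publish and the `published = 1` update, which means a row may be published twice — delivery is at-least-once, and the idempotent-consumer pattern above absorbs it.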
A regular topic retains by time. A compacted topic retains by key — the last value for each key is kept forever, older values for the same key are eventually deleted. Lets the topic act as a permanent "current state" view: replaying gives you the latest of everything. Used for config distribution, slowly-changing dimensions, and stateful stream processors.
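The end state of compaction is just "last write wins per key". Real compaction runs in the background over closed log segments; a single pass over an in-memory list (keys and values invented) models the result:

```python
# Log compaction sketch: keep only the latest record per key, so replaying
# the compacted topic yields the current state of every key.

records = [  # (key, value), oldest first
    ("user-1", {"plan": "free"}),
    ("user-2", {"plan": "free"}),
    ("user-1", {"plan": "pro"}),   # supersedes the first user-1 record
]

def compact(records):
    latest = {}
    for key, value in records:     # later records win per key
        latest[key] = value
    return latest

print(compact(records))
# {'user-1': {'plan': 'pro'}, 'user-2': {'plan': 'free'}}
```

This is why a compacted topic can bootstrap a cache or a stream processor's state store: read it end to end and you hold the latest value of everything.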
Stream every row change in a database (Postgres, MySQL, MongoDB) onto a Kafka topic. Tools: Debezium, AWS DMS. Lets downstream services react to DB changes without modifying the source app, and is the core of "the database is just another producer" architectures.
| Broker | Notes |
|---|---|
| Apache Kafka | The de-facto standard. Self-managed, or via Confluent Cloud, AWS MSK, Aiven, Redpanda (Kafka-compatible). |
| AWS Kinesis Data Streams | Managed, Kafka-like. Tighter AWS integration, simpler ops, lower throughput per shard. |
| Apache Pulsar | Log + queue hybrid. Multi-tenant from day one; separates compute (brokers) from storage (BookKeeper); native geo-replication. |
| Google Pub/Sub | Managed pub/sub with persistence; less of a "log" model but covers many streaming use cases on GCP. |
| Azure Event Hubs | Kafka-protocol compatible managed service on Azure. |
| Redpanda | Kafka-API-compatible, written in C++; lower-latency, simpler operationally. |
Pick event streaming when you need durable, ordered, replayable history of business events with multiple independent consumers — the substrate for event-driven architectures, analytics pipelines, audit trails, and CDC.
Don't reach for Kafka for plain background jobs (a queue is simpler) or for one-to-one fire-and-forget messages (use SQS, Service Bus, RabbitMQ). Streaming earns its operational weight when replay and many-consumers-on-one-stream are real requirements, not aspirational ones.