Kafka's big idea: model the world as an append-only, partitioned, durable log. Producers append; consumers read at their own pace and remember their own offset. The same stream is consumed by ten different services for ten different purposes — today and next year. The log is the truth; everything else is a projection.
Bug fix in the order-totals projection? Rewind that consumer to the start of the topic and replay. New analytics service launching? Read the topic from the beginning to backfill its model. With a queue, the message is gone the moment it's consumed; with a log, every consumer can rewind independently.
The same OrderPlaced stream feeds the warehouse, billing, fraud detection, recommendations training, the analytics warehouse, and the customer-email service. Each consumer keeps its own offset; the producer never knows. Add a new consumer next quarter without touching anything that exists today.
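The replay and multiple-consumer properties fall out of one design decision: the log keeps everything, and the *consumer* owns its position. A minimal sketch in pure Python (no broker; `Log`, `Consumer`, and the event strings are all illustrative) makes the mechanics concrete:

```python
# Minimal sketch of a durable log with consumer-owned offsets.
# Records are never removed, so any consumer can rewind and replay.

class Log:
    def __init__(self):
        self.records = []            # append-only

    def append(self, event):
        self.records.append(event)

class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0              # each consumer remembers its own position

    def poll(self):
        batch = self.log.records[self.offset:]
        self.offset = len(self.log.records)
        return batch

    def rewind(self):
        self.offset = 0              # replay from the start, e.g. after a bug fix

log = Log()
log.append("OrderPlaced:1")
log.append("OrderPlaced:2")

billing = Consumer(log)
analytics = Consumer(log)            # added later; the producer is unchanged

print(billing.poll())                # ['OrderPlaced:1', 'OrderPlaced:2']
log.append("OrderPlaced:3")
print(billing.poll())                # ['OrderPlaced:3']
print(analytics.poll())              # all three, at its own pace
billing.rewind()
print(len(billing.poll()))           # 3: a full replay
```

Note that adding `analytics` required no change to the log or to `billing`; that independence is what the queue model cannot give you.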
Append-only writes are sequential — disks love that. Kafka regularly handles millions of messages per second per cluster. Partitioning fans the load across brokers, and read throughput scales with partition count: a consumer group can run at most one consumer per partition, so the practical question is whether you gave the group enough partitions.
Once events are on a stream, you can transform them in flight: filter, join, aggregate over windows, materialize into a search index or cache. Kafka Streams, Apache Flink, ksqlDB, Spark Structured Streaming. This is how real-time fraud detection, leaderboards, and ML feature pipelines get built.
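A windowed aggregation is the canonical in-flight transform. Here is a toy tumbling-window count in plain Python standing in for what Kafka Streams or Flink would do continuously (the 60-second window size and the event tuples are illustrative):

```python
# Tumbling-window count: bucket events by (window_start, user) where each
# window covers WINDOW seconds. A stream processor does this incrementally;
# a single pass over a finished list models the result.
from collections import Counter

WINDOW = 60  # seconds

def window_counts(events):
    counts = Counter()
    for ts, user in events:
        window_start = ts - (ts % WINDOW)   # align to the window boundary
        counts[(window_start, user)] += 1
    return counts

events = [(5, "u1"), (30, "u1"), (65, "u1"), (70, "u2")]
print(dict(window_counts(events)))
# {(0, 'u1'): 2, (60, 'u1'): 1, (60, 'u2'): 1}
```

The real engines add what this sketch omits: incremental state, late-event handling, and emitting results as windows close.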
Order is only guaranteed within a partition. Pick a partition key that groups events that must stay in order — usually the entity ID (user_id, order_id, account_id). Same key → same partition → same consumer → same order.
Pick badly and you get hot partitions: one consumer pinned at 100% while the others idle. Picking too coarse-grained a key (customer_country) is the classic trap.
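Both behaviors are easy to see with a toy partitioner. This sketch uses CRC32 mod partition count, which is illustrative, not Kafka's actual murmur2-based default partitioner; the key names and partition count are made up:

```python
# Key-based partitioning: same key -> same partition, so per-key order holds.
# A coarse key (country) concentrates load; a fine key (order id) spreads it.
import zlib
from collections import Counter

def partition_for(key: str, num_partitions: int) -> int:
    return zlib.crc32(key.encode()) % num_partitions

NUM_PARTITIONS = 6

# Determinism: every OrderPlaced for order-42 lands on one partition.
assert partition_for("order-42", NUM_PARTITIONS) == partition_for("order-42", NUM_PARTITIONS)

# Coarse key: 90% of traffic comes from one country -> one hot partition.
coarse = Counter(partition_for(c, NUM_PARTITIONS)
                 for c in ["US"] * 90 + ["DE"] * 5 + ["JP"] * 5)

# Fine key: per-order keys spread the same 100 messages across partitions.
fine = Counter(partition_for(f"order-{i}", NUM_PARTITIONS) for i in range(100))

print("coarse:", dict(coarse))   # one partition takes ~90 messages
print("fine:  ", dict(fine))     # roughly balanced
```

With the coarse key, the consumer assigned the "US" partition does almost all the work while its peers idle — exactly the hot-partition failure mode described above.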
Idempotency is your friend. Process by a unique event_id and ignore duplicates — cheaper than chasing exactly-once.
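A sketch of that idea, with a plain `set` standing in for the dedupe store (in production the seen-IDs would live in the same datastore as the side effect, so the two commit together; the event shape here is invented):

```python
# Idempotent consumer: at-least-once delivery means duplicates will arrive.
# Dedupe on event_id so reprocessing a redelivered event is a no-op.

processed_ids = set()
balance = 0

def handle(event: dict):
    global balance
    if event["event_id"] in processed_ids:
        return                           # duplicate delivery: skip the side effect
    processed_ids.add(event["event_id"])
    balance += event["amount"]

# Simulate the broker redelivering evt-1 after a consumer crash:
e1 = {"event_id": "evt-1", "amount": 100}
for delivery in [e1, e1, {"event_id": "evt-2", "amount": 50}]:
    handle(delivery)

print(balance)  # 150: the duplicate of evt-1 was ignored
```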
Events outlive the code that wrote them. Use a schema (Avro, Protobuf, JSON Schema) and a registry (Confluent Schema Registry, Apicurio, Buf Schema Registry) so producers and consumers can evolve independently.
Rule of thumb: only add optional fields, never repurpose one. Enforce backward and forward compatibility in CI.
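The consumer-side half of that rule is the tolerant reader: ignore fields you don't know, default fields that aren't there. A sketch with plain JSON (no registry; the field names and the v1/v2 split are invented for illustration):

```python
# Tolerant reader: forward-compatible (unknown fields from newer producers
# are ignored) and backward-compatible (optional fields added later get a
# default when reading old events).
import json

def read_order(raw: str) -> dict:
    data = json.loads(raw)
    return {
        "order_id": data["order_id"],               # required since v1
        "gift_wrap": data.get("gift_wrap", False),  # optional, added in v2
        # anything else a newer producer sent is simply not looked at
    }

v1_event = '{"order_id": "o-1"}'
v2_event = '{"order_id": "o-2", "gift_wrap": true, "loyalty_tier": "gold"}'

print(read_order(v1_event))  # old event: optional field defaulted
print(read_order(v2_event))  # newer event: unknown field ignored
```

A schema registry automates the other half: it rejects a producer's new schema at publish time if it would break readers, which is the check you then mirror in CI.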
How do you write to your DB and publish an event without one of them silently failing? Write the event to an outbox table in the same DB transaction; a separate process (or Debezium-style CDC) reads the table and publishes to Kafka. Atomic on the DB side, eventually consistent on the bus side. Essential whenever a state change must produce an event.
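The pattern fits in a few lines. In this sketch sqlite3 stands in for the application database and a list stands in for the Kafka producer; table and column names are illustrative:

```python
# Transactional outbox: the state change and the outbox row commit (or roll
# back) together. A separate relay publishes pending rows to the bus.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, published INTEGER DEFAULT 0)")

def place_order(order_id: str, total: float):
    with db:  # one transaction: both inserts succeed or neither does
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"type": "OrderPlaced", "order_id": order_id}),))

kafka_topic = []  # stand-in for the real producer

def relay_outbox():
    # In real life this is a separate poller process or Debezium-style CDC.
    rows = db.execute("SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        kafka_topic.append(json.loads(payload))                  # publish
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

place_order("o-1", 99.0)
relay_outbox()
print(kafka_topic)  # [{'type': 'OrderPlaced', 'order_id': 'o-1'}]
```

Note the relay can crash between publish and the `published = 1` update, which means a row may be published twice — delivery is at-least-once, and the idempotent-consumer pattern above absorbs it.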
A regular topic retains by time. A compacted topic retains by key — the last value for each key is kept forever, older values for the same key are eventually deleted. Lets the topic act as a permanent "current state" view: replaying gives you the latest of everything. Used for config distribution, slowly-changing dimensions, and stateful stream processors.
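The end state of compaction is just "last write wins per key". Real compaction runs in the background over closed log segments; a single pass over an in-memory list (keys and values invented) models the result:

```python
# Log compaction sketch: keep only the latest record per key, so replaying
# the compacted topic yields the current state of every key.

records = [  # (key, value), oldest first
    ("user-1", {"plan": "free"}),
    ("user-2", {"plan": "free"}),
    ("user-1", {"plan": "pro"}),   # supersedes the first user-1 record
]

def compact(records):
    latest = {}
    for key, value in records:     # later records win per key
        latest[key] = value
    return latest

print(compact(records))
# {'user-1': {'plan': 'pro'}, 'user-2': {'plan': 'free'}}
```

This is why a compacted topic can bootstrap a cache or a stream processor's state store: read it end to end and you hold the latest value of everything.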
Stream every row change in a database (Postgres, MySQL, MongoDB) onto a Kafka topic. Tools: Debezium, AWS DMS. Lets downstream services react to DB changes without modifying the source app, and is the core of "the database is just another producer" architectures.
| Broker | Notes |
|---|---|
| Apache Kafka | The de-facto standard. Self-managed, or via Confluent Cloud, AWS MSK, Aiven, Redpanda (Kafka-compatible). |
| AWS Kinesis Data Streams | Managed, Kafka-like. Tighter AWS integration, simpler ops, lower throughput per shard. |
| Apache Pulsar | Log + queue hybrid. Multi-tenant from day one; separates compute (brokers) from storage (BookKeeper); native geo-replication. |
| Google Pub/Sub | Managed pub/sub with persistence; less of a "log" model but covers many streaming use cases on GCP. |
| Azure Event Hubs | Kafka-protocol compatible managed service on Azure. |
| Redpanda | Kafka-API-compatible, written in C++; lower-latency, simpler operationally. |
Pick event streaming when you need durable, ordered, replayable history of business events with multiple independent consumers — the substrate for event-driven architectures, analytics pipelines, audit trails, and CDC.
Don't reach for Kafka for plain background jobs (a queue is simpler) or for one-to-one fire-and-forget messages (use SQS, Service Bus, RabbitMQ). Streaming earns its operational weight when replay and many-consumers-on-one-stream are real requirements, not aspirational ones.