A distributed sorted map: rows partitioned by hash key, columns sorted within the partition. Built to absorb writes at firehose speed and serve them back by partition + range. The internet's time-series and feed engine.
Tunable consistency: ONE / QUORUM / ALL on reads and writes. CAP, made dial-able (a cqlsh sketch follows the implementations below).
Apache Cassandra. The reference open-source wide-column DB. Masterless, multi-DC, CQL query language. Powers Discord, Apple, Netflix at scale.
ScyllaDB. Cassandra-compatible, rewritten in C++ on a shard-per-core architecture. Same model, several × the throughput per node.
HBase. The Bigtable-on-Hadoop one. HDFS underneath, master-based architecture. Common in batch + analytics stacks.
Google Bigtable. The original. Powers Search, Analytics, Maps. Available as a managed service on GCP.
Managed, Cassandra-compatible. Serverless pricing.
YugabyteDB. Distributed SQL with a Cassandra-compatible YCQL API: wide-column semantics with optional ACID.
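Back to the consistency dial mentioned above: a minimal cqlsh sketch, assuming a hypothetical app.events table (drivers expose the same levels per statement).

    -- Per-request consistency in cqlsh.
    CONSISTENCY QUORUM;   -- a majority of replicas must acknowledge
    INSERT INTO app.events (user_id, ts, kind)
    VALUES (42, toTimestamp(now()), 'login');

    CONSISTENCY ONE;      -- fastest and weakest: any single replica is enough
    SELECT * FROM app.events WHERE user_id = 42;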
Telemetry, IoT readings, click streams, audit logs, chat messages. Millions of writes per second, almost no updates. The LSM-tree thrives on this — sequential writes, no in-place updates, compaction in the background.
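A sketch of that append-only shape in CQL; the table name and columns are hypothetical, and an existing keyspace is assumed.

    -- Audit-log style table: every write is an append, nothing is updated in place.
    CREATE TABLE IF NOT EXISTS audit_events (
      account_id   bigint,
      occurred_at  timeuuid,
      action       text,
      PRIMARY KEY ((account_id), occurred_at)
    ) WITH CLUSTERING ORDER BY (occurred_at DESC);

    -- USING TTL lets retention happen during compaction instead of via DELETEs.
    INSERT INTO audit_events (account_id, occurred_at, action)
    VALUES (42, now(), 'login') USING TTL 2592000;   -- expire after 30 days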
"Last 100 messages in this channel." "All sensor readings for device 42 in the last hour." Partition key = entity, clustering key = timestamp DESC. One disk seek, one sequential read. This is the killer pattern.
Cassandra was designed for it. Writes accepted in any datacenter, replicated asynchronously, conflicts resolved by last-write-wins on the cell timestamp (counters get special CRDT-style merging). Geographic latency stays local; eventual convergence across regions.
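A sketch of how that is declared in Cassandra; the datacenter names (us_east, eu_west) and replica counts are placeholders that must match the cluster's actual topology.

    -- Replication is set per keyspace, per datacenter.
    CREATE KEYSPACE IF NOT EXISTS chat
      WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'us_east': 3,
        'eu_west': 3
      };

    -- LOCAL_QUORUM keeps the acknowledgement round-trip inside the local DC.
    CONSISTENCY LOCAL_QUORUM;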
Add a node, the cluster rebalances, throughput goes up roughly linearly. No master, no sharding ceremony, no read replicas to wire up — the model is "more nodes = more capacity" all the way to thousands.
WHERE on arbitrary columns? Secondary indexes are weak, and if you don't know the access pattern up front, you'll regret picking this.
Partition key choice: high cardinality, even access. user_id is usually fine. country_code is a hot-shard machine, because a few countries dominate. (user_id, day_bucket) avoids unbounded partitions for users who generate millions of events (sketched below).
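A sketch of that bucketed key; the column names and values are assumptions.

    -- country_code as the partition key would pile most traffic onto a few partitions.
    -- (user_id, day_bucket) spreads load evenly and caps growth per user per day.
    CREATE TABLE IF NOT EXISTS events_by_user (
      user_id     bigint,
      day_bucket  int,          -- e.g. 20250611
      event_time  timeuuid,
      event_type  text,
      PRIMARY KEY ((user_id, day_bucket), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC);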
Clustering key: whatever order the read needs (timestamp DESC, message ID, score). It controls on-disk layout, so a range scan is one seek + a sequential read.
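For example, a score-ordered table; the names are hypothetical.

    -- Rows inside each board partition are stored sorted by score, highest first;
    -- a "top 10" read is one sequential slice, no sorting at query time.
    CREATE TABLE IF NOT EXISTS leaderboard (
      board_id   text,
      score      bigint,
      player_id  uuid,
      PRIMARY KEY ((board_id), score, player_id)
    ) WITH CLUSTERING ORDER BY (score DESC, player_id ASC);

    SELECT score, player_id FROM leaderboard WHERE board_id = 'weekly' LIMIT 10;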
Need messages by channel and by author? Two tables, both written to. Denormalization is the design — storage is cheap, reads are the expensive thing. Use logged batches or change-data-capture to keep them in sync.
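A sketch of the two-table write path with a logged batch, reusing the messages_by_channel table from the earlier sketch; names are illustrative and ? marks bind parameters.

    -- Same message, two layouts: one keyed for the channel read, one for the author read.
    CREATE TABLE IF NOT EXISTS messages_by_author (
      author_id   uuid,
      sent_at     timeuuid,
      channel_id  uuid,
      body        text,
      PRIMARY KEY ((author_id), sent_at)
    ) WITH CLUSTERING ORDER BY (sent_at DESC);

    -- A logged batch guarantees both inserts eventually apply (or neither does);
    -- it is for atomicity across the pair, not for bulk-load performance.
    BEGIN BATCH
      INSERT INTO messages_by_channel (channel_id, sent_at, author_id, body)
        VALUES (?, ?, ?, ?);
      INSERT INTO messages_by_author (author_id, sent_at, channel_id, body)
        VALUES (?, ?, ?, ?);
    APPLY BATCH;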
Rule of thumb: keep partitions under ~100MB / 100k rows. Bucket time-series by day or hour: (sensor_id, yyyymmdd). Otherwise repair, compaction, and read latency all suffer.
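A sketch of that bucketing for sensor readings; the names are hypothetical.

    -- The day bucket in the partition key caps each partition at one device-day.
    CREATE TABLE IF NOT EXISTS readings_by_day (
      sensor_id  bigint,
      yyyymmdd   int,           -- e.g. 20250611
      read_at    timestamp,
      value      double,
      PRIMARY KEY ((sensor_id, yyyymmdd), read_at)
    ) WITH CLUSTERING ORDER BY (read_at DESC);

    -- A time window is read by hitting the handful of buckets it spans.
    SELECT read_at, value
    FROM readings_by_day
    WHERE sensor_id = 42 AND yyyymmdd = 20250611
      AND read_at >= '2025-06-11 06:00:00+0000';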
ALLOW FILTERING means a scan across the cluster. Don't. DELETE writes a tombstone rather than removing data; tombstones linger until compaction, and delete-heavy tables pay for them on every read.
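For contrast, a sketch of the trap, reusing the hypothetical messages table from earlier.

    -- Anti-pattern: no partition key in the WHERE clause, so every replica scans
    -- its data. ALLOW FILTERING silences the error message, not the cost.
    SELECT * FROM messages_by_channel
    WHERE author_id = ?
    ALLOW FILTERING;

    -- The wide-column answer is a second table keyed for that read
    -- (messages_by_author above), not a cluster-wide filter.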