Records are JSON (or BSON) — nested, variable, self-describing. No fixed schema, no joins by design. Each document holds the whole entity the application reads, so one read returns one object.
MongoDB: the default document database. Rich query language, aggregation pipeline, multi-document transactions since 4.0. Atlas runs it managed.
Couchbase: document plus key-value, with SQL-like N1QL queries. Strong on caching workloads thanks to its memcached lineage.
Firestore: Google's serverless document store with real-time listeners, built for mobile and web clients that want live data sync.
DynamoDB: AWS's managed key-value and document store. Predictable single-digit-millisecond latency, but you must design around your access patterns up front.
Cosmos DB: Azure's multi-model database. It offers a MongoDB API, a SQL API, and others, so you pick the wire protocol that fits.
Niche choices: RavenDB for .NET-first shops; ArangoDB if you want documents and graph in one engine.
A product with variants, options, images, and translated descriptions. A blog post with embedded comments and tags. A medical record with sections and history. Stuffing these into eight relational tables and joining them back together on every read is work the application doesn't get paid for. One document, one read.
Early-stage products, integrations with third parties whose payloads keep evolving, multi-tenant systems where each tenant has its own custom fields. Adding a field is just writing it; old documents simply don't have it. Migration is a backfill script, not an ALTER TABLE on a 200M-row table.
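A minimal sketch of what "adding a field is just writing it" looks like in practice. The in-memory `products` list, the `currency` field, and both function names are hypothetical stand-ins; a real store would run the same logic through its own query and update calls.

```python
from typing import Any

# Hypothetical in-memory stand-in for a collection of documents.
products: list[dict[str, Any]] = [
    {"_id": 1, "name": "Mug"},                     # written before `currency` existed
    {"_id": 2, "name": "Tee", "currency": "EUR"},  # written after the field was added
]


def read_currency(doc: dict[str, Any], default: str = "USD") -> str:
    """Readers tolerate old documents by supplying a default for the missing field."""
    return doc.get("currency", default)


def backfill_currency(docs: list[dict[str, Any]], default: str = "USD") -> int:
    """Write the default into documents that lack the field; return how many changed."""
    changed = 0
    for doc in docs:
        if "currency" not in doc:
            doc["currency"] = default
            changed += 1
    return changed
```

Readers can ship first with the default, and the backfill can run later in batches. Running it twice is harmless because it only touches documents that still lack the field.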
If 95% of your reads are "give me this user's profile" or "give me this order with everything in it," documents are exactly that — one fetch, no assembly. Where they struggle is the 5% that wants to slice across documents (revenue by region, top customers by month).
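The difference between the two kinds of read can be sketched with plain dictionaries. The `orders` data and both function names are made up for illustration; the point is only that the first read touches one document while the second has to visit all of them.

```python
from collections import defaultdict

# Hypothetical order documents.
orders = [
    {"_id": "o1", "region": "EU", "total": 120.0},
    {"_id": "o2", "region": "US", "total": 80.0},
    {"_id": "o3", "region": "EU", "total": 30.0},
]
by_id = {order["_id"]: order for order in orders}


def get_order(order_id: str) -> dict:
    """The common case: one keyed fetch returns the whole entity."""
    return by_id[order_id]


def revenue_by_region(docs: list[dict]) -> dict[str, float]:
    """The awkward case: a slice across documents has to visit every one of them."""
    totals: dict[str, float] = defaultdict(float)
    for doc in docs:
        totals[doc["region"]] += doc["total"]
    return dict(totals)
```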
Most document stores were built for sharding from day one. Globally distributed Cosmos DB, Firestore multi-region, DynamoDB Global Tables — the data follows the user. Distributed SQL can do this too, but it's been retrofitted; document stores assume it.
Embed when the child belongs to one parent and is read with it (order line items, blog comments on a small post). Reference when the child is shared (a product referenced by many orders) or unbounded (every comment on a Reddit thread). MongoDB's 16 MB limit on a single document is a real ceiling, so model around it.
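As a rough illustration of the rule above, the sketch below embeds line items in an order and references shared products by id. All names and data are hypothetical, and the size check uses JSON length as a heuristic only, since BSON encodes documents differently.

```python
import json

MAX_DOC_BYTES = 16 * 1024 * 1024  # the 16 MB MongoDB document cap mentioned above

# Embedded: line items belong to exactly one order and are read with it.
order = {
    "_id": "order-1",
    "customer_id": "cust-9",
    "items": [
        {"product_id": "prod-1", "qty": 2, "unit_price": 12.5},
        {"product_id": "prod-2", "qty": 1, "unit_price": 40.0},
    ],
}

# Referenced: a product is shared by many orders, so it lives in its own document.
products = {
    "prod-1": {"_id": "prod-1", "name": "Mug"},
    "prod-2": {"_id": "prod-2", "name": "Kettle"},
}


def order_total(doc: dict) -> float:
    """Needs only the order document: everything required is embedded."""
    return sum(item["qty"] * item["unit_price"] for item in doc["items"])


def item_names(doc: dict, product_lookup: dict) -> list[str]:
    """Resolving a reference costs an extra lookup per referenced document."""
    return [product_lookup[item["product_id"]]["name"] for item in doc["items"]]


def approx_size_bytes(doc: dict) -> int:
    """Rough size estimate via JSON; treat it as a warning signal, not an exact BSON size."""
    return len(json.dumps(doc).encode("utf-8"))
```

A check like `approx_size_bytes(order) < MAX_DOC_BYTES` is most useful as an early alarm on arrays that grow without bound, which is the usual way a document drifts toward the cap.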
In SQL you model the data and the query optimizer figures out the access path. In document stores you model the read. If two queries want different shapes of the same data, store both shapes; the duplication is the point. Keep the copies consistent with change streams or a transactional outbox.
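One way to picture "store both shapes" is a handler that applies each change event to two read models. This is a sketch under assumptions: the event format, `apply_order_event`, and the two dictionaries are invented here, and a real system would receive events from a change stream or an outbox table instead of direct calls.

```python
from collections import defaultdict

# Shape 1: full orders keyed by id (serves "show me this order").
orders_by_id: dict[str, dict] = {}
# Shape 2: compact per-customer history (serves "show me this customer's orders").
orders_by_customer: dict[str, list[dict]] = defaultdict(list)


def apply_order_event(event: dict) -> None:
    """Hypothetical change-event handler that updates both read shapes."""
    order = event["order"]
    orders_by_id[order["_id"]] = order

    summary = {"order_id": order["_id"], "total": order["total"]}
    history = orders_by_customer[order["customer_id"]]
    # Drop any earlier summary for this order so replayed events stay idempotent.
    history[:] = [s for s in history if s["order_id"] != order["_id"]]
    history.append(summary)
```

Making the handler idempotent matters because delivery from a stream or outbox is commonly at-least-once, so the same event can arrive more than once.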
Mongo's $jsonSchema validator, Firestore security rules, DynamoDB condition expressions. Schema-flexible doesn't mean schema-absent; it means the schema lives where you choose. Validate at the database boundary, not just in the app, so batch jobs and migrations can't write garbage.
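For a sense of what a `$jsonSchema` rule looks like, here is a small validator document written as a Python dict, with a toy checker beside it. The field names are hypothetical, and `violations` is a deliberately simplified stand-in that covers only `required`, `bsonType`, and `minimum`; in MongoDB the server enforces the validator itself once it is attached to the collection.

```python
# A validator document in MongoDB's $jsonSchema form (field names are hypothetical).
order_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["customer_id", "total"],
        "properties": {
            "customer_id": {"bsonType": "string"},
            "total": {"bsonType": "double", "minimum": 0},
        },
    }
}

# Toy mapping from the BSON type names used above to Python types.
_PY_TYPES = {"string": str, "double": float, "object": dict}


def violations(doc: dict, validator: dict) -> list[str]:
    """Simplified local check; not a full $jsonSchema implementation."""
    schema = validator["$jsonSchema"]
    problems = [f"missing {name}" for name in schema["required"] if name not in doc]
    for name, rule in schema["properties"].items():
        if name not in doc:
            continue
        value = doc[name]
        if not isinstance(value, _PY_TYPES[rule["bsonType"]]):
            problems.append(f"{name} is not {rule['bsonType']}")
        elif "minimum" in rule and value < rule["minimum"]:
            problems.append(f"{name} below minimum")
    return problems
```

A local checker like this is handy in tests, but the guarantee the paragraph asks for comes only from the rule living in the database, where every writer passes through it.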
Pick a key with high cardinality and even access. tenant_id works only if tenants are roughly the same size — one whale tenant becomes a hot shard. Composite keys (tenant_id + user_id) usually distribute better. Once chosen, the key is hard to change.
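The whale-tenant effect is easy to simulate. The sketch below assumes simple hash-based placement across four shards and an invented workload; real systems use their own partitioning schemes, so read the numbers as an illustration of skew, not a prediction.

```python
import hashlib
from collections import Counter
from typing import Iterable


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard using a stable hash (a simplified placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(keys: Iterable[str], shard_count: int) -> Counter:
    """Count how many records land on each shard."""
    return Counter(shard_for(key, shard_count) for key in keys)


# Hypothetical workload: one whale tenant with 9,000 users, nine small tenants with 100 each.
records = [("whale", f"u{i}") for i in range(9000)]
records += [(f"small-{t}", f"u{i}") for t in range(9) for i in range(100)]

# Sharding on tenant_id alone: every whale record hashes to the same shard.
by_tenant = distribution((tenant for tenant, _ in records), 4)

# Composite key (tenant_id + user_id): the whale's records spread across shards.
by_composite = distribution((f"{tenant}:{user}" for tenant, user in records), 4)
```

Comparing `max(by_tenant.values())` with `max(by_composite.values())` shows the hot shard directly. The trade-off is that a query for one whole tenant now fans out to several shards instead of hitting one.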