Tables, rows, and a strict schema. ACID transactions, foreign keys, and a query language older than most engineers using it. The default for systems where the data has a shape and the truth has to be one truth.
← Back to Database SideThe modern default. Standards-compliant, extensible (PostGIS, pgvector), excellent JSONB. Use it unless you have a specific reason not to.
Ubiquitous, fast, simple. Behind much of the LAMP-era web. Watch defaults — older versions had loose ones.
Microsoft's enterprise stack. Tight Azure / .NET integration, strong tooling, T-SQL dialect.
Long-standing enterprise heavyweight. Powerful, expensive, and entrenched in finance & telco.
An entire database in a single file linked into your app. Phones, browsers, edge — and increasingly, small servers.
Distributed SQL — relational semantics across regions. Pay for it in latency.
Money never disappears, an order has a customer, a seat isn't sold twice. Foreign keys, CHECK constraints, unique indexes, and serializable transactions enforce these in the database — not hopefully in the app. If two services touch the same data, that enforcement is the only one that holds.
SQL lets you ask new questions of old data: "revenue by cohort by month, but only for users who churned." No re-modeling, no rebuild — write the query. NoSQL stores typically answer fast only the questions you designed for; SQL answers most questions slowly and the right ones quickly (with the right index).
Orders → customers → addresses → countries. Done in one round-trip with a JOIN. The alternative — fetch, then fetch, then fetch — is the N+1 problem in disguise, and what you end up reinventing in document databases.
Every analyst, BI tool, and dashboard speaks SQL. Choosing relational means your data is queryable by people who don't write code — a multiplier most teams underestimate.
3NF is usually the goal. Each non-key column depends on the key, the whole key, and nothing but the key. Higher forms (BCNF, 4NF) matter rarely; lower forms cause update anomalies daily.
Denormalize on purpose — for read-heavy paths, or to avoid joins on enormous tables. Document the duplication so the next person doesn't trust it as authoritative.
Postgres JSONB, MySQL JSON, SQL Server JSON: store semi-structured data in a relational column when the shape is genuinely variable (settings blobs, third-party payloads). Index specific paths with expression indexes. Don't use it as an excuse to skip schema design — once you're querying inside the JSON, you're worse off than if you'd just made columns.
Three options: single table with nullable columns (simple, sparse), table per type (clean, joins to read), shared table + type-specific tables (most flexible, most code). Pick by query patterns — most apps overengineer this.
(tenant_id, created_at) ≠ (created_at, tenant_id).WHERE lower(email) = … defeats a normal index — use a functional index or store normalized.Sequential scan on a million-row table is the warning sign. Nested loop with millions of iterations is another. Most slow queries have one of three causes: missing index, bad statistics, or a query that fundamentally needs to read too much data.
Postgres wants fewer connections than developers expect — usually 2–4× CPU cores on the DB server. Put pgBouncer or RDS Proxy in front so thousands of app connections share a small real pool. App-side pool size should account for instance count: app_pool × instances ≤ db_max.
Read replicas scale reads cheaply; mind replication lag — don't read your own write from a replica. Sharding (split by key) is the last resort: it gives up cross-shard joins and transactions. Consider distributed SQL (Spanner, CockroachDB) before sharding manually.
CREATE INDEX CONCURRENTLY; add columns without defaults, then backfill in batches.WHERE deleted_at IS NULL forever. Use them deliberately.timestamptz in UTC; convert at the edge.