Test Types Deep Dive · 7 of 8

Regression Testing — Don't Re-Break What You Fixed

Regression testing is the practice — not so much a separate kind of test as a discipline — of verifying that previously working behavior still works. Every bug fix becomes a permanent test; every refactor is checked against the suite. Snapshot tests, visual regression checks, and golden-file tests are all tools in this category. The goal is simple: don't ship the same bug twice.

Quick Facts

What Regression Testing Is

Basic Concepts

  • It's a property of the suite, not a separate test type. Unit, integration, and E2E tests are regression tests when they catch reintroduced bugs.
  • The rule: every bug gets a test. The bug fix is "make this failing test pass."
  • Snapshot/golden-file tests are a regression-focused style — capture the current output, fail on any diff. Useful and overused in roughly equal measure.
  • Visual regression is the same idea for UI: capture a screenshot, fail on pixel diff.
  • The suite grows with the codebase. A 5-year-old project has thousands of tests, most of them backstops for past bugs. That's healthy.
The Discipline

Every Bug Becomes a Test

The Workflow

Reproduce the bug as a failing test, in the lowest tier where you can. Make the test pass with a code change. Merge both — the test stays in the suite forever, ensuring the bug can't return.

Why it works: a bug that escaped to production was, by definition, in a code path no test covered. Adding the test covers that path. The next refactor that would have broken it again now fails in CI instead of in production.

Reference the Issue

Name the test or comment it with the bug ID — // Regression for PROD-1234: total miscalculated when discount is exactly 100%. Six months later, when someone wonders why a particular edge case is asserted, the comment answers them in one line.
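A minimal sketch of the workflow above, in Python. The `apply_discount` function and the PROD-1234 ID are hypothetical, extending the comment example; the point is that the test name and comment carry the bug's history.

```python
# Regression for PROD-1234: total miscalculated when discount is exactly 100%.
# apply_discount and the bug ID are illustrative, not from a real codebase.

def apply_discount(total: float, percent: float) -> float:
    """Return the total after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("discount must be between 0 and 100")
    return total * (1 - percent / 100)

def test_prod_1234_full_discount_yields_zero():
    # The original bug: a 100% discount produced a nonzero (negative) total.
    # This test pins the fixed behavior; it stays in the suite forever.
    assert apply_discount(49.99, 100) == 0.0

test_prod_1234_full_discount_yields_zero()
```

The test was written first, failed against the buggy code, and passed after the fix; both landed in the same merge.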

Push the Test Down the Pyramid

A bug found in production typically surfaces as an end-to-end failure, but the fix and its test should usually live at the lowest tier that can reproduce it — the unit or integration level — not as another slow E2E. Otherwise the suite ages into the "ice cream cone" anti-pattern.

Specific Tools

Snapshots, Goldens, and Visual Diff

Snapshot Tests

The first run records the output. Subsequent runs compare against it. A diff fails the test until someone reviews and updates the snapshot. Jest popularized this for React component output; the pattern is now supported across many ecosystems.

The trap: "the test fails, run --update" becomes muscle memory. Snapshots get blindly accepted, the regression-protection value approaches zero. Use them sparingly — for output that's complex, hard to assert by hand, and rarely changes — and review snapshot diffs in PRs as carefully as code.
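The record-then-compare mechanism can be sketched framework-free in a few lines of Python. The snapshot directory name and `update` flag are illustrative stand-ins for what tools like Jest manage for you.

```python
# Minimal snapshot mechanism: first run records, later runs fail on any diff.
import json
import os

SNAPSHOT_DIR = "__snapshots__"  # illustrative location

def match_snapshot(name: str, value, update: bool = False) -> bool:
    """Record the value on first run; on later runs, pass only if it matches."""
    os.makedirs(SNAPSHOT_DIR, exist_ok=True)
    path = os.path.join(SNAPSHOT_DIR, name + ".json")
    serialized = json.dumps(value, indent=2, sort_keys=True)
    if update or not os.path.exists(path):
        with open(path, "w") as f:
            f.write(serialized)  # record (or deliberately update) the baseline
        return True
    with open(path) as f:
        return f.read() == serialized  # any diff fails the test
```

The `update=True` path is exactly the `--update` muscle-memory trap from above: one flag turns a failing diff into a new baseline, which is why snapshot diffs deserve real review.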

Golden / Approval Tests

Same idea, more deliberate: the "approved" output lives in a separate file checked into source control. Tools: ApprovalTests (most languages), insta (Rust). Used for compiler output, generated reports, query results — anywhere you have a complex correct answer that's expensive to assert by hand.
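The approval workflow can be sketched with the received/approved file convention that ApprovalTests-style tools use; the filenames here are illustrative. On a mismatch, the new output is written to disk so a human can diff it and, if correct, promote it to the approved file.

```python
# Approval-test sketch: compare against the approved file; on mismatch,
# leave a .received file on disk for human review.
import os

def verify(name: str, received: str) -> bool:
    approved_path = f"{name}.approved.txt"   # checked into source control
    received_path = f"{name}.received.txt"   # transient, for review
    approved = ""
    if os.path.exists(approved_path):
        with open(approved_path) as f:
            approved = f.read()
    if received == approved:
        return True
    with open(received_path, "w") as f:
        f.write(received)  # approving = diffing this and renaming it
    return False
```

The deliberate step — a person moves `.received` to `.approved` — is what separates this from a blind snapshot update.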

Visual Regression Testing

Capture a screenshot of a rendered UI; compare with the baseline. Catches CSS breakages, layout shifts, accessibility regressions that no functional test would notice. Tools: Chromatic, Percy, Applitools, Playwright's built-in screenshot diffing, BackstopJS.

Caveats: font-rendering differences across OSes cause spurious failures — run from a fixed environment, such as a CI container with consistent fonts. Animations, dynamic content, and dates need to be masked or frozen.
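The core comparison is just a pixel diff with a tolerance, which the tools above wrap with rendering, masking, and reporting. A toy version, assuming images as flat lists of `(r, g, b)` tuples rather than real image files:

```python
# Toy visual-diff core: fail when too many pixels differ from the baseline.

def pixel_diff_ratio(baseline, candidate):
    """Fraction of pixels that differ between two same-size images."""
    if len(baseline) != len(candidate):
        raise ValueError("images must have the same dimensions")
    differing = sum(1 for a, b in zip(baseline, candidate) if a != b)
    return differing / len(baseline)

def screenshots_match(baseline, candidate, threshold=0.001):
    """Pass while the differing-pixel ratio stays at or under the threshold."""
    return pixel_diff_ratio(baseline, candidate) <= threshold
```

The threshold is the knob that absorbs benign rendering noise; real tools add smarter perceptual comparisons and per-region masks on top of this idea.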

Property-Based Testing as Regression Insurance

Tools like QuickCheck, Hypothesis, and fast-check generate hundreds of inputs and search for failing ones. When they find a counter-example, they shrink it to a minimal case and store it as a regression case. The bug stays in the suite: even if the property is later relaxed, the stored failures still guard against reintroducing the old bugs.
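A toy version of generate-then-shrink, stdlib only — real tools do vastly more (typed generators, smarter shrinking, failure databases), but the shape is this:

```python
# Toy property-based checker: random int lists, greedy element-removal shrinking.
import random

def shrink(prop, case):
    """Greedily drop elements while the property still fails."""
    changed = True
    while changed:
        changed = False
        for i in range(len(case)):
            smaller = case[:i] + case[i + 1:]
            if not prop(smaller):
                case, changed = smaller, True
                break
    return case

def check_property(prop, runs=200, seed=0):
    """Search random inputs for a counter-example; return it shrunk, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        case = [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]
        if not prop(case):
            return shrink(prop, case)  # minimal failing input: keep it as a regression case
    return None
```

Checking a deliberately false property like "every list is already sorted" yields a minimal two-element counter-example, which is exactly what a real tool would record and replay on every future run.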

Where It Goes Wrong

Common Mistakes

  • "Regression run" as a separate, manual cycle. If running the regression suite is a quarterly event by a QA team, you're inheriting the failures of decades-old waterfall practice. Regression should run on every PR.
  • Tests that lock down implementation, not behavior. A regression test that fails on a refactor that didn't change behavior is just friction. Test behavior; let the implementation change.
  • Snapshot tsunami. Hundreds of snapshots get updated in one PR review nobody reads. The test stops protecting anything. Cap snapshot use; review diffs.
  • No quarantine path for flakes. A regression suite with a 5% flake rate teaches developers to ignore failures. Quarantine flaky tests visibly, then investigate and fix them; don't silently disable them.
  • Forgetting the old bug after fixing. If you fix a bug without adding the test, you've learned nothing as a system. Make the test mandatory in the bug-fix definition of done.
In Practice

What Healthy Looks Like

More often than not, the suite catches regressions before they ship. Most bug fixes land with a test. The suite runs on every PR, quickly enough that nobody resents it. Old tests are refreshed when they go stale and deleted when the behavior they protect is no longer relevant — but not before. The suite size grows roughly linearly with the codebase, not exponentially with snapshots.

Pair with the rest: unit for the cheapest regressions, integration for wiring regressions, E2E for the production-only ones, smoke for the deploy-time ones.
