AI Workflow · 2 of 6

Code Review

AI as a second pair of eyes on every pull request: bug spotting, security flags, style nits, and explanations that help juniors ship safely. Best as a complement to human review, not a replacement.

Pull Requests · Static Analysis · Security · Style · Workflow 2
Quick Facts

Basic Concepts

  • Two flavors: AI inside the IDE (review-as-you-write) and AI on the PR (review-on-push).
  • Best at: consistency, explanation, easy-to-miss bugs, security smells, accessibility.
  • Weakest at: business-logic correctness, where it lacks the context to judge.
  • Noise is the killer: too many low-value comments and developers stop reading.
  • Augment, don't replace human review — the human still owns the merge decision.
Landscape

The Major Tools

Tool                                 | Where it runs           | Notes
GitHub Copilot Code Review           | GitHub PRs              | Native to GitHub; comments inline.
CodeRabbit                           | GitHub / GitLab / Azure | Popular standalone reviewer; configurable depth.
Greptile / Codium PR-Agent           | GitHub / GitLab         | Repo-aware; uses embeddings of the codebase.
Sourcery / Cursor BugBot             | IDE / PR                | Lighter-weight, focused on patterns.
Ellipsis                             | GitHub                  | AI reviewer + custom rules.
Snyk Code / GitHub Advanced Security | PR + CI                 | Security-focused (SAST) with AI explanations.
SonarQube + AI CodeFix               | CI / IDE                | Established static analyzer adding AI suggestions.
Custom (Claude Code / Cursor)        | Local / CI script       | Roll your own reviewer with prompts + the diff.
Mechanics

What AI Reviewers Catch

High-Value Catches
  • Null / undefined access the type checker missed.
  • Off-by-one errors in loops & ranges.
  • Race conditions in async code (a frequent catch; see the sketch after this list).
  • SQL injection, XSS, SSRF patterns.
  • Hard-coded secrets & credentials.
  • Missing error handling on async / I/O.
  • Inconsistent naming / style vs surrounding code.
  • Accessibility (a11y) — missing alt text, label-for, focus traps.
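
To make the race-condition item concrete, here is a minimal, self-contained sketch (the bank-balance scenario and names are invented for illustration): a check-then-act sequence split across an await point lets two tasks both pass the check before either acts.

# hypothetical example: check-then-act race across an await point
import asyncio

balance = 100

async def withdraw(amount: int) -> bool:
    global balance
    if balance >= amount:         # check passes for both tasks...
        await asyncio.sleep(0)    # ...because each task suspends here before acting
        balance -= amount
        return True
    return False

async def main():
    results = await asyncio.gather(withdraw(80), withdraw(80))
    print(results, balance)       # can print [True, True] -60

asyncio.run(main())
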
Lower-Value (Often Noise)
  • Style nits a formatter would handle.
  • "Consider extracting a function" on perfectly clear 8-line blocks.
  • Generic "add tests" without specifying which.
  • "Add comments" on self-explanatory code.
  • Performance speculation without measurement.

Tune the reviewer to suppress these — most tools have severity / category filters.

Where It Genuinely Falls Short
  • Business-logic correctness — does this discount rule match policy?
  • Cross-service contracts — the diff doesn't show the consumer.
  • Architecture & long-term maintainability.
  • Performance under real load.
Practice

Getting It to Work

Tune Aggressiveness

Default settings tend to be noisy. Configure (a filtering sketch follows this list):

  • Severity threshold — show only "warning" and above.
  • Categories — disable "style" if you have a formatter.
  • Path filters — skip generated code, vendor, fixtures.
  • Custom rules / prompts — encode your team's conventions.
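
Most hosted tools expose these filters in their settings; if you roll your own reviewer (see the DIY section below), the same tuning is a few lines of Python. A minimal sketch, assuming an invented comment schema; the paths, severity scale, and field names are illustrative, not any tool's real config:

# hypothetical post-filter; schema and names are invented for illustration
import fnmatch

SKIP_PATHS = ["vendor/*", "*_generated.py", "tests/fixtures/*"]
SEVERITY = {"nit": 0, "style": 1, "warning": 2, "error": 3}

def keep(comment: dict) -> bool:
    """Keep only warning-or-above comments on non-excluded paths."""
    if SEVERITY[comment["severity"]] < SEVERITY["warning"]:
        return False
    return not any(fnmatch.fnmatch(comment["path"], p) for p in SKIP_PATHS)

comments = [
    {"path": "src/db.py", "severity": "error", "body": "Possible SQL injection via f-string."},
    {"path": "vendor/lib.py", "severity": "error", "body": "Unused import."},
    {"path": "src/db.py", "severity": "style", "body": "Rename variable x."},
]
print([c["body"] for c in comments if keep(c)])  # only the SQL-injection comment survives
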
The Two-Reviewer Workflow
  1. AI reviewer runs on PR open — author addresses obvious issues before pinging humans.
  2. Human reviewer focuses on intent, architecture, business rules.
  3. AI re-runs after each push.

Net result: humans spend less time on nits, more on the things only humans catch.

Custom CI Reviewer (DIY)

If off-the-shelf doesn't fit, a 50-line GitHub Action can do it: send the diff plus a custom prompt to Claude / GPT and post structured review comments back via the API. Total control over rules, model, and cost. The sketch below assumes the Anthropic Python SDK and the gh CLI; the prompt-file path and env var names are illustrative.

# runnable sketch: Anthropic Python SDK + GitHub CLI, run inside CI
import os, subprocess
import anthropic

OUR_TEAM_REVIEW_GUIDELINES = open(".github/review-prompt.md").read()  # hypothetical path
diff = subprocess.run(["git", "diff", "main...HEAD"], capture_output=True, text=True).stdout
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
review = client.messages.create(
    model="claude-sonnet-4.6",
    max_tokens=2048,
    system=OUR_TEAM_REVIEW_GUIDELINES,
    messages=[{"role": "user", "content": diff}],
)
subprocess.run(["gh", "pr", "comment", os.environ["PR_NUMBER"], "--body", review.content[0].text])
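
On GitHub-hosted runners the gh CLI is preinstalled; the workflow just needs PR_NUMBER, GH_TOKEN, and ANTHROPIC_API_KEY in its environment. For inline comments rather than one summary comment, swap gh pr comment for GitHub's pull-request reviews REST endpoint.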
Anti-patterns
  • Auto-merging on AI approval — never. Humans must own merges to production.
  • Treating AI comments as required — they're suggestions, not blockers.
  • Reviewing 5,000-line PRs with AI alone — split the PR, then review.