Feature-Flag Service Track · Learning Tracks

How to Use This Track

Learning by Shipping Feature Control Systems

Ground rules

Think like an SDK user. Every decision you make affects thousands of client apps. Prioritize simplicity and backwards compatibility.
Performance matters. Flag evaluation is on the hot path. Caching, pre-computing, and efficient algorithms are non-negotiable.
Embrace real-time distribution. WebSockets, Server-Sent Events, or polling: pick one and build it properly. Flag changes can't take hours to propagate.
Pick a stack and stick with it. Node/TypeScript, Python/FastAPI, or Go are all good choices.
Test the rule engine extensively. Off-by-one errors in targeting can segment your user base incorrectly.

The ten modules

MODULE 01

Foundations & Setup

Repo, schema, design doc.

MODULE 02

Core API & Evaluation

REST endpoints, flag evaluation logic.

MODULE 03

Advanced Targeting & Rules

Segments, operators, rule priority.

MODULE 04

SDK Integration & Clients

SDKs, caching, offline support.

MODULE 05

Real-Time Config Updates

Push or poll, reconnects, latency SLO.

MODULE 06

Testing & Correctness

Unit, integration, property-based, edge cases.

MODULE 07

Observability & Monitoring

Logs, metrics, dashboards.

MODULE 08

Progressive Rollouts & Canaries

Percentage rollouts, schedules, auto-rollback.

MODULE 09

Security & Compliance

Auth, RBAC, audit logging.

MODULE 10

Capstone & Production

Deploy, CI/CD, runbooks, docs.

Module 01 · ~2–3 hrs

Foundations & Project Setup

Start with a clear design. Feature flags look simple; correct rule evaluation is not. Get the model right from the start.

Tasks

Create a Git repo with standard config (.editorconfig, .gitignore, linter/formatter).
Scaffold an HTTP server in your chosen language.
Write a one-page design doc: problem, goals, flag model (simple boolean? rules? segments?), API sketch, evaluation performance targets.
Create database schema: flags (metadata), rules (targeting), segments (user groups).
Document your rule evaluation strategy in the design doc.
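One possible starting point for the schema task, sketched with Python's built-in sqlite3 module so it runs without a database server. Every table and column name here is an assumption made for illustration, not something the track prescribes, and a real deployment would more likely target Postgres.

```python
import sqlite3

# Hypothetical schema: flags hold metadata, rules hold targeting,
# segments hold reusable user-group definitions.
SCHEMA = """
CREATE TABLE flags (
    key         TEXT PRIMARY KEY,
    type        TEXT NOT NULL CHECK (type IN ('boolean', 'string')),
    enabled     INTEGER NOT NULL DEFAULT 0,
    variations  TEXT NOT NULL,           -- JSON array of allowed values
    description TEXT NOT NULL DEFAULT ''
);
CREATE TABLE segments (
    key        TEXT PRIMARY KEY,
    conditions TEXT NOT NULL             -- JSON condition tree
);
CREATE TABLE rules (
    id         INTEGER PRIMARY KEY,
    flag_key   TEXT NOT NULL REFERENCES flags(key) ON DELETE CASCADE,
    priority   INTEGER NOT NULL,         -- lower number is evaluated first
    conditions TEXT NOT NULL,            -- JSON; may reference segment keys
    variation  TEXT NOT NULL,
    UNIQUE (flag_key, priority)
);
"""


def create_schema(conn):
    """Create the three tables on an open connection."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

The UNIQUE (flag_key, priority) constraint is one way to force an explicit answer to the rule-precedence question the design doc has to address.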

Acceptance criteria

git clone + one command spins up the server.
Design doc committed to /docs/design.md with schema sketch.
Linter passes. Pre-commit hook blocks failures.

Deep dives: Documentation & Writing · Problem Decomposition · Schema Design

Module 02 · ~4–6 hrs

Core API & Data Model

The contract: manage flags and evaluate them. Start simple — a flag has a name, enabled state, and variations.

Tasks

Implement POST /flags to create a flag: name, type (boolean/string), enabled, variations, description.
Implement POST /evaluate to evaluate flags: user context (ID, attributes), flag key. Return the value/variation.
Implement GET /flags/:key, PATCH /flags/:key, DELETE /flags/:key.
Validate input: flag names, context structure, variation format.
Document the user context schema with examples (user ID, email, custom attributes).
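The validation task can be sketched as two small functions that return a list of error messages. The field names follow the task list above; the naming rule (lowercase letters, digits, hyphens, at most 64 characters), the two-variation minimum, and the context shape with `id` and `attributes` are assumptions chosen for this example.

```python
import re

# Assumed naming rule for flag names; pick your own and document it.
FLAG_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{0,63}$")


def validate_flag(payload):
    """Return a list of problems with a POST /flags body; empty means valid."""
    errors = []
    name = payload.get("name")
    if not isinstance(name, str) or not FLAG_NAME_RE.match(name):
        errors.append("name must use lowercase letters, digits, or hyphens")
    if payload.get("type") not in ("boolean", "string"):
        errors.append("type must be 'boolean' or 'string'")
    if not isinstance(payload.get("enabled"), bool):
        errors.append("enabled must be true or false")
    variations = payload.get("variations")
    if not isinstance(variations, list) or len(variations) < 2:
        errors.append("variations must list at least two values")
    return errors


def validate_context(context):
    """Return a list of problems with a user context; empty means valid."""
    if not isinstance(context, dict):
        return ["context must be an object"]
    errors = []
    user_id = context.get("id")
    if not isinstance(user_id, str) or not user_id:
        errors.append("context.id must be a non-empty string")
    if not isinstance(context.get("attributes", {}), dict):
        errors.append("context.attributes must be an object")
    return errors


# Example context in the assumed shape: a required ID plus free-form attributes.
example_context = {
    "id": "user-123",
    "attributes": {"email": "ada@example.com", "plan": "pro"},
}
```

Returning every error at once, rather than failing on the first, tends to make the API friendlier to SDK authors.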

Acceptance criteria

curl -X POST /flags creates a flag; GET /flags/:key returns it.
curl -X POST /evaluate with a user context returns a flag value.
Updating a flag is reflected immediately in evaluations.

Deep dives: REST API Design · Relational Databases

Module 03 · ~5–7 hrs

Rule Evaluation Engine

The core: evaluate complex rules (if-then logic, segments, attributes) to decide flag variations.

Tasks

Implement a rule model: conditions (AND, OR), operators (equals, in, contains, regex), actions (target variation).
Implement segment support: define a segment by rules (e.g., "users in US and signed up this month"), use segments in flag rules.
Implement percentage-based targeting: send variation B to X% of users (deterministic by user ID hash).
Evaluate flags deterministically: same user + flag = same result every time (derive percentage buckets from a stable hash of the user ID, never from a random number).
Discuss in design doc: rule precedence, conflict resolution, fall-back behavior.
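A minimal sketch of the rule model above, assuming conditions are nested dictionaries with `all` (AND), `any` (OR), and `segment` keys, and that the lowest priority number wins. The dictionary shape, the key names, and the first-match-wins precedence are all illustrative choices, not the only reasonable ones.

```python
import re

# Operator table: each entry compares the user's attribute to the rule's value.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "in": lambda actual, expected: actual in expected,
    "contains": lambda actual, expected: isinstance(actual, str) and expected in actual,
    "regex": lambda actual, expected: isinstance(actual, str)
    and re.search(expected, actual) is not None,
}


def matches(condition, attributes, segments):
    """Evaluate one condition tree against a user's attributes."""
    if "all" in condition:  # AND
        return all(matches(c, attributes, segments) for c in condition["all"])
    if "any" in condition:  # OR
        return any(matches(c, attributes, segments) for c in condition["any"])
    if "segment" in condition:  # reuse a named segment's condition tree
        return matches(segments[condition["segment"]], attributes, segments)
    actual = attributes.get(condition["attribute"])
    if actual is None:
        return False  # assumed policy: a missing attribute never matches
    return OPERATORS[condition["op"]](actual, condition["value"])


def evaluate(flag, attributes, segments):
    """First matching rule (lowest priority number) wins, else the default."""
    if not flag["enabled"]:
        return flag["off_variation"]
    for rule in sorted(flag["rules"], key=lambda r: r["priority"]):
        if matches(rule["when"], attributes, segments):
            return rule["variation"]
    return flag["default_variation"]


segments = {"us-users": {"attribute": "country", "op": "equals", "value": "US"}}
flag = {
    "enabled": True,
    "off_variation": "control",
    "default_variation": "control",
    "rules": [
        {"priority": 2, "variation": "internal",
         "when": {"attribute": "email", "op": "regex", "value": r"@example\.com$"}},
        {"priority": 1, "variation": "treatment",
         "when": {"all": [{"segment": "us-users"},
                          {"attribute": "plan", "op": "in", "value": ["pro", "team"]}]}},
    ],
}
```

Whatever precedence you choose, write it down: this sketch resolves conflicts purely by priority, and falls back to `default_variation` when nothing matches.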

Acceptance criteria

A flag with rules: create segments, assign variations based on segments + attributes.
Evaluate a flag for user A 100 times → same result every time.
Percentage targeting: evaluate 1,000 distinct users at a 50% rollout → roughly 50% get each variation.
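Deterministic percentage targeting can be sketched by hashing the flag key and user ID into a fixed number of buckets. The choice of SHA-256, the 10,000-bucket granularity, and the function names are assumptions for this example; any stable hash with a reasonably uniform distribution would do.

```python
import hashlib


def bucket(flag_key, user_id, buckets=10_000):
    """Map a (flag, user) pair to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def in_rollout(flag_key, user_id, percentage):
    """True if the user falls inside the first `percentage` percent of buckets."""
    return bucket(flag_key, user_id) < percentage * 100


# Same inputs always give the same answer; across many distinct users the
# share inside a 50% rollout should land close to half, not exactly half.
share = sum(in_rollout("new-checkout", f"user-{i}", 50) for i in range(10_000)) / 10_000
```

Including the flag key in the hash keeps buckets independent across flags, so the same users are not always the first to receive every rollout. Because a user's bucket never changes, raising the percentage only adds users and never removes anyone already included.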

Deep dives: Distributed Systems · Caching

Module 04 · ~5–7 hrs

Client SDK & Integration

SDKs are the interface. Make them simple to use, hard to misuse, and blazingly fast.

Tasks

Build a server-side SDK (or multiple: Node, Python, Go). Initialize with the service URL and API key.
SDK exposes client.evaluate(flag_key, user_context) → returns the evaluated variation.
SDK locally caches the flag configuration in memory.
SDK provides is_enabled(flag_key, user_context) for boolean flags and get_variation(flag_key, user_context) for flags with more than two variations.
Write SDK docs: initialization, API, error handling, performance.
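The shape of such an SDK might look like the sketch below. The `FlagClient` name, the injected `fetch_flags` callable (standing in for an HTTP call made with the service URL and API key), and the simplified flag dictionary with per-user `targets` are all hypothetical; a real SDK would plug in the full rule engine at the marked line.

```python
import threading


class FlagClient:
    """In-memory flag cache with local evaluation (illustrative only)."""

    def __init__(self, fetch_flags):
        # fetch_flags: callable returning {flag_key: flag_dict}. In a real SDK
        # this would request the config from the service using the API key.
        self._fetch_flags = fetch_flags
        self._lock = threading.Lock()
        self._flags = {}
        self.refresh()

    def refresh(self):
        """Reload the cache; on failure keep serving the previous config."""
        try:
            fresh = dict(self._fetch_flags())
        except Exception:
            return False
        with self._lock:
            self._flags = fresh
        return True

    def get_variation(self, flag_key, user_context, default=None):
        with self._lock:
            flag = self._flags.get(flag_key)
        if flag is None:
            return default  # unknown flag: never raise on the hot path
        if not flag["enabled"]:
            return flag["off_variation"]
        # Stand-in for rule evaluation: per-user targets, then the default.
        return flag["targets"].get(user_context["id"], flag["default_variation"])

    def is_enabled(self, flag_key, user_context):
        return self.get_variation(flag_key, user_context, default=False) is True
```

Two choices here are worth documenting in the SDK README: evaluation never touches the network, and a failed refresh leaves the last known config in place rather than clearing it.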

Acceptance criteria

SDK initialization takes < 100ms.
Flag evaluation from memory is < 1ms (local cache).
SDK README with working example, error handling guide.

Deep dives: Caching · SDK Design

Module 05 · ~4–6 hrs

Real-Time Config Updates

SDK caches go stale the moment a flag changes. Push new flag definitions to SDKs in real time so changes go live within seconds.

Tasks

Choose a mechanism: WebSockets (push), polling (pull), or Server-Sent Events.
On PATCH /flags/:key, push the updated flag config to all connected SDKs.
SDK subscribes to updates on initialization; receives flag changes in real time.
Handle network failures: SDK reconnects with exponential backoff if the connection drops.
Document your update latency SLO: e.g., 99% of SDKs see new flags within 5 seconds.
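For the reconnect task, here is a sketch of exponential backoff with a cap and full jitter. The base delay, the 60-second cap, the attempt limit, and the function names are assumptions; `connect` is a hypothetical callable that opens the update stream and raises ConnectionError on failure.

```python
import random
import time


def backoff_delays(base=1.0, cap=60.0, rng=random.random):
    """Yield reconnect delays: exponential growth, capped, with full jitter."""
    attempt = 0
    while True:
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling  # uniform in [0, ceiling) with the default rng
        attempt = min(attempt + 1, 32)  # keep the exponent bounded


def connect_with_retry(connect, max_attempts=8, sleep=time.sleep):
    """Call connect() until it succeeds or max_attempts is exhausted."""
    delays = backoff_delays()
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(next(delays))
```

The jitter matters for the "SDK connection storm" scenario: without it, every SDK that lost its connection at the same moment would retry at the same moment too.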

Acceptance criteria

Update a flag; within 5 seconds, SDK cache is fresh and next evaluation uses the new rules.
Disconnect the SDK from the update stream; it keeps serving the last cached value and picks up changes only once the connection is restored.
You can describe the failure mode if the update stream is down for an hour.

Deep dives: Event-Driven Patterns · Monitoring & Observability

Module 06 · ~5–7 hrs

Testing & Correctness

Rule engines are correctness-critical. Off-by-one errors segment users incorrectly. Test extensively.

Tasks

Unit tests for rule evaluation: all operators (equals, in, contains, regex), segment membership, percentage targeting.
Property-based tests (QuickCheck/Hypothesis): random user contexts + rules should always evaluate consistently.
Integration tests: test SDK + server together. Update flags, verify SDKs see changes.
Edge case tests: empty segments, contradicting rules, malformed attributes, null contexts.
Coverage threshold: 85% for evaluation logic.
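To illustrate the property-based idea without extra dependencies, the sketch below uses the standard library's random module as a stand-in for Hypothesis or QuickCheck. The `in_rollout` function is a hypothetical hash-based implementation included only so the example is self-contained; substitute your own evaluation function.

```python
import hashlib
import random


def in_rollout(flag_key, user_id, percentage):
    """Hypothetical function under test: hash-based percentage targeting."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000 < percentage * 100


def check_rollout_properties(trials=2_000, seed=0):
    """Check invariants over random users and percentages; return trial count."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(trials):
        user_id = f"user-{rng.getrandbits(64):x}"
        low, high = sorted(rng.randint(0, 100) for _ in range(2))
        # Determinism: the same inputs always give the same answer.
        assert in_rollout("f", user_id, low) == in_rollout("f", user_id, low)
        # Monotonicity: raising the percentage never removes a user.
        if in_rollout("f", user_id, low):
            assert in_rollout("f", user_id, high)
        # Boundaries: 0% includes nobody, 100% includes everybody.
        assert not in_rollout("f", user_id, 0)
        assert in_rollout("f", user_id, 100)
    return trials
```

These properties hold for every input, which is what makes them a better fit for percentage targeting than a handful of example-based tests.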

Acceptance criteria

Test suite runs in under 2 minutes.
You can describe a testing strategy for the rule engine (especially percentage targeting).
A malformed rule doesn't crash the service (graceful error handling).

Deep dives: Unit Testing · Integration Testing · Property-Based Testing

Module 07 · ~5–7 hrs

Observability & Debugging

Teams need to understand why a flag evaluated to X. Instrument with logs, metrics, traces, and evaluation audit logs.

Tasks

Structured logs for every evaluation: flag key, user ID, rules matched, variation returned, timestamp.
Metrics: evaluation count by flag, error rate, cache hit ratio, update latency.
Distributed tracing: instrument the full path (SDK request → server evaluation → cache lookup).
Build a Grafana dashboard: flag evaluation rate, error rate, top flags by volume.
Expose GET /flags/:key/audit to show evaluation results for a user (debugging tool).
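A structured evaluation log can be as simple as one JSON object per line. The sketch below uses the standard logging and json modules; the logger name, the `event` field, and the function names are assumptions, while the remaining fields follow the task list above.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("flags.evaluation")  # hypothetical logger name


def evaluation_record(flag_key, user_id, matched_rules, variation, now=None):
    """Build the log record for a single flag evaluation."""
    now = now or datetime.now(timezone.utc)
    return {
        "event": "flag_evaluated",
        "flag_key": flag_key,
        "user_id": user_id,
        "rules_matched": list(matched_rules),
        "variation": variation,
        "timestamp": now.isoformat(),
    }


def log_evaluation(flag_key, user_id, matched_rules, variation):
    """Emit the record as one JSON line and return it."""
    record = evaluation_record(flag_key, user_id, matched_rules, variation)
    logger.info(json.dumps(record, sort_keys=True))
    return record
```

Recording which rules matched is what lets an audit endpoint explain why a user received a given variation, rather than only reporting what they received.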

Acceptance criteria

You can query logs for all evaluations of flag X in the last hour.
Dashboard shows which flags are being used most; error rates per flag.
GET /flags/my-flag/audit?user_id=123 shows why the user got variation A.

Deep dives: Logging · Metrics · Tracing

Module 08 · ~5–7 hrs

Progressive Rollouts & Canaries

Roll out features to 1%, 10%, 100% without code changes. Schedule rollouts, monitor metrics, auto-rollback if needed.

Tasks

Add rollout_percentage field to flags: 0–100% rollout to users.
Implement scheduled rollouts: flag can transition 1% → 10% → 50% → 100% on a schedule.
Implement guardrails: if error rate (from metrics) exceeds threshold, auto-rollback to 0%.
Expose POST /flags/:key/rollout to manually adjust rollout % with audit logging.
Document your rollout strategy: when to use % vs boolean, how to monitor during rollout.
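Scheduled rollouts and the error-rate guardrail reduce to two small pure functions, sketched here. The schedule format (a list of start time and percentage pairs), the 5% default threshold, and the function names are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone


def scheduled_percentage(schedule, now):
    """Return the percentage of the latest step that has started, else 0.

    schedule: list of (start_time, percentage) pairs in any order.
    """
    current = 0
    for start, percentage in sorted(schedule):
        if start <= now:
            current = percentage
    return current


def apply_guardrail(percentage, error_rate, threshold=0.05):
    """Auto-rollback: drop to 0% when the error rate exceeds the threshold."""
    return 0 if error_rate > threshold else percentage


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
schedule = [(start, 1), (start + timedelta(hours=1), 50)]
```

Keeping both functions free of I/O makes the "verify it happens on schedule" criterion testable by passing in a fake clock instead of waiting an hour.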

Acceptance criteria

Create a flag with a 50% rollout; verify that roughly 50% of a large sample of users see the new variation (hash-based bucketing will not split a finite sample exactly in half).
Schedule a rollout: 1% at time T, then 50% at time T+1h; verify it happens on schedule.
Manually adjust rollout %; audit log shows who changed it and when.

Deep dives: Deployment Strategies · Metrics & Monitoring

Module 09 · ~4–6 hrs

Security & Compliance

Teams rely on flags to ship safely. Secure the service: access control, audit logs, secrets, sensitive data handling.

Tasks

Implement role-based access control: admin (full), editor (create/update flags), viewer (read-only).
Add audit logging: log every flag change, rollout adjustment, user login. Include who, what, when.
Move secrets (API keys, signing keys) to a secrets manager (Vault, AWS Secrets Manager).
Never log sensitive user attributes in evaluation audit logs.
Write a simple threat model: flag injection (customer sees wrong flag)? Unauthorized rollouts?
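The three roles above map naturally to a permission table, and the logging rule to a redaction helper. In this sketch the permission strings, the list of sensitive attribute names, and the function names are all assumptions; adapt them to your own attribute schema.

```python
# Hypothetical permission names; the roles come from the task list above.
ROLE_PERMISSIONS = {
    "admin": {"flag:read", "flag:write", "flag:delete", "user:manage"},
    "editor": {"flag:read", "flag:write"},
    "viewer": {"flag:read"},
}


def is_allowed(role, permission):
    """Unknown roles get no permissions (deny by default)."""
    return permission in ROLE_PERMISSIONS.get(role, set())


# Assumed set of attribute names that must never reach the audit log.
SENSITIVE_ATTRIBUTES = {"email", "ip_address", "phone"}


def redact(attributes):
    """Return a copy of the attributes that is safe to write to logs."""
    return {
        key: "[REDACTED]" if key in SENSITIVE_ATTRIBUTES else value
        for key, value in attributes.items()
    }
```

An allow-list of loggable attributes is stricter than this deny-list and may be the safer default, since new custom attributes then stay out of logs until someone opts them in.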

Acceptance criteria

A read-only user cannot create or modify flags.
Audit log shows every flag change: timestamp, user, old value, new value.
Threat model committed; key risks documented.

Deep dives: Authentication · Authorization · Threat Modeling

Module 10 · ~6–8 hrs

Capstone & Production Readiness

Ship it. Deploy, document SDKs, run operational drills, go live.

Tasks

Write a multi-stage Dockerfile. Use docker-compose.yml for local dev (app + Postgres).
GitHub Actions: lint → test → build → push → deploy.
Deploy to production: Fly.io, Render, Railway, or a VM. Set up health checks and graceful rollback.
Write two runbooks: "a flag isn't updating" and "SDK connection storm."
Publish SDK documentation: GitHub README, code examples, changelog.
Write a migration guide for teams adopting the service from competitors.

Acceptance criteria

Every PR gets a preview URL; main deploys to production automatically.
A bad deploy auto-rolls back.
Runbooks + SDK docs committed; linked from README.

Deep dives: CI/CD · Containers · Operational Excellence

After the Track

Where to Go Next

Stretch goals on the same project: A/B testing framework, experiment analytics, custom attribute types, flag templates, third-party integrations.
Read about systems design. Study large-scale feature-flag deployments; sketch how you'd serve 100M evaluations/sec.
Try the Webhook Delivery track to practice event-driven systems at scale.
Write about it. One blog post per module, especially on rule evaluation correctness and percentage targeting.
