One tenant's misbehaving webhook shouldn't tank the system. Add isolation, rate limits per customer, and resource fairness. By the end, one customer's failures won't starve others.
quotas table: max events/sec, max concurrent deliveries, max DLQ size per tenant.
POST /events rejects with 429 if the tenant's quota is exceeded.
GET /quotas/:tenant_id shows usage and limits.
Update src/db/migrations.ts:
ALTER TABLE webhooks ADD COLUMN tenant_id UUID NOT NULL DEFAULT gen_random_uuid();
ALTER TABLE events ADD COLUMN tenant_id UUID NOT NULL DEFAULT gen_random_uuid();

CREATE TABLE IF NOT EXISTS quotas (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL UNIQUE,
  max_events_per_sec INT DEFAULT 100,
  max_concurrent_deliveries INT DEFAULT 50,
  max_dlq_size INT DEFAULT 100,
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW()
);
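The max_dlq_size column only limits anything if the code that writes to the dead letter queue checks it first. The following is a minimal in-memory sketch of that check, not the module's implementation: the TenantDlq class, its method names, and the DeadLetter shape are hypothetical, and it assumes a reject-when-full policy (the module does not say whether a full DLQ should reject new entries or evict old ones).

```typescript
// Hypothetical shape of a dead-lettered delivery (assumption, not from the module)
type DeadLetter = { eventId: string; reason: string };

// In-memory stand-in for a per-tenant dead letter queue with a size cap
class TenantDlq {
  // tenantId -> that tenant's dead letters
  private entries = new Map<string, DeadLetter[]>();

  // Returns false when the tenant's DLQ is already at its cap (reject-when-full)
  push(tenantId: string, maxDlqSize: number, letter: DeadLetter): boolean {
    const queue = this.entries.get(tenantId) ?? [];
    if (queue.length >= maxDlqSize) {
      return false;
    }
    queue.push(letter);
    this.entries.set(tenantId, queue);
    return true;
  }

  size(tenantId: string): number {
    return this.entries.get(tenantId)?.length ?? 0;
  }
}

const dlq = new TenantDlq();
const acceptedFirst = dlq.push('tenant-a', 1, { eventId: 'e1', reason: 'timeout' });
const acceptedSecond = dlq.push('tenant-a', 1, { eventId: 'e2', reason: 'timeout' });
const acceptedOther = dlq.push('tenant-b', 1, { eventId: 'e3', reason: '500' });
console.log({ acceptedFirst, acceptedSecond, acceptedOther });
```

Because each tenant has its own list, tenant A hitting its cap has no effect on whether tenant B's failures are recorded.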
Update src/routes/events.ts:
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
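// Redis client shared by every per-tenant rate limiter.
// The Upstash client talks to the REST API and needs both a URL and a token;
// Redis.fromEnv() reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN.
const redis = Redis.fromEnv();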
router.post('/', async (req: Request, res: Response) => {
  const { type, payload, tenant_id } = req.body;
  // Get tenant quota
  const quotaResult = await pool.query(
    'SELECT * FROM quotas WHERE tenant_id = $1',
    [tenant_id]
  );
  if (quotaResult.rows.length === 0) {
    return res.status(404).json({ error: 'Tenant not found' });
  }
  const quota = quotaResult.rows[0];
  // Build a limiter whose ceiling comes from this tenant's quota row
  const ratelimit = new Ratelimit({
    redis,
    limiter: Ratelimit.slidingWindow(quota.max_events_per_sec, '1 s'),
    analytics: true,
    prefix: `tenant:${tenant_id}:events`
  });
  const { success, limit, reset } = await ratelimit.limit(tenant_id);
  if (!success) {
    // reset is a Unix timestamp in milliseconds; Retry-After expects whole seconds
    const retryAfterSec = Math.max(1, Math.ceil((reset - Date.now()) / 1000));
    res.set('Retry-After', String(retryAfterSec));
    return res.status(429).json({
      error: 'Rate limit exceeded',
      limit,
      remaining: 0,
      reset
    });
  }
  // ... rest of event creation with tenant_id
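To make the per-tenant sliding window concrete without a Redis dependency, here is a minimal in-memory sketch. It is not how @upstash/ratelimit works internally: the TenantSlidingWindow class and its method names are hypothetical, and it keeps an exact log of request timestamps per tenant, which is simple to follow but would not be shared across multiple server instances.

```typescript
type LimitResult = { success: boolean; remaining: number; retryAfterMs: number };

// Hypothetical in-memory sliding-log limiter, keyed by tenant
class TenantSlidingWindow {
  // tenantId -> timestamps (ms) of accepted requests still inside the window
  private hits = new Map<string, number[]>();
  private windowMs: number;

  constructor(windowMs: number = 1000) {
    this.windowMs = windowMs;
  }

  // `now` is injectable so the behaviour can be checked deterministically
  limit(tenantId: string, maxPerWindow: number, now: number = Date.now()): LimitResult {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window
    const recent = (this.hits.get(tenantId) ?? []).filter((t) => t > cutoff);

    if (recent.length >= maxPerWindow) {
      this.hits.set(tenantId, recent);
      // A slot frees up when the oldest accepted hit leaves the window
      const oldest = recent.length > 0 ? recent[0] : now;
      return { success: false, remaining: 0, retryAfterMs: oldest + this.windowMs - now };
    }

    recent.push(now);
    this.hits.set(tenantId, recent);
    return { success: true, remaining: maxPerWindow - recent.length, retryAfterMs: 0 };
  }
}

const limiter = new TenantSlidingWindow(1000);
const first = limiter.limit('tenant-a', 2, 0);     // accepted
const second = limiter.limit('tenant-a', 2, 100);  // accepted, quota now used up
const third = limiter.limit('tenant-a', 2, 200);   // rejected until the hit at t=0 expires
const other = limiter.limit('tenant-b', 2, 200);   // separate tenant, separate window
const later = limiter.limit('tenant-a', 2, 1001);  // the hit at t=0 has slid out
console.log({ first, second, third, other, later });
```

The point to notice is that the map key is the tenant, so tenant A exhausting its window never touches tenant B's counters. The retryAfterMs value plays the same role as the Retry-After header on a 429 response.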
Create src/routes/quotas.ts:
import { Router, Request, Response } from 'express';
import pool from '../db/pool';
const router = Router();
// GET /quotas/:tenant_id
router.get('/:tenant_id', async (req: Request, res: Response) => {
  try {
    const quotaResult = await pool.query(
      'SELECT * FROM quotas WHERE tenant_id = $1',
      [req.params.tenant_id]
    );
    if (quotaResult.rows.length === 0) {
      return res.status(404).json({ error: 'Tenant quota not found' });
    }
    const quota = quotaResult.rows[0];
    // Get current usage
    // Count distinct ids so the two LEFT JOINs cannot multiply each other's rows.
    // Assumption: dead_letter_queue rows reference their event via an event_id column.
    const usageResult = await pool.query(`
      SELECT
        COUNT(DISTINCT e.id) AS events_sent,
        COUNT(DISTINCT d.id) AS concurrent_deliveries,
        COUNT(DISTINCT dlq.id) AS dlq_count
      FROM events e
      LEFT JOIN deliveries d ON d.event_id = e.id AND d.status = 'processing'
      LEFT JOIN dead_letter_queue dlq ON dlq.event_id = e.id
      WHERE e.tenant_id = $1
    `, [req.params.tenant_id]);
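    const usage = usageResult.rows[0];
    // pg returns COUNT(...) as a string, so parse once with an explicit radix
    const eventsSent = parseInt(usage.events_sent, 10);
    const concurrentDeliveries = parseInt(usage.concurrent_deliveries, 10);
    const dlqCount = parseInt(usage.dlq_count, 10);
    res.json({
      tenant_id: quota.tenant_id,
      limits: {
        max_events_per_sec: quota.max_events_per_sec,
        max_concurrent_deliveries: quota.max_concurrent_deliveries,
        max_dlq_size: quota.max_dlq_size
      },
      current_usage: {
        events_sent: eventsSent,
        concurrent_deliveries: concurrentDeliveries,
        dlq_count: dlqCount
      },
      // Clamp at zero so an over-quota tenant never shows negative headroom.
      // Note: events_sent is a lifetime total while the limit is per second,
      // so remaining.events is only a rough indicator unless the query is windowed.
      remaining: {
        events: Math.max(0, quota.max_events_per_sec - eventsSent),
        deliveries: Math.max(0, quota.max_concurrent_deliveries - concurrentDeliveries),
        dlq: Math.max(0, quota.max_dlq_size - dlqCount)
      }
    });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: 'Failed to fetch quota' });
  }
});
export default router;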
curl http://localhost:3000/quotas/:tenant_id returns quota and usage.
Create src/__tests__/tenant-isolation.test.ts:
import { randomUUID } from 'node:crypto';

describe('Tenant Isolation', () => {
  test('slow webhook of tenant A does not block tenant B', async () => {
    // Create two tenants
    const tenantA = randomUUID();
    const tenantB = randomUUID();
    // Tenant A has a slow webhook (30s response time)
    // Tenant B has a fast webhook (100ms response time)
    // Emit 100 events for each tenant
    // Measure time for Tenant B's deliveries to complete
    // Should be fast (not blocked by Tenant A's slow webhook)
    const startB = Date.now();
    // Emit events for Tenant B and await their deliveries here
    const endB = Date.now();
    expect(endB - startB).toBeLessThan(5000); // Should complete quickly
  });
});
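The test above only passes for the right reason if deliveries are actually limited per tenant rather than through one shared pool. The sketch below shows one way that could look, assuming an in-process worker: TenantSemaphore, measureIsolation, and the sleep-based "webhooks" are all hypothetical stand-ins, and the timings are scaled down so the example finishes in about a second.

```typescript
// Hypothetical per-tenant semaphore: each tenant gets its own pool of delivery slots
class TenantSemaphore {
  private active = new Map<string, number>();
  private waiters = new Map<string, Array<() => void>>();
  private maxPerTenant: number;

  constructor(maxPerTenant: number) {
    this.maxPerTenant = maxPerTenant;
  }

  // Runs the task once the tenant has a free slot, and always frees the slot afterwards
  async run<T>(tenantId: string, task: () => Promise<T>): Promise<T> {
    await this.acquire(tenantId);
    try {
      return await task();
    } finally {
      this.release(tenantId);
    }
  }

  private acquire(tenantId: string): Promise<void> {
    const inFlight = this.active.get(tenantId) ?? 0;
    if (inFlight < this.maxPerTenant) {
      this.active.set(tenantId, inFlight + 1);
      return Promise.resolve();
    }
    // Queue behind this tenant's own deliveries only
    return new Promise<void>((resolve) => {
      const queue = this.waiters.get(tenantId) ?? [];
      queue.push(resolve);
      this.waiters.set(tenantId, queue);
    });
  }

  private release(tenantId: string): void {
    const next = this.waiters.get(tenantId)?.shift();
    if (next) {
      next(); // hand the slot straight to the next waiter; in-flight count is unchanged
    } else {
      this.active.set(tenantId, (this.active.get(tenantId) ?? 1) - 1);
    }
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function measureIsolation(): Promise<{ tenantBMs: number; tenantADoneWhenBFinished: number }> {
  const sem = new TenantSemaphore(2);
  let tenantADone = 0;
  // Tenant A: 4 slow deliveries (500 ms each), started first and not awaited yet
  const slow = Array.from({ length: 4 }, () =>
    sem.run('tenant-a', async () => {
      await sleep(500);
      tenantADone++;
    })
  );
  const start = Date.now();
  // Tenant B: 10 fast deliveries (10 ms each)
  await Promise.all(Array.from({ length: 10 }, () => sem.run('tenant-b', () => sleep(10))));
  const tenantBMs = Date.now() - start;
  const tenantADoneWhenBFinished = tenantADone;
  await Promise.all(slow); // let tenant A drain so no timers are left dangling
  return { tenantBMs, tenantADoneWhenBFinished };
}

measureIsolation().then((result) => console.log(result));
```

With a single shared pool of two slots, tenant B would wait behind tenant A's slow deliveries. With per-tenant slots, tenant B's batch should finish while tenant A's first deliveries are still in flight.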
Update docs/design.md:
## Resource Isolation

### Per-Tenant Quotas
- Max events/sec: prevents one tenant from flooding the system
- Max concurrent deliveries: limits thread pool per tenant
- Max DLQ size: prevents one tenant's failures from filling disk

### Implementation
- Rate limiting via Upstash (Redis-backed)
- Sliding window: 1-second buckets
- Returns 429 (Too Many Requests) when exceeded
- Retry-After header indicates when quota resets

### Monitoring
- Metrics per tenant_id: events/sec, delivery rate, error rate
- Alert if any tenant exceeds 80% of their quota
- Alert if any tenant's DLQ grows beyond 50% of limit
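The two alert rules in the monitoring notes can be expressed as a small pure function. This is a sketch under assumptions: the TenantUsage shape and the tenantAlerts name are hypothetical, and it presumes some metrics job already supplies a current events/sec figure and DLQ count per tenant.

```typescript
// Hypothetical snapshot of one tenant's usage against its quota row
type TenantUsage = {
  tenantId: string;
  eventsPerSec: number;
  maxEventsPerSec: number;
  dlqCount: number;
  maxDlqSize: number;
};

// Thresholds from the design notes: 80% of the event quota, 50% of the DLQ limit
const QUOTA_ALERT_RATIO = 0.8;
const DLQ_ALERT_RATIO = 0.5;

// Returns one message per rule the tenant currently violates
function tenantAlerts(u: TenantUsage): string[] {
  const alerts: string[] = [];
  if (u.eventsPerSec > u.maxEventsPerSec * QUOTA_ALERT_RATIO) {
    alerts.push(`${u.tenantId}: events/sec at ${u.eventsPerSec} of ${u.maxEventsPerSec}`);
  }
  if (u.dlqCount > u.maxDlqSize * DLQ_ALERT_RATIO) {
    alerts.push(`${u.tenantId}: DLQ at ${u.dlqCount} of ${u.maxDlqSize}`);
  }
  return alerts;
}

const busy = tenantAlerts({
  tenantId: 'tenant-a', eventsPerSec: 90, maxEventsPerSec: 100, dlqCount: 60, maxDlqSize: 100
});
const quiet = tenantAlerts({
  tenantId: 'tenant-b', eventsPerSec: 10, maxEventsPerSec: 100, dlqCount: 5, maxDlqSize: 100
});
console.log({ busy, quiet });
```

Keeping the check free of I/O makes it easy to unit test and to reuse from whatever scheduler or metrics pipeline ends up calling it.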
git add -A
git commit -m "feat: add multi-tenant support with rate limiting and isolation

- Tenant ID in webhooks, events, deliveries
- Quotas table: max events/sec, concurrent deliveries, DLQ size per tenant
- Rate limiting via Ratelimit (Upstash Redis)
- GET /quotas endpoint shows usage vs limits
- POST /events returns 429 when tenant quota exceeded
- Isolation: one tenant's failures don't affect others
- Chaos test verifies isolation"
git push origin main
git log --oneline shows your commit.
Your system now isolates tenants properly. Next, you'll harden security: HMAC signing, SSRF prevention, and threat modeling. Head to Module 09.