Per-tenant rate limits
Different throttles for different customers
In multi-tenant SaaS, you might want:
- Free tier: 100 req/min.
- Paid: 10,000 req/min.
- Enterprise: 100,000 req/min.
Per-tenant rate limits enforce this at your API.
Pattern
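Request arrives → introspect token → identify tenant → check tenant's quota
async function rateLimit(tenant: { id: string; plan: string }) {
  // burst is the per-minute cap; rps is the same budget expressed per second.
  const limits: Record<string, { rps: number; burst: number }> = {
    free: { rps: 1.67, burst: 100 },
    paid: { rps: 167, burst: 10_000 },
    enterprise: { rps: 1667, burst: 100_000 },
  };
  // Unknown plans fall back to the most restrictive tier.
  const limit = limits[tenant.plan] ?? limits.free;
  // Fixed window: one counter per tenant per clock minute.
  const key = `rl:tenant:${tenant.id}:${Math.floor(Date.now() / 60_000)}`;
  const count = await redis.incr(key);
  // First request in this window: set a TTL so old counters expire on their own.
  if (count === 1) await redis.expire(key, 60);
  return count <= limit.burst;
}
Apply in middleware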
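app.use("/api", async (req, res, next) => {
  // Expect "Authorization: Bearer <token>"; slice(7) strips the "Bearer " prefix.
  const auth = req.headers.authorization;
  if (!auth?.startsWith("Bearer ")) {
    return res.status(401).json({ error: "unauthorized" });
  }
  const intro = await introspect(auth.slice(7));
  const tenant = await getTenant(intro.tenant_id);
  if (!await rateLimit(tenant)) {
    res.set({
      // Report the tenant's own limit rather than a fixed value
      // (defaultForPlan is the plan lookup used under Custom overrides).
      "X-RateLimit-Limit": String(defaultForPlan(tenant.plan).burst),
      "X-RateLimit-Remaining": "0",
      // Upper bound: a one-minute fixed window resets within 60 seconds.
      "Retry-After": "60",
    });
    return res.status(429).json({ error: "rate_limited" });
  }
  next();
});
Tier definitions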
Store in DB / settings vault:
CREATE TABLE rate_tiers (
  name TEXT PRIMARY KEY,
  rps INTEGER NOT NULL,
  burst INTEGER NOT NULL,
  daily_quota INTEGER -- optional cap
);
INSERT INTO rate_tiers VALUES
  ('free', 2, 100, 10000),
  ('paid', 167, 10000, NULL),
  ('enterprise', 1667, 100000, NULL);
Edit via Athena UI.
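The rps values in these rows are consistent with each tier's per-minute limit divided by 60 and rounded up to fit the INTEGER column. A minimal sketch of that arithmetic, assuming burst equals the per-minute limit; the helper name `tierFromPerMinute` and the `RateTier` shape are hypothetical, not part of any existing API:

```typescript
interface RateTier {
  name: string;
  rps: number;   // sustained requests per second, rounded up to an integer
  burst: number; // requests allowed within one minute
}

// Hypothetical helper: derive a rate_tiers row from a per-minute limit.
function tierFromPerMinute(name: string, perMinute: number): RateTier {
  return { name, rps: Math.ceil(perMinute / 60), burst: perMinute };
}

const tiers: RateTier[] = [
  tierFromPerMinute("free", 100),           // rps 2
  tierFromPerMinute("paid", 10_000),        // rps 167
  tierFromPerMinute("enterprise", 100_000), // rps 1667
];
```

Rounding up rather than down keeps the per-second figure from being stricter than the advertised per-minute limit.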
Custom overrides
Some customers negotiate custom limits:
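CREATE TABLE tenant_rate_overrides (
  tenant_id UUID PRIMARY KEY,
  rps INTEGER,
  burst INTEGER,
  reason TEXT
);
const override = await getTenantOverride(tenantId);
const base = defaultForPlan(tenant.plan);
// rps and burst are nullable, so merge per field instead of replacing the whole tier.
const effectiveLimit = { rps: override?.rps ?? base.rps, burst: override?.burst ?? base.burst };
Per-endpoint differentiation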
Some endpoints are expensive, some are cheap:
const endpointWeights = {
  "/api/search": 10, // costs 10 units
  "/api/get": 1, // 1 unit
  "/api/heavy-compute": 100,
};
const weight = endpointWeights[req.path] ?? 1;
const count = await redis.incrBy(key, weight); // increment by weight
Expensive endpoints consume the budget faster. A free-tier tenant can make 100 cheap calls or 10 search calls per minute.
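The budget arithmetic can be checked with a small pure function. This is a sketch that reuses the weights listed above; the name `callsPerWindow` is hypothetical, and unlisted paths are assumed to cost 1 unit, matching the `?? 1` fallback:

```typescript
// Weights copied from the endpointWeights example.
const ENDPOINT_WEIGHTS = new Map<string, number>([
  ["/api/search", 10],
  ["/api/get", 1],
  ["/api/heavy-compute", 100],
]);

// How many calls to `path` fit into a per-window budget of `budget` units.
function callsPerWindow(path: string, budget: number): number {
  const weight = ENDPOINT_WEIGHTS.get(path) ?? 1; // unlisted paths cost 1 unit
  return Math.floor(budget / weight);
}
```

With a free-tier budget of 100 units, this gives 100 calls to /api/get, 10 to /api/search, and a single call to /api/heavy-compute per minute.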
Sliding window
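A fixed window resets at clock boundaries, so it tolerates bursts: a client can spend a full budget at the end of one window and another at the start of the next. A sliding window is smoother: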
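async function slidingWindowCheck(key: string, limit: number, windowMs: number) {
  const now = Date.now();
  // Drop entries that have aged out of the window.
  await redis.zremrangebyscore(key, 0, now - windowMs);
  const count = await redis.zcard(key);
  if (count >= limit) return false;
  // Record this request; the random suffix keeps members unique within a millisecond.
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, Math.ceil(windowMs / 1000));
  // These calls are not atomic, so concurrent requests can overshoot slightly.
  return true;
}
Roughly 2x the Redis operations of a fixed window, but no boundary burst spikes.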
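To make the boundary effect concrete, here is a sketch with in-memory stand-ins for both strategies. It is not the Redis implementation: timestamps are passed in explicitly so the behavior is deterministic, and the names `fixedWindow`, `slidingWindowLog`, and `admitted` are hypothetical.

```typescript
type Limiter = (nowMs: number) => boolean;

// Fixed window: one counter per clock-aligned bucket.
function fixedWindow(limit: number, windowMs: number): Limiter {
  const counts = new Map<number, number>();
  return (nowMs) => {
    const bucket = Math.floor(nowMs / windowMs);
    const count = (counts.get(bucket) ?? 0) + 1;
    counts.set(bucket, count);
    return count <= limit;
  };
}

// Sliding window log: keep the timestamp of every admitted request.
function slidingWindowLog(limit: number, windowMs: number): Limiter {
  let stamps: number[] = [];
  return (nowMs) => {
    // Drop entries at or before now - windowMs, mirroring the ZREMRANGEBYSCORE call.
    const cutoff = nowMs - windowMs;
    stamps = stamps.filter((t) => t > cutoff);
    if (stamps.length >= limit) return false;
    stamps.push(nowMs);
    return true;
  };
}

// Send `n` requests at time `atMs` and count how many are admitted.
function admitted(limiter: Limiter, atMs: number, n: number): number {
  let ok = 0;
  for (let i = 0; i < n; i++) {
    if (limiter(atMs)) ok++;
  }
  return ok;
}
```

With a limit of 100 per 60,000 ms, sending 100 requests at t = 59,999 and 100 more at t = 60,000 admits all 200 under the fixed window, while the sliding log admits only the first 100 until the earlier entries age out.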
Headers
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 743
X-RateLimit-Reset: 1715200000
These help the client throttle itself.
For 429 responses:
Retry-After: 30
Seconds until the limit resets.
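These values can be derived from the counter state instead of being hardcoded. The sketch below assumes the clock-aligned fixed window used in the rateLimit example; the function name `buildRateLimitHeaders` and its parameters are hypothetical.

```typescript
interface RateLimitHeaders {
  "X-RateLimit-Limit": string;
  "X-RateLimit-Remaining": string;
  "X-RateLimit-Reset": string;
  "Retry-After"?: string;
}

// limit: requests allowed per window; count: requests seen so far in this window.
function buildRateLimitHeaders(
  limit: number,
  count: number,
  nowMs: number,
  windowMs: number = 60_000,
): RateLimitHeaders {
  // End of the current clock-aligned window, in milliseconds.
  const windowEndMs = (Math.floor(nowMs / windowMs) + 1) * windowMs;
  const headers: RateLimitHeaders = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, limit - count)),
    "X-RateLimit-Reset": String(Math.ceil(windowEndMs / 1000)), // Unix seconds
  };
  if (count > limit) {
    // Only throttled responses carry Retry-After: seconds until the window ends.
    headers["Retry-After"] = String(Math.ceil((windowEndMs - nowMs) / 1000));
  }
  return headers;
}
```

For example, with a limit of 1000 and 257 requests counted, Remaining comes out as 743; a request arriving 30 seconds into an exhausted window gets Retry-After: 30.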
Plan changes
When a tenant upgrades their plan:
async function changePlan(tenantId, newPlan) {
  await db`UPDATE tenants SET plan = ${newPlan} WHERE id = ${tenantId}`;
  // Existing rate counters expire naturally (max 1 min)
  // New plan takes effect within 60s
}
To apply the new limit instantly, clear the tenant's rate counters:
// KEYS scans the whole keyspace and blocks Redis while it runs; on large
// datasets prefer iterating with SCAN.
await redis.keys(`rl:tenant:${tenantId}:*`).then(keys =>
  Promise.all(keys.map(k => redis.del(k)))
);
Monitoring
metrics.increment("rate_limit", { tenant: tenantId, outcome: "throttled" });
Dashboard: which tenants hit limits most? Common reasons:
- Buggy client (loop without backoff).
- Tenant outgrew their plan (upsell opportunity).
- Abuse.
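The dashboard question reduces to a count per tenant. A minimal sketch, assuming events shaped like the tags passed to metrics.increment; the `ThrottleEvent` type, the "allowed" outcome value, and the function name `topThrottled` are assumptions for illustration:

```typescript
interface ThrottleEvent {
  tenantId: string;
  outcome: "allowed" | "throttled";
}

// Count throttled requests per tenant and return the top `n`, highest first.
function topThrottled(events: ThrottleEvent[], n: number): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const e of events) {
    if (e.outcome !== "throttled") continue;
    counts.set(e.tenantId, (counts.get(e.tenantId) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, n);
}
```

In practice a metrics backend would usually run this aggregation itself; the sketch only shows which numbers the dashboard needs.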
Communications
If a tenant hits the limit often, reach out:
Subject: You're approaching your API limit
We noticed your account is consistently hitting API rate limits this week.
Your current plan: Free (100 req/min).
Your peak usage: 250 req/min during business hours.
Consider upgrading to avoid throttling:
[See plans]
This is a sales opportunity, not just enforcement.
Caddy as outer layer
Caddy's rate_limit module (a plugin, not part of the standard build) is keyed per IP here, not per tenant. Use it as a coarse outer layer (DDoS protection) and keep per-tenant logic in the app:
@api path /api/*
rate_limit @api {
  zone caddy-outer {
    key {remote_host}
    events 100000 # very high, just for sanity
    window 1m
  }
}
Tenant-level enforcement happens inside the app.
Multiple keys
Per-tenant + per-endpoint + per-user:
const tenantOk = await checkLimit(`tenant:${tenantId}`, ...);
const endpointOk = await checkLimit(`tenant:${tenantId}:${path}`, ...);
const userOk = await checkLimit(`user:${userId}`, ...);
const allOk = tenantOk && endpointOk && userOk;
Hierarchy: the tenant has an overall cap, plus per-endpoint and per-user caps.
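One thing to watch with layered checks: if each check increments its own counter, a request rejected at the user level still consumes tenant budget. The sketch below checks every level first and only then charges all of them. It uses an in-memory Map in place of Redis, and the names `LimitCheck` and `checkAllLimits` are hypothetical.

```typescript
interface LimitCheck {
  key: string;
  limit: number;
}

// In-memory counters stand in for Redis here. In production the check and the
// increments would need to run atomically (for example in a Lua script).
function checkAllLimits(counters: Map<string, number>, checks: LimitCheck[]): boolean {
  // Phase 1: reject if any level is already at its cap, without touching counters.
  const exhausted = checks.some((c) => (counters.get(c.key) ?? 0) >= c.limit);
  if (exhausted) return false;
  // Phase 2: every level has room, so charge the request to all of them.
  for (const c of checks) {
    counters.set(c.key, (counters.get(c.key) ?? 0) + 1);
  }
  return true;
}
```

A rejected request leaves every counter unchanged, so one noisy user cannot drain the budget shared by the rest of the tenant.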
Quota vs rate limit
Rate limit: requests per unit time. Quota: total per billing period.
CREATE TABLE tenant_usage (
  tenant_id UUID,
  month DATE,
  api_calls BIGINT,
  PRIMARY KEY (tenant_id, month)
);
-- On each request (upsert, so the first call of a month creates the row):
INSERT INTO tenant_usage (tenant_id, month, api_calls)
VALUES (..., DATE_TRUNC('month', NOW()), 1)
ON CONFLICT (tenant_id, month)
DO UPDATE SET api_calls = tenant_usage.api_calls + 1;
Check the monthly quota separately from the rate limit.
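Because the two limits reset on different schedules, it can help to report them as distinct errors. A sketch under stated assumptions: the quota resets at the start of the next UTC month, the rate limit uses a clock-aligned one-minute window, and the name `decideRequest` and the "quota_exceeded" error code are hypothetical ("rate_limited" is the code used in the middleware example).

```typescript
type Decision =
  | { allowed: true }
  | { allowed: false; error: "rate_limited" | "quota_exceeded"; retryAfterSec: number };

// rateOk: result of the rate-limit check; monthlyQuota: null means no monthly cap.
function decideRequest(
  rateOk: boolean,
  usedThisMonth: number,
  monthlyQuota: number | null,
  now: Date,
): Decision {
  if (monthlyQuota !== null && usedThisMonth >= monthlyQuota) {
    // Quota is exhausted until the first instant of the next UTC month.
    const nextMonthMs = Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1);
    const retryAfterSec = Math.ceil((nextMonthMs - now.getTime()) / 1000);
    return { allowed: false, error: "quota_exceeded", retryAfterSec };
  }
  if (!rateOk) {
    // Rate limit recovers when the current one-minute window ends.
    const msIntoMinute = now.getTime() % 60_000;
    const retryAfterSec = Math.ceil((60_000 - msIntoMinute) / 1000);
    return { allowed: false, error: "rate_limited", retryAfterSec };
  }
  return { allowed: true };
}
```

A client that receives quota_exceeded gains nothing by retrying within the minute, which is why the two cases carry different retry hints.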