Per-tenant rate limits

In multi-tenant SaaS, you might want:

Free tier: 100 req/min.
Paid: 10,000 req/min.
Enterprise: 100,000 req/min.

Per-tenant rate limits enforce this at your API.

Pattern

Request arrives → introspect token → identify tenant → check tenant's quota

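// Assumes an ioredis client, whose method names match the Redis calls on this page.
import Redis from "ioredis";
const redis = new Redis();

type Plan = "free" | "paid" | "enterprise";

// burst is the per-minute cap enforced below; rps is the equivalent steady rate (burst / 60).
const limits: Record<Plan, { rps: number; burst: number }> = {
  free: { rps: 1.67, burst: 100 },
  paid: { rps: 167, burst: 10_000 },
  enterprise: { rps: 1667, burst: 100_000 },
};

async function rateLimit(tenant: { id: string; plan: string }): Promise<boolean> {
  // Unknown plans fall back to the most restrictive tier instead of throwing.
  const limit = limits[tenant.plan as Plan] ?? limits.free;

  // Fixed one-minute window: one counter per tenant per minute.
  const key = `rl:tenant:${tenant.id}:${Math.floor(Date.now() / 60_000)}`;
  const count = await redis.incr(key);
  // The first hit in a window sets the TTL so old counters clean themselves up.
  if (count === 1) await redis.expire(key, 60);
  return count <= limit.burst;
}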
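
The limits above carry both `rps` and `burst`, but the fixed-window check only uses `burst` as a per-minute cap. One way to use both fields is a token bucket, where `burst` is the bucket capacity and `rps` is the refill rate. The following is a minimal in-memory sketch of that alternative, not the approach used on this page; `TokenBucket`, `BucketLimit`, and `takeToken` are hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory token bucket; not part of the original code.
interface TokenBucket {
  tokens: number;     // tokens currently available
  updatedAt: number;  // ms timestamp of the last refill
}

interface BucketLimit {
  rps: number;    // refill rate in tokens per second
  burst: number;  // bucket capacity
}

// Refills the bucket for the elapsed time, then tries to spend `cost` tokens.
// Returns the new bucket state and whether the request is allowed.
function takeToken(
  bucket: TokenBucket,
  limit: BucketLimit,
  nowMs: number,
  cost = 1,
): { bucket: TokenBucket; allowed: boolean } {
  const elapsedSec = Math.max(0, (nowMs - bucket.updatedAt) / 1000);
  const refilled = Math.min(limit.burst, bucket.tokens + elapsedSec * limit.rps);
  if (refilled < cost) {
    return { bucket: { tokens: refilled, updatedAt: nowMs }, allowed: false };
  }
  return { bucket: { tokens: refilled - cost, updatedAt: nowMs }, allowed: true };
}
```

With the free tier's values (`rps: 1.67`, `burst: 100`), a tenant could spend 100 requests at once and then continue at roughly 100 per minute.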

Apply in middleware

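app.use("/api", async (req, res, next) => {
  // Expect "Authorization: Bearer <token>"; reject anything else before introspecting.
  const auth = req.headers.authorization;
  if (!auth?.startsWith("Bearer ")) {
    return res.status(401).json({ error: "unauthorized" });
  }
  try {
    const intro = await introspect(auth.slice(7));
    const tenant = await getTenant(intro.tenant_id);

    if (!await rateLimit(tenant)) {
      res.set({
        // Report the tenant's own limit, not a hard-coded value.
        "X-RateLimit-Limit": String(defaultForPlan(tenant.plan).burst),
        "X-RateLimit-Remaining": "0",
        // Upper bound: the fixed window resets within 60 seconds.
        "Retry-After": "60",
      });
      return res.status(429).json({ error: "rate_limited" });
    }

    next();
  } catch (err) {
    next(err); // Express 4 does not catch rejected promises from async handlers
  }
});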
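
The middleware reads the token with `slice(7)`, which assumes the header always starts with `Bearer `. A small parsing helper makes that assumption explicit. This is a sketch; `parseBearer` is a hypothetical name, not something defined elsewhere on this page.

```typescript
// Returns the token from an "Authorization: Bearer <token>" header,
// or null if the header is missing or uses another scheme.
// The scheme name is matched case-insensitively.
function parseBearer(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)\s*$/i.exec(header);
  return match?.[1] ?? null;
}
```

A caller would respond with 401 when this returns null, and only then call `introspect`.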

Tier definitions

Store tier definitions in the database or a settings vault rather than in code:

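CREATE TABLE rate_tiers (
  name TEXT PRIMARY KEY,
  rps NUMERIC NOT NULL,     -- steady rate; fractional for low tiers (100/min is about 1.67/s)
  burst INTEGER NOT NULL,   -- per-minute cap
  daily_quota INTEGER       -- optional cap; NULL means no daily limit
);

INSERT INTO rate_tiers (name, rps, burst, daily_quota) VALUES
  ('free', 1.67, 100, 10000),
  ('paid', 167, 10000, NULL),
  ('enterprise', 1667, 100000, NULL);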

Edit via Athena UI.

Custom overrides

Some customers negotiate custom limits:

CREATE TABLE tenant_rate_overrides (
  tenant_id UUID PRIMARY KEY,
  rps INTEGER,
  burst INTEGER,
  reason TEXT
);

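const base = defaultForPlan(tenant.plan);
const override = await getTenantOverride(tenantId);
// Override columns are nullable, so fall back per field rather than per row.
const effectiveLimit = { rps: override?.rps ?? base.rps, burst: override?.burst ?? base.burst };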
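
Because `rps` and `burst` are both nullable in `tenant_rate_overrides`, an override row may set only one of them. The merge can be sketched as a pure function; `PlanLimit`, `LimitOverride`, and `resolveLimit` are hypothetical names, and the shapes are assumed from the table definitions above.

```typescript
interface PlanLimit {
  rps: number;
  burst: number;
}

// Mirrors the nullable columns of tenant_rate_overrides (assumed shape).
interface LimitOverride {
  rps: number | null;
  burst: number | null;
}

// Each field falls back to the plan default when the override leaves it NULL.
function resolveLimit(planDefault: PlanLimit, override: LimitOverride | null): PlanLimit {
  return {
    rps: override?.rps ?? planDefault.rps,
    burst: override?.burst ?? planDefault.burst,
  };
}
```

For example, an override that only raises `burst` keeps the plan's `rps`.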

Per-endpoint differentiation

Some endpoints are expensive, others cheap:

const endpointWeights: Record<string, number> = {
  "/api/search": 10,      // costs 10 units
  "/api/get": 1,          // 1 unit
  "/api/heavy-compute": 100,
};

const weight = endpointWeights[req.path] ?? 1;   // unlisted endpoints cost 1 unit
const count = await redis.incrby(key, weight);   // increment by weight

Expensive endpoints eat the budget faster. A tenant on the free tier can make 100 cheap calls or 10 search calls per minute.
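
The arithmetic behind those numbers can be checked with a small sketch. `requestCost` and `callsPerWindow` are hypothetical helpers, and the weights are copied from the table above.

```typescript
const weights: Record<string, number> = {
  "/api/search": 10,
  "/api/get": 1,
  "/api/heavy-compute": 100,
};

// Cost of one request in budget units; unlisted paths cost 1.
function requestCost(path: string): number {
  return weights[path] ?? 1;
}

// How many calls to a single endpoint fit in one window's budget.
function callsPerWindow(budget: number, path: string): number {
  return Math.floor(budget / requestCost(path));
}

const freeBudget = 100; // free tier: 100 units per minute
const searchCalls = callsPerWindow(freeBudget, "/api/search");       // 10
const cheapCalls = callsPerWindow(freeBudget, "/api/get");           // 100
const heavyCalls = callsPerWindow(freeBudget, "/api/heavy-compute"); // 1
```

On the free tier a single heavy-compute call uses the whole minute's budget, which may or may not be the intended behavior.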

Sliding window

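A fixed window resets at clock boundaries, so it tolerates bursts: a tenant can spend a full limit just before the boundary and another full limit just after it. A sliding window is smoother: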

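// Note: these commands are not atomic, so concurrent requests can slightly overshoot the limit.
async function slidingWindowCheck(key: string, limit: number, windowMs: number) {
  const now = Date.now();
  // Drop entries older than the window, then count what is left.
  await redis.zremrangebyscore(key, 0, now - windowMs);
  const count = await redis.zcard(key);
  if (count >= limit) return false;
  // Record this request; the random suffix keeps members unique within one millisecond.
  await redis.zadd(key, now, `${now}-${Math.random()}`);
  await redis.expire(key, Math.ceil(windowMs / 1000));
  return true;
}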

This costs roughly twice the Redis operations of the fixed window, but it removes the boundary bursts.
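
The difference shows up when a burst straddles a window boundary. The sketch below simulates both strategies in memory, with no Redis involved; `FixedWindow`, `SlidingLog`, and `countAllowed` are hypothetical names introduced only for this comparison.

```typescript
class FixedWindow {
  private counts = new Map<number, number>();
  private limit: number;
  private windowMs: number;
  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }
  allow(nowMs: number): boolean {
    const bucket = Math.floor(nowMs / this.windowMs); // same role as the minute in the Redis key
    const count = (this.counts.get(bucket) ?? 0) + 1;
    this.counts.set(bucket, count);
    return count <= this.limit;
  }
}

class SlidingLog {
  private stamps: number[] = [];
  private limit: number;
  private windowMs: number;
  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }
  allow(nowMs: number): boolean {
    // Drop timestamps that fell out of the window (same role as ZREMRANGEBYSCORE).
    this.stamps = this.stamps.filter((t) => t > nowMs - this.windowMs);
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(nowMs);
    return true;
  }
}

function countAllowed(limiter: { allow(nowMs: number): boolean }, times: number[]): number {
  return times.filter((t) => limiter.allow(t)).length;
}

// 100 requests one second before the minute boundary, 100 half a second after it.
const times = [
  ...Array.from({ length: 100 }, () => 59_000),
  ...Array.from({ length: 100 }, () => 60_500),
];
const fixedAllowed = countAllowed(new FixedWindow(100, 60_000), times);   // 200
const slidingAllowed = countAllowed(new SlidingLog(100, 60_000), times);  // 100
```

With a limit of 100 per minute, the fixed window admits all 200 requests within 1.5 seconds, while the sliding log admits only the first 100.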

Headers

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 743
X-RateLimit-Reset: 1715200000

These headers let the client throttle itself before it hits the limit. X-RateLimit-Reset is a Unix timestamp in seconds.

On a 429 response, also send:

Retry-After: 30

The value is the number of seconds until the limit resets.
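
For the fixed one-minute window used earlier, all four values can be derived from the limit, the current count, and the clock. This is a sketch under that assumption; `rateLimitHeaders` is a hypothetical helper, not an API of any library.

```typescript
// Builds rate-limit response headers for a fixed window.
// `count` is the counter value after the current request was recorded.
function rateLimitHeaders(
  limit: number,
  count: number,
  nowMs: number,
  windowMs = 60_000,
): Record<string, string> {
  // Start of the next window, when the counter resets.
  const resetMs = (Math.floor(nowMs / windowMs) + 1) * windowMs;
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, limit - count)),
    "X-RateLimit-Reset": String(Math.floor(resetMs / 1000)), // Unix seconds
  };
  if (count > limit) {
    // Only throttled responses carry Retry-After.
    headers["Retry-After"] = String(Math.ceil((resetMs - nowMs) / 1000));
  }
  return headers;
}
```

Computing `Retry-After` from the window boundary gives clients a tighter wait than a constant 60 seconds.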

Plan changes

When a tenant upgrades their plan:

async function changePlan(tenantId: string, newPlan: string) {
  await db`UPDATE tenants SET plan = ${newPlan} WHERE id = ${tenantId}`;
  // Existing rate counters expire naturally (max 1 min)
  // New plan takes effect within 60s
}

To apply the new plan instantly, clear the tenant's rate counters:

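// KEYS blocks Redis while it walks the whole keyspace; SCAN iterates incrementally.
// scanStream is the ioredis wrapper around SCAN (assumes an ioredis client).
const stream = redis.scanStream({ match: `rl:tenant:${tenantId}:*`, count: 100 });
for await (const keys of stream) {
  if (keys.length) await redis.del(...keys);
}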

Monitoring

metrics.increment("rate_limit", { tenant: tenantId, outcome: "throttled" });

Dashboard: which tenants hit limits most? Common reasons:

Buggy client (loop without backoff).
Tenant outgrew their plan (upsell opportunity).
Abuse.

Communications

If a tenant hits the limit often, reach out:

Subject: You're approaching your API limit

We noticed your account is consistently hitting API rate limits this week.

Your current plan: Free (100 req/min).
Your peak usage: 250 req/min during business hours.

Consider upgrading to avoid throttling:
[See plans]

Sales opportunity, not just enforcement.

Caddy as outer layer

Caddy's rate_limit module works per IP, not per tenant. Use it as a coarse outer layer (DDoS protection) and keep the per-tenant logic in the app:

@api path /api/*
rate_limit @api {
  zone caddy-outer {
    key    {http.request.remote.host}
    events 100000   # very high, just for sanity
    window 1m
  }
}

Tenant-level enforcement happens inside the app.

Multiple keys

Per-tenant + per-endpoint + per-user:

const tenantOk = await checkLimit(`tenant:${tenantId}`, ...);
const endpointOk = await checkLimit(`tenant:${tenantId}:${path}`, ...);
const userOk = await checkLimit(`user:${userId}`, ...);
const allOk = tenantOk && endpointOk && userOk;

Hierarchy: tenant has overall cap, plus per-endpoint, plus per-user.
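
One subtlety: if `checkLimit` increments its counter as it checks, a request rejected by the per-user limit has already consumed tenant and endpoint budget. A way around this is to check every key first and increment only when all of them have room. The sketch below shows the idea in memory; `WindowCounters` and `KeyLimit` are hypothetical names, and window expiry is left out for brevity. With Redis, the same check-then-increment would need a Lua script or a transaction to stay atomic.

```typescript
interface KeyLimit {
  key: string;
  limit: number;
}

class WindowCounters {
  private counts = new Map<string, number>();

  // Allows the request only if every key has room, and only then increments them all.
  allowAll(checks: KeyLimit[]): boolean {
    const blocked = checks.some(({ key, limit }) => (this.counts.get(key) ?? 0) >= limit);
    if (blocked) return false;
    for (const { key } of checks) {
      this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
    }
    return true;
  }

  get(key: string): number {
    return this.counts.get(key) ?? 0;
  }
}

// Builds the three keys from the hierarchy: tenant, tenant + endpoint, user.
function checksFor(tenantId: string, path: string, userId: string): KeyLimit[] {
  return [
    { key: `tenant:${tenantId}`, limit: 3 },          // illustrative limits
    { key: `tenant:${tenantId}:${path}`, limit: 1 },
    { key: `user:${userId}`, limit: 2 },
  ];
}
```

A rejected request leaves all three counters unchanged, so one noisy user cannot drain the tenant's overall budget with requests that were never served.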

Quota vs rate limit

Rate limit: requests per unit time. Quota: total per billing period.

CREATE TABLE tenant_usage (
  tenant_id UUID,
  month DATE,
  api_calls BIGINT,
  PRIMARY KEY (tenant_id, month)
);

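-- On each request (upsert, so the first call of the month creates the row):
INSERT INTO tenant_usage (tenant_id, month, api_calls)
VALUES (..., DATE_TRUNC('month', NOW())::date, 1)
ON CONFLICT (tenant_id, month)
DO UPDATE SET api_calls = tenant_usage.api_calls + 1;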

Check the monthly quota separately from the per-minute rate limit.
