Bot detection strategies

Bots hit auth endpoints for credential stuffing, mass account creation, and password-reset reconnaissance. The goal is to detect and mitigate that traffic without making real users click through 15 traffic-light CAPTCHAs.

Signal layers

Layer 1: Rate-based

Rate limiting alone catches the dumbest bots:

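# A named matcher with more than one condition needs a block.
@login_post {
  path /self-service/login
  method POST
}
rate_limit @login_post {
  # Each zone is its own block: at most 5 events per minute per client IP.
  zone login-burst {
    key {http.request.remote.host}
    events 5
    window 1m
  }
}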

Effective against single-IP brute force. It does little against attacks spread across many IPs.

Layer 2: Behavioral

Bots tend to:

Fill forms in under 100 ms (humans usually take more than 1 s).
Generate no mouse-movement events.
Skip the tab order (fill fields out of sequence).

Capture via JS:

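window.addEventListener("DOMContentLoaded", () => {
  const start = Date.now();
  const form = document.querySelector("form");
  form.addEventListener("submit", () => {
    // A data-* attribute is never submitted, so send the value as a hidden field.
    const field = document.createElement("input");
    field.type = "hidden";
    field.name = "fill_time";
    field.value = String(Date.now() - start);
    form.appendChild(field);
  });
});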

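Server-side: if fill_time < 500 (milliseconds), treat the submission as suspicious. The value comes from the client and can be forged, so use it as one signal among several, not as proof.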
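The server-side check can be sketched as below. The field name fill_time and the 500 ms threshold come from the text; the helper name isFillTimeSuspicious and the choice to treat a missing or malformed value as suspicious are assumptions for illustration.

```typescript
// Hypothetical helper: only the "fill_time" field and the 500 ms threshold
// come from the text above; the rest is an illustrative assumption.
const MIN_HUMAN_FILL_MS = 500;

function isFillTimeSuspicious(raw: unknown): boolean {
  // Form values arrive as strings. Anything else (missing, array, object)
  // means the client-side script did not run or the request was hand-built.
  if (typeof raw !== "string" || raw.trim() === "") return true;

  const ms = Number(raw);
  // Reject NaN, Infinity, and negative values before comparing.
  if (!Number.isFinite(ms) || ms < 0) return true;

  return ms < MIN_HUMAN_FILL_MS;
}
```

Users with JavaScript disabled would also trip this check, which is one more reason to feed the result into a score instead of blocking on it directly.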

Layer 3: Honeypot fields

Add an invisible field. Real users never see it and leave it empty; naive bots fill in every field they find:

<input type="text" name="email_confirm" tabindex="-1" autocomplete="off"
       aria-hidden="true" style="position: absolute; left: -9999px;">

Server-side:

if (req.body.email_confirm) {
  return res.status(400).json({ error: "spam_detected" });
}

Catches indiscriminate form-fillers. Sophisticated bots detect hidden fields and skip them.

Layer 4: Risk scoring

Combine signals into a score:

let score = 0;
if (fillTime < 500) score += 30;
if (!hasMouseEvents) score += 20;
if (newIpAddress) score += 10;
if (knownBadAsn(ip)) score += 40;
if (failedLastNAttempts(ip) >= 5) score += 30;

// Check the higher threshold first so a rejected request is not also shown a CAPTCHA.
if (score >= 80) reject();
else if (score >= 50) showCaptcha = true;

Tune thresholds against your own data.
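For tuning, it can help to keep scoring and the decision in two small pure functions that are easy to replay against logged traffic. The sketch below reuses the weights and thresholds from the snippet above; the Signals shape and the names riskScore and decide are assumptions, not an existing API.

```typescript
// Weights and thresholds mirror the snippet above. The Signals shape and the
// function names are assumptions made for this sketch.
interface Signals {
  fillTimeMs: number;
  hasMouseEvents: boolean;
  newIpAddress: boolean;
  knownBadAsn: boolean;
  recentFailures: number;
}

type Action = "allow" | "captcha" | "block";

function riskScore(s: Signals): number {
  let score = 0;
  if (s.fillTimeMs < 500) score += 30;
  if (!s.hasMouseEvents) score += 20;
  if (s.newIpAddress) score += 10;
  if (s.knownBadAsn) score += 40;
  if (s.recentFailures >= 5) score += 30;
  return score;
}

function decide(score: number, captchaAt = 50, blockAt = 80): Action {
  // Check the stricter threshold first so each request gets exactly one action.
  if (score >= blockAt) return "block";
  if (score >= captchaAt) return "captcha";
  return "allow";
}
```

Passing the thresholds as parameters makes it possible to re-run decide over historical scores with different cut-offs and compare how many requests each setting would have challenged or blocked.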

Layer 5: CAPTCHA

Trigger a CAPTCHA only when the risk score is high. Don't show one on every signup.

{showCaptcha && <Turnstile sitekey={SITE_KEY} onVerify={setToken} />}

Cloudflare Turnstile is the most user-friendly option: most of the time it doesn't ask the user to do anything.
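The token the widget returns only means something once the server has validated it against Turnstile's siteverify endpoint. A minimal sketch of that step follows; the names verifyTurnstile and PostJson are hypothetical, and the transport is injected only so the function can be exercised without a network call.

```typescript
const SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify";

// Minimal transport type (an assumption of this sketch) so the check can be
// tested with a stub instead of a real HTTP request.
type PostJson = (url: string, body: Record<string, string>) => Promise<unknown>;

// Default transport built on the global fetch (Node 18+ or a modern runtime).
const postJson: PostJson = async (url, body) => {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.json();
};

async function verifyTurnstile(
  token: string | undefined,
  secret: string,
  post: PostJson = postJson,
): Promise<boolean> {
  // No token means the widget was never completed or the field was stripped.
  if (!token) return false;
  const result = await post(SITEVERIFY_URL, { secret, response: token });
  // Only an explicit success: true counts as verified.
  return (
    typeof result === "object" &&
    result !== null &&
    (result as { success?: unknown }).success === true
  );
}
```

In a handler this would run before the login or registration is processed, with the secret key read from server-side configuration and never shipped to the browser.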

Layer 6: ML-based services

Cloudflare Bot Management, DataDome, hCaptcha Enterprise, and similar services. They cover the layers above plus more sophisticated signals:

TLS fingerprinting (different from browser fingerprinting).
Browser env consistency checks.
Real-time IP reputation.

Cost: ~$200-500/mo at moderate scale.

What to do with a detected bot

Soft response: rate limit aggressively

if (isBotSuspect) {
  await rateLimiter.add(ip, "bot", "1/min");
}

The bot can keep retrying, but only at a crawl.
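The rateLimiter above is not tied to a specific library. As a rough illustration of what a one-request-per-minute throttle for suspects might do, here is a fixed-window limiter held in process memory. The class name, defaults, and injected clock are assumptions; with more than one app instance the counters would need a shared store.

```typescript
// Fixed-window throttle kept in process memory. Names and defaults are
// assumptions for this sketch, not an existing rate-limiter API.
class SuspectThrottle {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit = 1, windowMs = 60_000, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  /** Returns true if the request may proceed, false if it should be throttled. */
  allow(key: string): boolean {
    const t = this.now();
    const w = this.windows.get(key);
    if (!w || t - w.start >= this.windowMs) {
      // First request for this key, or the previous window expired.
      this.windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (w.count < this.limit) {
      w.count += 1;
      return true;
    }
    return false;
  }
}
```

A handler would call allow(ip) for flagged clients and answer with a 429 when it returns false.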

Medium response: shadow ban

Accept the registration but flag the identity. Don't actually log them in or send emails:

if (isBot) {
  await kratos.adminCreateIdentity({
    traits,
    metadata_admin: { shadow_banned: true, reason: "bot_detected" },
    state: "inactive",
  });
  // Return success but DON'T sign in
  return Response.json({ session: null });
}

The bot doesn't realize it failed, so it retries less.

Hard response: reject

if (isBot) return Response.json({ error: "blocked" }, { status: 403 });

The bot sees the 403 and either gives up or rotates IPs.

Common bot patterns to detect

Credential stuffing

Many different emails, each tried once with a single password.
Source IPs widely distributed (often residential proxies).
High login failure rate.

Mitigate:

Have I Been Pwned check on the submitted password.
CAPTCHA after more than 3 failed attempts per IP or per email.
Account lockout after N failed attempts (per identity).
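The Have I Been Pwned check can be done without sending the password, or even its full hash, off the server: the Pwned Passwords range API takes the first five hex characters of the SHA-1 digest and returns every matching suffix with a count. A sketch under those assumptions (the function names are hypothetical):

```typescript
import { createHash } from "node:crypto";

// Splits the uppercase SHA-1 hex digest into the 5-character prefix sent to
// the API and the 35-character suffix that never leaves the server.
function sha1Parts(password: string): { prefix: string; suffix: string } {
  const hex = createHash("sha1").update(password, "utf8").digest("hex").toUpperCase();
  return { prefix: hex.slice(0, 5), suffix: hex.slice(5) };
}

// The range endpoint returns lines of "SUFFIX:COUNT". Returns the count for
// our suffix, or 0 if it is not in the response.
function countInRange(body: string, suffix: string): number {
  for (const line of body.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(":");
    if (candidate === suffix) return Number(count) || 0;
  }
  return 0;
}

// Network step: uses the global fetch (Node 18+ or a modern runtime).
async function pwnedCount(password: string): Promise<number> {
  const { prefix, suffix } = sha1Parts(password);
  const res = await fetch(`https://api.pwnedpasswords.com/range/${prefix}`);
  if (!res.ok) throw new Error(`Pwned Passwords request failed: ${res.status}`);
  return countInRange(await res.text(), suffix);
}
```

A count above zero on registration or password change is a reason to ask for a different password; on login it is better treated as a risk signal than as a hard failure.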

Account farming

Many account creations in short time.
Similar email patterns (alex234@gmail.com, alex235@gmail.com).
Disposable email domains.

Mitigate:

Email-domain reputation.
Disposable-domain block list.
Behavioral: time to fill form.
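Two of these signals are cheap to compute from the email address alone. The sketch below checks a disposable-domain list and collapses digit runs so that addresses like alex234@gmail.com and alex235@gmail.com map to the same pattern. The two listed domains are examples only, and all function names and the threshold of 3 are assumptions.

```typescript
// Example entries only; a real deployment would load a maintained list.
const DISPOSABLE_DOMAINS = new Set(["mailinator.com", "guerrillamail.com"]);

function splitEmail(email: string): { local: string; domain: string } | null {
  const at = email.lastIndexOf("@");
  if (at <= 0 || at === email.length - 1) return null;
  return {
    local: email.slice(0, at).toLowerCase(),
    domain: email.slice(at + 1).toLowerCase(),
  };
}

function isDisposable(email: string): boolean {
  const parts = splitEmail(email);
  return parts !== null && DISPOSABLE_DOMAINS.has(parts.domain);
}

// Collapses digit runs so numbered variants of one address share a key.
function patternKey(email: string): string | null {
  const parts = splitEmail(email);
  if (!parts) return null;
  return `${parts.local.replace(/\d+/g, "#")}@${parts.domain}`;
}

// Returns pattern keys seen at least `threshold` times among recent signups.
function farmingPatterns(recentEmails: string[], threshold = 3): string[] {
  const counts = new Map<string, number>();
  for (const email of recentEmails) {
    const key = patternKey(email);
    if (key) counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n >= threshold)
    .map(([key]) => key);
}
```

Running farmingPatterns over the signups of the last hour or so gives a list of suspicious patterns; a hit is better used to raise the risk score than to block outright, since families and teams do sometimes register similar addresses.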

Account enumeration

Attackers probe to learn which emails are registered.
They rely on differentiated responses (timing, error messages).

Mitigate:

Constant-time login responses (Olympus does this).
Generic error messages ("Invalid email or password", not "no such user").
Rate limit probing.
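One common way to keep login timing and messages uniform is to run the same password-hash comparison whether or not the account exists, using a dummy credential for unknown emails. The sketch below illustrates that idea with Node's scrypt. It is not a description of how Olympus implements it, and the names StoredCredential, hashPassword, and checkLogin are hypothetical.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

interface StoredCredential {
  salt: Buffer;
  hash: Buffer;
}

function hashPassword(password: string, salt: Buffer): Buffer {
  return scryptSync(password, salt, 32);
}

// Throwaway credential, hashed once at startup. Unknown emails are checked
// against it so the request still pays for one scrypt computation.
const dummySalt = randomBytes(16);
const DUMMY: StoredCredential = {
  salt: dummySalt,
  hash: hashPassword(randomBytes(16).toString("hex"), dummySalt),
};

function checkLogin(
  users: Map<string, StoredCredential>,
  email: string,
  password: string,
): { ok: boolean; error?: string } {
  const found = users.get(email.toLowerCase());
  const cred = found ?? DUMMY;
  // Always hash and compare, whether or not the user exists.
  const candidate = hashPassword(password, cred.salt);
  const matches = timingSafeEqual(candidate, cred.hash);
  const ok = found !== undefined && matches;
  // Same message for "unknown email" and "wrong password".
  return ok ? { ok: true } : { ok: false, error: "Invalid email or password" };
}
```

This equalizes the dominant cost, the password hash. Other differences (database lookups, lockout checks, email sending) can still leak timing and need the same treatment.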

Don't over-block

False positives cost you real users. Err on the conservative side:

Tier 1 (block): only with very high confidence.
Tier 2 (CAPTCHA): for medium-confidence.
Tier 3 (allow with logging): for low-confidence.

Monitor:

Block and CAPTCHA rates per cohort (real users shouldn't see a CAPTCHA on their first try).
Customer complaints (for example, "I can't sign up").

Logging

Log every bot-detection decision so you can analyze it later:

await db.insert(botEvents).values({
  ip: ip,
  user_agent: ua,
  score: score,
  signals: { fillTime, honeypot, hasMouseEvents },
  action: result,  // "allow" | "captcha" | "block"
  endpoint: req.path,
});

Weekly review:

Which IPs are repeat offenders?
Are real users being blocked?
New attack patterns to detect?
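The first review question can be answered straight from the logged events. A small sketch, assuming events shaped like the rows inserted above (only ip and action are used); the name repeatOffenders and the minimum of 3 blocks are illustrative choices.

```typescript
// Subset of the logged row shown above; only these two fields are needed.
interface BotEvent {
  ip: string;
  action: "allow" | "captcha" | "block";
}

// Returns IPs blocked at least `minBlocks` times, most-blocked first.
function repeatOffenders(
  events: BotEvent[],
  minBlocks = 3,
): Array<{ ip: string; blocks: number }> {
  const blocks = new Map<string, number>();
  for (const e of events) {
    if (e.action === "block") blocks.set(e.ip, (blocks.get(e.ip) ?? 0) + 1);
  }
  return Array.from(blocks.entries())
    .filter(([, n]) => n >= minBlocks)
    .map(([ip, n]) => ({ ip, blocks: n }))
    .sort((a, b) => b.blocks - a.blocks);
}
```

At larger volumes the same grouping is better done as a query in the database, but the in-memory version is enough for a weekly export.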
