Olympus Docs

PII redaction in logs

Don't leak personal data into log files

Logs are often forwarded to external systems (SIEM, log aggregator, ELK), so PII in logs leaks to those systems. Some PII is harmless; some is regulated. Mismanaged logs are a GDPR risk.

What is PII

For GDPR / CCPA purposes:

  • Email addresses.
  • Names.
  • Phone numbers.
  • IP addresses (yes, in the EU).
  • Identity UUID (linkable identifier).

Less obviously:

  • User-Agent (linkable).
  • Geographic info (city, country).
  • Free-text user-provided fields.

What's "safe enough"

  • Internal logs in your own DB (audit log): PII is OK with a retention policy.
  • Logs in external SaaS (Datadog, Sentry): redact.
  • Logs in a shared file system (rotated to S3): minimize.

Redaction strategies

Strategy A: Don't log it

// Don't
logger.info({ email: user.email, action: "login" });

// Do
logger.info({ user_id: user.id, action: "login" });

Use IDs as references. The audit log (separate, retention-managed) holds the email, so it can still be looked up by ID when needed.

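Strategy B: Mask

function maskEmail(email: string) {
  // alice@example.com → "a***@example.com"
  const at = email.lastIndexOf("@");
  if (at < 1) return "***"; // no "@" or empty local part: reveal nothing
  return `${email[0]}***@${email.slice(at + 1)}`;
}
logger.info({ user_email_masked: maskEmail(user.email) });

Masking preserves the shape of the value, which helps debugging without exposing the full address. The first character and the domain stay visible, so treat a masked value as reduced exposure, not as anonymous data.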
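If you need to correlate events for the same user without showing any part of the address, a keyed hash is an alternative to masking. The following is a minimal sketch, not a definitive implementation: `pseudonymizeEmail` and the `LOG_HASH_KEY` environment variable are hypothetical names, and the 16-character truncation is an arbitrary choice.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical key source; in practice load it from a secret manager.
const LOG_HASH_KEY = process.env.LOG_HASH_KEY ?? "dev-only-key";

// Returns a stable, non-reversible pseudonym for an email address.
// Normalizing first means "Alice@Example.com" and "alice@example.com" match.
function pseudonymizeEmail(email: string, key: string = LOG_HASH_KEY): string {
  const normalized = email.trim().toLowerCase();
  // Truncated to 16 hex chars to keep log lines short; collisions are
  // unlikely at that length but not impossible.
  return createHmac("sha256", key).update(normalized).digest("hex").slice(0, 16);
}
```

The output is still a linkable identifier, so it counts as pseudonymized rather than anonymous data. A keyed HMAC is used instead of a plain hash so that someone without the key cannot confirm a guessed address by hashing it.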

Strategy C: Tokenize

Replace PII with a token; keep the mapping in a secure separate store:

const token = await tokenize(user.email);
// Stores `email_token:X → real email` in encrypted vault
logger.info({ user_email_token: token });

Reversible only by someone with vault access. Rarely worth the complexity for most apps.
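To make the moving parts concrete, here is a minimal sketch of what `tokenize` could look like. It assumes an in-memory `Map` as a stand-in for the encrypted vault; `vault`, `reverseIndex`, and `detokenize` are hypothetical names, and a real deployment would use a separate, access-controlled store.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory stand-in for the encrypted vault.
const vault = new Map<string, string>();        // token -> real value
// Lets repeated values reuse one token. A real vault would more likely
// index by a keyed hash than by the plaintext value.
const reverseIndex = new Map<string, string>(); // real value -> token

// Returns the existing token for a value, or mints a new random one.
async function tokenize(value: string): Promise<string> {
  const existing = reverseIndex.get(value);
  if (existing !== undefined) return existing;
  const token = `email_token:${randomUUID()}`;
  vault.set(token, value);
  reverseIndex.set(value, token);
  return token;
}

// Reverses a token; only callers with vault access should reach this.
async function detokenize(token: string): Promise<string | undefined> {
  return vault.get(token);
}
```

Because the token is random rather than derived from the email, it reveals nothing without the vault, and deleting the vault entry makes every log line that carries the token unlinkable.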

Implementation

Pino (Node.js)

import pino from "pino";

const logger = pino({
  redact: {
    paths: [
      "*.email",
      "*.phone",
      "request.headers.authorization",
      "request.headers.cookie",
      "response.body.access_token",
      "response.body.refresh_token",
    ],
    censor: "***REDACTED***",
  },
});

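Pino replaces the value at each of these paths with the censor string in every log line. A wildcard such as *.email matches one level deep (for example user.email), so list a top-level email key or deeper nesting explicitly if your payloads have them.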

Bunyan / Winston

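Neither has a path-based redact option like Pino's. With Bunyan, use serializers; with Winston, use a custom format. Either way, document the list of fields to strip.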
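For loggers without built-in path-based redaction, one option is to censor fields before the object reaches the logger. The sketch below is logger-agnostic and handles plain objects and arrays only; `redactDeep`, `SENSITIVE_KEYS`, and `CENSOR` are hypothetical names, and the key list is only an example.

```typescript
// Hypothetical list of sensitive keys; extend it to match your own payloads.
const SENSITIVE_KEYS = new Set([
  "email", "phone", "authorization", "cookie", "access_token", "refresh_token",
]);
const CENSOR = "***REDACTED***";

// Returns a copy of `value` with sensitive keys censored at any depth.
// The input is not mutated. Keys are compared case-insensitively.
function redactDeep(value: unknown, ancestors = new WeakSet<object>()): unknown {
  if (value === null || typeof value !== "object") return value;
  if (ancestors.has(value)) return "[Circular]"; // value is its own ancestor
  ancestors.add(value);
  let out: unknown;
  if (Array.isArray(value)) {
    out = value.map((item) => redactDeep(item, ancestors));
  } else {
    const copy: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      copy[key] = SENSITIVE_KEYS.has(key.toLowerCase())
        ? CENSOR
        : redactDeep(child, ancestors);
    }
    out = copy;
  }
  ancestors.delete(value); // only objects on the current path count as cycles
  return out;
}
```

A call such as `logger.info(redactDeep(payload))` would then work with any logger. Unlike a fixed path list, this matches a key wherever it appears, at the cost of walking every object that is logged.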

Caddy access logs

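log {
  format filter {
    wrap json
    fields {
      request>headers>Authorization delete
      request>headers>Cookie delete
      resp_headers>Set-Cookie delete
    }
  }
}

Removes those headers from access logs. Recent Caddy versions already redact credential headers in access logs by default; verify the filter syntax and the default behavior against the version you run.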

Sentry / Bugsnag / error-trackers

Configure scrubbing:

Sentry.init({
  beforeSend(event) {
    if (event.request?.cookies) delete event.request.cookies;
    if (event.user?.email) event.user.email = maskEmail(event.user.email);
    return event;
  },
});

Datadog / Loki / Splunk

Most log aggregators offer server-side field-redaction config. Configure it as soon as you onboard, but don't rely on it alone: by the time it runs, the data has already left your systems (defense in depth).

Stack traces

Stack traces can include variable values, which can include PII:

Error: failed to send
  at sendEmail (...)
  email = "alice@example.com"
  attempts = 3

Some runtimes and error trackers can capture local variables, and error messages often interpolate values directly. Verify what yours emits in production.

NODE_ENV=production node app.js  # then inspect a logged error for variable values
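Since error messages and stacks can carry interpolated values, one option is to scrub them before they are logged. This is a minimal sketch that only targets email addresses; `scrubText` and `toLoggableError` are hypothetical helpers, and the pattern is deliberately loose.

```typescript
// Deliberately loose email pattern: over-matching is preferable to leaking.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replaces anything that looks like an email address with a placeholder.
function scrubText(text: string): string {
  return text.replace(EMAIL_PATTERN, "<email>");
}

interface LoggableError {
  name: string;
  message: string;
  stack: string | undefined;
}

// Builds a log-safe view of an error: name, scrubbed message, scrubbed stack.
function toLoggableError(err: unknown): LoggableError {
  if (!(err instanceof Error)) {
    return { name: "NonError", message: scrubText(String(err)), stack: undefined };
  }
  return {
    name: err.name,
    message: scrubText(err.message),
    stack: err.stack ? scrubText(err.stack) : undefined,
  };
}
```

Usage would look like `logger.error({ err: toLoggableError(e) })`. Pattern scrubbing is a backstop, not a guarantee: it misses names, phone numbers, and anything else the pattern does not describe.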

Database queries

Don't log queries that include PII:

// Bad
db.query("SELECT * FROM users WHERE email = $1", [email]).then(...)
// drizzle's logger:
logger.debug(`Query: SELECT * FROM users WHERE email = 'alice@example.com'`);

Configure your ORM to log the parameterized form of the query, not the interpolated one, and to redact or omit the bound values:

SELECT * FROM users WHERE email = $1
[REDACTED]
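If your driver or ORM exposes a hook that receives the SQL text and the bound values, the log entry can be built so the values never reach the logger. A minimal sketch, assuming such a hook exists; `describeQuery` and `QueryLogEntry` are hypothetical names.

```typescript
// Shape of a log-safe query record. Parameter values are never included.
interface QueryLogEntry {
  sql: string;
  paramCount: number;
  paramTypes: string[];
}

// Builds a log entry from the parameterized SQL text and its bound values.
// Only the count and JS types of the values are kept, not the values.
function describeQuery(sql: string, params: readonly unknown[]): QueryLogEntry {
  return {
    sql: sql.replace(/\s+/g, " ").trim(), // collapse whitespace to one line
    paramCount: params.length,
    paramTypes: params.map((p) => (p === null ? "null" : typeof p)),
  };
}
```

This only helps when the SQL text is genuinely parameterized. If values are concatenated into the string before it reaches the hook, they are logged regardless.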

URLs in logs

GET /users/01HZQ2X.../sessions

Identity UUIDs in paths are PII (linkable). Either:

  • Don't log full paths.
  • Mask the UUID portion.

function redactPath(path: string) {
  return path.replace(/\/users\/[^/]+/, "/users/<id>");
}

What you DO want to log

Despite redaction, logs should be useful:

Useful                                    | Sensitive
Event types (login_failed, registration)  | The user's email
Outcomes (success, validation_error)      | The actual password attempt
Response codes                            | Cookie values
Latency                                   | Token contents
Path patterns (/users/<id>)               | Full URLs with PII

Audit logs are different

The audit log (security_audit table) is supposed to contain PII; that's the point. It's separate from app logs:

  • Stored encrypted at rest.
  • Retention-limited (90 days for routine events).
  • Access-controlled.
  • Not forwarded to external systems by default.

Test for leaks

Periodic grep over a sample of recent logs:

grep -E "(@|password|token|secret)" /var/log/app.log | head -100

If you see PII / secrets, fix the source.

Automated: run a secret scanner (gitleaks, trufflehog) over log streams to catch leaked secrets.
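The manual grep can also be turned into a small check that runs on a schedule or in CI against a log sample. The following is a minimal sketch under stated assumptions: `scanLogText` and the `DETECTORS` list are hypothetical, and the patterns are examples, not a complete PII catalogue.

```typescript
// Hypothetical detectors; tune the patterns to the identifiers you use.
const DETECTORS: { name: string; pattern: RegExp }[] = [
  { name: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/ },
  { name: "bearer_token", pattern: /Bearer\s+[A-Za-z0-9._~+\/-]+=*/i },
  { name: "password_field", pattern: /"password"\s*:\s*"[^"]+"/i },
];

interface Finding {
  line: number;     // 1-based line number in the scanned text
  detector: string; // name of the detector that fired
}

// Scans log text line by line and reports which detector fired where.
// Findings carry only line numbers, so the report does not repeat the PII.
function scanLogText(text: string): Finding[] {
  const findings: Finding[] = [];
  text.split("\n").forEach((content, index) => {
    for (const { name, pattern } of DETECTORS) {
      if (pattern.test(content)) findings.push({ line: index + 1, detector: name });
    }
  });
  return findings;
}
```

In a scheduled job you might read a recent log sample with `readFileSync`, pass it to `scanLogText`, and fail the job when the result is non-empty.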

GDPR-specific

If a user submits a DSR (data subject request), such as an erasure request:

  • Their audit log entries get cleared per the request.
  • App logs SHOULD already be PII-free.

If app logs DO contain their email, you'll need to expunge it from those too. That is operationally painful; avoid it by keeping PII out of app logs in the first place.
