PII redaction in logs

Logs are often forwarded to external systems (SIEM, log aggregator, ELK), so PII in logs leaks to those systems. Some PII is harmless; some is regulated. Mismanaged logs are a GDPR risk.

What is PII

For GDPR / CCPA purposes:

Email addresses.
Names.
Phone numbers.
IP addresses (yes, in the EU).
Identity UUID (linkable identifier).

Less obviously:

User-Agent (linkable).
Geographic info (city, country).
Free-text user-provided fields.

What's "safe enough"

Internal logs in your own DB (audit log): PII OK with retention policy.
Logs in external SaaS (Datadog, Sentry): redact.
Logs in shared file system (rotated to S3): minimize.

Redaction strategies

Strategy A: Don't log it

// Don't
logger.info({ email: user.email, action: "login" });

// Do
logger.info({ user_id: user.id, action: "login" });

Use IDs as references. The separate, retention-managed audit log holds the email.

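Strategy B: Mask

function maskEmail(email: string) {
  // alice@example.com → "a***@example.com"
  const at = email.lastIndexOf("@");
  // Not a recognizable address: reveal nothing rather than guess.
  if (at < 1) return "***";
  return `${email[0]}***@${email.slice(at + 1)}`;
}
logger.info({ user_email_masked: maskEmail(user.email) });

Masking preserves the shape of the value (first character and domain), which helps debugging without exposing the full address. The domain stays visible, so this is partial redaction, not anonymization.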
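
When log lines need a stable join key rather than a readable pattern, a keyed hash is one alternative. A minimal sketch, assuming a secret key kept outside the logs; `pseudonymize` and its `key` parameter are hypothetical names, not part of any logging library:

```typescript
import { createHmac } from "node:crypto";

// Hypothetical helper. `key` is a secret stored outside the logs (for example
// in an environment variable). Without a key, hashes of guessable values such
// as emails can be reversed by hashing candidate addresses.
function pseudonymize(value: string, key: string): string {
  return createHmac("sha256", key)
    .update(value.trim().toLowerCase()) // normalize so the same address always maps to the same hash
    .digest("hex")
    .slice(0, 16); // 16 hex chars (64 bits): enough to correlate log lines
}
```

The same input and key always give the same output, so log lines for one user can be correlated. The result is still a linkable identifier, so treat it as pseudonymized rather than anonymous.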

Strategy C: Tokenize

Replace PII with a token; keep the mapping in a secure separate store:

const token = await tokenize(user.email);
// Stores `email_token:X → real email` in encrypted vault
logger.info({ user_email_token: token });

Reversible only by someone with vault access. Rarely worth the complexity for most apps.
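
To make the moving parts concrete, here is a minimal sketch of the token mapping. `InMemoryTokenVault` is a hypothetical stand-in: a real vault would be an encrypted, access-controlled store, and that is where the complexity lives.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory stand-in for the vault, for illustration only.
// The methods are synchronous because the maps live in memory; a networked
// vault would return Promises, and `await vault.tokenize(...)` works with both.
class InMemoryTokenVault {
  private readonly valueByToken = new Map<string, string>();
  private readonly tokenByValue = new Map<string, string>();

  tokenize(value: string): string {
    const existing = this.tokenByValue.get(value);
    if (existing !== undefined) return existing; // same value, same token: logs stay correlatable
    const token = `email_token:${randomUUID()}`; // random, so it reveals nothing about the value
    this.valueByToken.set(token, value);
    this.tokenByValue.set(value, token);
    return token;
  }

  // Only callers with vault access can map a token back to the real value.
  detokenize(token: string): string | undefined {
    return this.valueByToken.get(token);
  }
}
```

Unlike a hash, the token is random, so nothing can be recovered from the log line itself. Deleting the vault entry also makes every logged token for that value unresolvable.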

Implementation

Pino (Node.js)

import pino from "pino";

const logger = pino({
  redact: {
    paths: [
      "*.email",
      "*.phone",
      "request.headers.authorization",
      "request.headers.cookie",
      "response.body.access_token",
      "response.body.refresh_token",
    ],
    censor: "***REDACTED***",
  },
});

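Pino replaces the value at each of these paths with the censor string in every log line; set remove: true in the redact options to drop the keys entirely. The *.email wildcard matches one level down (user.email), not a top-level email key or deeper nesting, so list each depth you actually log.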

Bunyan / Winston

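Bunyan and Winston have no direct equivalent of Pino's redact option: Bunyan transforms fields through serializers, and Winston through custom formats. Either way, keep a documented list of fields to strip.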
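
For loggers without path-based redaction, one option is to scrub the object before it reaches the logger. A minimal sketch, assuming plain JSON-like data with no cycles; `redactFields`, `SENSITIVE_KEYS`, and `CENSOR` are hypothetical names, not part of Bunyan or Winston:

```typescript
// Hypothetical helper; the key list is an example, not an exhaustive policy.
const SENSITIVE_KEYS = new Set([
  "email", "phone", "authorization", "cookie", "access_token", "refresh_token",
]);
const CENSOR = "***REDACTED***";

// Returns a deep copy of `value` with every sensitive key censored, at any depth.
// Handles plain objects, arrays, and primitives; class instances such as Date or
// Error are not preserved, and cyclic structures are not supported.
function redactFields(value: unknown, keys: Set<string> = SENSITIVE_KEYS): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactFields(item, keys));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      // Key match is case-insensitive so "Cookie" and "cookie" are both caught.
      out[k] = keys.has(k.toLowerCase()) ? CENSOR : redactFields(v, keys);
    }
    return out;
  }
  return value;
}
```

Call it at the logging boundary, for example inside a Bunyan serializer or a Winston custom format, so individual call sites don't have to remember it.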

Caddy access logs

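log {
  format filter {
    wrap json
    fields {
      request>headers>Authorization delete
      request>headers>Cookie delete
      resp_headers>Set-Cookie delete
    }
  }
}

The filter encoder deletes those headers from access logs. Recent Caddy versions also redact these credential headers by default unless the log_credentials server option is enabled; check the behavior of the version you run.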

Sentry / Bugsnag / error-trackers

Configure scrubbing:

Sentry.init({
  beforeSend(event) {
    if (event.request?.cookies) delete event.request.cookies;
    if (event.user?.email) event.user.email = maskEmail(event.user.email);
    return event;
  },
});

Datadog / Loki / Splunk

Most log aggregators offer server-side field redaction. Configure it as soon as you onboard, but don't rely on it alone: by the time it runs, the PII has already left your infrastructure (defense in depth).

Stack traces

Stack traces can include variable values, which can include PII:

Error: failed to send
  at sendEmail (...)
  email = "alice@example.com"
  attempts = 3

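A plain Node.js stack trace holds only the error message and call frames, but error messages often interpolate values, and some error trackers can attach local variables to each frame. Verify that local-variable capture is off in production and that error messages don't embed PII.

NODE_ENV=production node app.js  # does not change what Pino logs on its own; check logger and tracker config explicitly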
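
As a last line of defense, error text can be scrubbed before it is logged. A minimal sketch; `scrubText`, `scrubError`, and the rough email pattern are assumptions for illustration and will miss unusual address formats:

```typescript
// Rough email pattern (an assumption, not a full address grammar).
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function scrubText(text: string): string {
  return text.replace(EMAIL_RE, "<email>");
}

// Builds a loggable copy of an error with email-like strings removed from
// both the message and the stack (the stack usually repeats the message).
function scrubError(err: Error): { name: string; message: string; stack: string | undefined } {
  return {
    name: err.name,
    message: scrubText(err.message),
    stack: err.stack === undefined ? undefined : scrubText(err.stack),
  };
}
```

Pattern scrubbing only catches what the pattern describes. Keeping PII out of error messages at the throw site remains the primary fix.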

Database queries

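Don't log queries with PII interpolated into them:

// The query itself is fine: the value is bound as a parameter.
db.query("SELECT * FROM users WHERE email = $1", [email]).then(...)
// Bad: a query logger (here, drizzle's) writing the bound value into the log line:
logger.debug(`Query: SELECT * FROM users WHERE email = 'alice@example.com'`);

Configure your ORM to log the parameterized form with the parameter values redacted, not the interpolated one: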

SELECT * FROM users WHERE email = $1
[REDACTED]
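
A minimal sketch of a query logger that produces that shape. The `QueryLogger` interface is declared locally and mirrors the `logQuery(query, params)` hook that Drizzle's logger exposes; treat the wiring into your ORM as an assumption to verify. `RedactingQueryLogger` and `QueryLogEntry` are hypothetical names:

```typescript
// Local interface mirroring the shape of an ORM query-logger hook (assumption).
interface QueryLogger {
  logQuery(query: string, params: unknown[]): void;
}

interface QueryLogEntry {
  sql: string;
  param_count: number;
}

class RedactingQueryLogger implements QueryLogger {
  private readonly sink: (entry: QueryLogEntry) => void;

  constructor(sink: (entry: QueryLogEntry) => void) {
    this.sink = sink;
  }

  logQuery(query: string, params: unknown[]): void {
    // Log the parameterized SQL text and how many values were bound,
    // never the values themselves.
    this.sink({ sql: query, param_count: params.length });
  }
}
```

The parameter count keeps the line useful for debugging (a mismatch between placeholders and bound values is visible) without recording any value.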

URLs in logs

GET /users/01HZQ2X.../sessions

Identity UUIDs in paths are PII (linkable). Either:

Don't log full paths.
Mask the UUID portion.

function redactPath(path: string) {
  return path.replace(/\/users\/[^/]+/, "/users/<id>");
}
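
The same idea can be applied to every route, not just /users. A minimal sketch that masks any path segment shaped like a UUID or a ULID; `redactIdSegments` and both patterns are assumptions for illustration, and the input is expected to be a path without a query string:

```typescript
// Canonical UUID: 8-4-4-4-12 hex digits.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
// ULID: 26 characters of Crockford base32 (no I, L, O, U).
const ULID_RE = /^[0-9A-HJKMNP-TV-Z]{26}$/i;

// Replaces each identifier-shaped segment with "<id>" and leaves the rest intact.
function redactIdSegments(path: string): string {
  return path
    .split("/")
    .map((segment) => (UUID_RE.test(segment) || ULID_RE.test(segment) ? "<id>" : segment))
    .join("/");
}
```

Matching on the shape of the segment rather than the route prefix means new routes are covered without extra rules, at the cost of missing identifiers in other formats.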

What you DO want to log

Despite redaction, logs should be useful:

Useful	Sensitive
Event types (login_failed, registration)	The user's email
Outcomes (success, validation_error)	The actual password attempt
Response codes	Cookie values
Latency	Token contents
Path patterns (`/users/<id>`)	Full URLs with PII

Audit logs are different

The audit log (security_audit table) is supposed to contain PII; that's the point. It's separate from app logs:

Stored encrypted at rest.
Retention-limited (90 days for routine events).
Access-controlled.
Not forwarded to external systems by default.

Test for leaks

Periodic grep over a sample of recent logs:

grep -E "(@|password|token|secret)" /var/log/app.log | head -100

If you see PII / secrets, fix the source.

Automated: run a secret scanner (gitleaks, trufflehog) over log files or samples to catch leaked secrets. These tools target credentials, not PII such as emails, so keep the grep check as well.
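
The grep check can also run as a scripted test against a sample of log lines. A minimal sketch, assuming JSON-formatted logs; `findLeaks`, `LeakFinding`, and the two rules are hypothetical and meant as a starting point:

```typescript
interface LeakFinding {
  line: number; // 1-based line number in the sample
  rule: string;
}

const LEAK_RULES: { rule: string; pattern: RegExp }[] = [
  // Anything shaped like an email address.
  { rule: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/ },
  // A JSON key mentioning password/token/secret whose string value is not the censor.
  {
    rule: "secret-key-name",
    pattern: /"[^"]*(?:password|token|secret)[^"]*"\s*:\s*"(?!\*\*\*REDACTED\*\*\*")/i,
  },
];

// Returns one finding per (line, rule) match; an empty array means the sample is clean.
function findLeaks(lines: string[]): LeakFinding[] {
  const findings: LeakFinding[] = [];
  lines.forEach((text, index) => {
    for (const { rule, pattern } of LEAK_RULES) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule });
    }
  });
  return findings;
}
```

Failing a scheduled job when the result is non-empty turns the periodic grep into an alert. Expect false positives (tokenized values, for instance) and tune the rules to your log format.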

If a user submits a DSR (data subject request):

Their audit-log entries are erased per the request.
App logs SHOULD already be PII-free.

If app logs DO contain their email, you'll need to expunge those too. That is operationally painful, so keep PII out of app logs in the first place.
