Masking email addresses in logs isn’t about paranoia. It’s about control. Self-hosted instances carry their own risks: you own the stack, so you own the leaks. Once plain-text emails hit your logs, they’re hard to pull back. Compliance teams start asking questions, and privacy rules such as GDPR treat an email address as personal data wherever it is stored. Rotating logs doesn’t undo the damage, because copies may already sit in backups, caches, or analytics pipelines.
The first step is to treat log data as a potential liability, not just a debug tool. Logging every request means logging every piece of user data in it unless you explicitly shield it. Emails turn up often because they serve as unique identifiers in authentication flows, contact forms, and API calls. Masking means transforming them before they touch disk: replacing the local part (everything before the @) with a placeholder while keeping enough structure, usually the domain, for troubleshooting.
A robust pattern is to intercept at the logger level. Whether you use Winston, Bunyan, pino, or a language-native logger, inject a sanitizer function into the pipeline. Use a regex to match the local-part@domain structure, then mask the local part. For example, user@example.com could become u***@example.com or [redacted]@example.com. Avoid writing your own fragile regex if your framework already has a safe helper. Test against edge cases like plus signs, subdomains, and uncommon TLDs.
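As a minimal sketch of logger-level interception, here is how it might look with Python’s standard `logging` module (one of the language-native loggers mentioned above). The names `mask_emails`, `EmailMaskingFilter`, and `build_logger` are made up for illustration, and the regex is a deliberately simple assumption, not a full RFC 5322 matcher. Tune it against your own edge cases.

```python
import io
import logging
import re

# Simplified pattern (an assumption, not RFC-complete): a local part,
# "@", then one or more dotted labels ending in an alphabetic TLD.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@((?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,})")


def mask_emails(text: str) -> str:
    """Replace the local part of each email-like match, keeping the domain."""
    return EMAIL_RE.sub(r"[redacted]@\1", text)


class EmailMaskingFilter(logging.Filter):
    """Masks emails in the final message text before a handler writes it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges record.args into record.msg, so addresses
        # passed as %s arguments are masked as well.
        record.msg = mask_emails(record.getMessage())
        record.args = ()
        return True  # keep the record; only its text changes


def build_logger(stream: io.StringIO) -> logging.Logger:
    """Hypothetical helper: a logger whose handler masks emails on write."""
    handler = logging.StreamHandler(stream)
    # Attaching the filter to the handler covers every record that
    # reaches it, including records from child loggers.
    handler.addFilter(EmailMaskingFilter())
    logger = logging.getLogger("masked-demo")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    out = io.StringIO()
    log = build_logger(out)
    log.info("login failed for %s", "jane.doe+test@mail.example.co.uk")
    print(out.getvalue(), end="")
```

The filter sits on the handler, not on a single logger, so one forgotten call site does not bypass it. The same idea carries over to Winston, Bunyan, or pino, though the hook names differ per library.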
The second pillar is sanitizing at log ingestion in your observability stack. Self-hosted Elasticsearch and OpenSearch provide ingest pipelines, and Loki deployments can do the same in the collection agent’s pipeline stages (Promtail, for example), so you can strip or replace sensitive fields before they are stored. This adds defense in depth: if one service forgets to mask, the ingestion layer catches it before storage.
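For Elasticsearch or OpenSearch, one way to sketch the ingestion-side safety net is a `gsub` ingest processor that rewrites the log field before indexing. The snippet below only builds the pipeline body as JSON. The pipeline id `mask-emails`, the default field name `message`, and the helper `build_mask_pipeline` are assumptions for illustration. Check the pattern and the `$1` backreference against the version you run.

```python
import json

# The regex is evaluated server-side, so it is written to stay within
# syntax that Java-style regex engines accept. Simplified on purpose.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@((?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,})"


def build_mask_pipeline(field: str = "message") -> dict:
    """Return an ingest pipeline body that masks email local parts in `field`."""
    return {
        "description": "Mask email local parts before indexing",
        "processors": [
            {
                "gsub": {
                    "field": field,
                    "pattern": EMAIL_PATTERN,
                    # $1 refers to the captured domain group.
                    "replacement": "[redacted]@$1",
                    # Skip documents that lack the field instead of failing.
                    "ignore_missing": True,
                }
            }
        ],
    }


if __name__ == "__main__":
    # Body for a request such as: PUT _ingest/pipeline/mask-emails
    # (the pipeline id is hypothetical).
    print(json.dumps(build_mask_pipeline(), indent=2))
```

Keeping the pattern identical to the one used in application code makes the two layers easier to test together, since a string that slips past one should be caught by the other.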