Git reset saved the repo. PII detection saved the company.
Cleaning sensitive information from Git history is not just a code hygiene task. It’s survival. Personally Identifiable Information (emails, phone numbers, customer IDs), along with secrets such as API keys, is dangerous the moment it gets committed. Once it enters your Git history, it spreads through every clone, every fork, every pull. A quick revert won’t erase it, because git revert only adds a new commit on top and the original stays reachable. You need to rewrite history.
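That’s where git reset meets automated PII detection. git reset --hard moves your branch back to an earlier commit and discards local changes, which is enough if the bad commit was never pushed. But if the bad commit is pushed and baked into history, you need tools like git filter-repo or BFG Repo-Cleaner paired with a PII scanner. The scanner finds what humans miss. The cleaner strips it from every commit it rewrites. After that, force-push the rewritten history, have collaborators re-clone, and rotate any exposed credentials, because copies made before the cleanup still hold the old data.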
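As a rough illustration of how scanner output can feed the cleaner, here is a minimal Python sketch. It assumes the scanner hands you a plain list of leaked strings, and it turns them into an expressions file for git filter-repo's --replace-text option (one literal per line, with `==>` introducing the replacement). The function names, the file name replacements.txt, and the sample values are hypothetical. The sketch only builds the command and does not run it.

```python
from pathlib import Path

# Mask written in place of each leaked value (same text filter-repo uses by default).
DEFAULT_MASK = "***REMOVED***"


def build_replacement_lines(leaked_values, mask=DEFAULT_MASK):
    """Turn leaked strings into lines for a --replace-text expressions file.

    Each line has the form `<literal>==><mask>`. Longer values come first so a
    short value that is a substring of a longer one does not mask it partially.
    """
    lines = []
    unique = sorted(set(leaked_values), key=lambda v: (-len(v), v))
    for value in unique:
        # Assumption of this sketch: only single-line literals without the
        # `==>` separator are supported; anything else needs manual handling.
        if not value or "\n" in value or "==>" in value:
            raise ValueError("value cannot be written as a single literal line")
        lines.append(f"{value}==>{mask}")
    return lines


def write_expressions_file(leaked_values, path="replacements.txt"):
    """Write the expressions file and return the command to run afterwards."""
    lines = build_replacement_lines(leaked_values)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    # Returned rather than executed: history rewrites are destructive and are
    # best run by hand in a fresh clone.
    return ["git", "filter-repo", "--replace-text", str(path)]
```

Keep the expressions file out of the repository, since it contains the leaked values themselves. Delete it once the rewrite is done.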
Relying on manual reviews for sensitive data is not enough. Regex checks only catch patterns you expect. Machine-driven detection can spot subtle leaks: customer reference numbers in logs, email strings in config files, even tokens embedded deep in JSON. Integrating detection into CI/CD as a required check blocks any merge that fails a full PII scan.
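To make the CI gate concrete, here is a minimal Python sketch of a scan step that returns a non-zero status when it finds something. It is deliberately the regex-only baseline described above, so it catches only the patterns it is given, and a real pipeline would swap in a dedicated PII scanner. The pattern set, the function names, and the US-style phone format are assumptions for illustration.

```python
import re

# Hypothetical baseline patterns; a real scanner covers far more cases.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def scan_text(text):
    """Return (line_number, kind, matched_text) for every pattern hit."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_no, kind, match.group(0)))
    return findings


def main(paths):
    """Scan each file; return 1 if anything was found, else 0."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line_no, kind, _value in scan_text(handle.read()):
                # Report the location and kind only, so the CI log does not
                # become a second copy of the leak.
                print(f"{path}:{line_no}: possible {kind}")
                failed = True
    return 1 if failed else 0
```

A CI job could pass the changed files to main and exit with its return value, for example via sys.exit(main(sys.argv[1:])). Marking that job as a required check is what actually blocks the merge.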