Microsoft Presidio Recall is an open-source tool for identifying and redacting sensitive information in unstructured text and stored data. It builds on the Microsoft Presidio suite but focuses on recall: the share of sensitive items actually present in the data that the tool detects. In regulated environments, a false negative (sensitive data that slips through undetected) can be more dangerous than a false positive (harmless text flagged by mistake). This makes Presidio Recall critical for data protection workflows.
Presidio Recall uses deterministic and statistical methods to search large datasets for personally identifiable information (PII) such as names, phone numbers, email addresses, and IP addresses. You can integrate it directly into pipelines that process logs, customer communications, or documents. Its architecture supports modular recognizers, customizable patterns, and language-specific tuning.
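To make the idea of modular recognizers concrete, here is a minimal sketch in plain Python. It does not use the tool's actual API: the `Finding` and `PatternRecognizer` classes, the `redact` function, the regular expressions, and the scores are all hypothetical, chosen only to show how independent recognizers can each contribute findings that a pipeline step then redacts.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One detected span: what it is, where it is, how confident we are."""
    entity_type: str
    start: int
    end: int
    score: float


class PatternRecognizer:
    """Hypothetical recognizer: one entity type, one regex, one fixed score."""

    def __init__(self, entity_type: str, pattern: str, score: float):
        self.entity_type = entity_type
        self.regex = re.compile(pattern)
        self.score = score

    def analyze(self, text: str) -> list[Finding]:
        return [
            Finding(self.entity_type, m.start(), m.end(), self.score)
            for m in self.regex.finditer(text)
        ]


# Illustrative patterns and scores only (assumptions, not production rules).
RECOGNIZERS = [
    PatternRecognizer("EMAIL_ADDRESS", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", 0.9),
    PatternRecognizer("IP_ADDRESS", r"\b(?:\d{1,3}\.){3}\d{1,3}\b", 0.6),
]


def redact(text: str, recognizers=RECOGNIZERS) -> str:
    """Replace every finding with a <ENTITY_TYPE> placeholder.

    Overlapping findings are not resolved in this sketch.
    """
    findings = [f for r in recognizers for f in r.analyze(text)]
    # Replace from right to left so earlier offsets stay valid.
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + f"<{f.entity_type}>" + text[f.end:]
    return text


print(redact("Contact alice@example.com from 10.0.0.1"))
```

Adding coverage for a new kind of PII in this design means appending another recognizer to the list, which is the property that makes a modular layout convenient to tune per language or per data source.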
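High recall comes at a cost: more potential false positives. Presidio Recall lets you manage that trade-off through confidence scoring and recognizer configuration. In general, accepting lower-confidence matches catches more real PII but also flags more harmless text, while a stricter cutoff does the opposite. Engineers can tune detection to favor recall while keeping precision at a level that reviewers and downstream systems can absorb, so that compliance does not stall operations.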
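The trade-off can be illustrated with a small, self-contained calculation. This is a sketch under stated assumptions: the candidate strings, their confidence scores, and the ground-truth labels are made up, and `precision_recall` is a hypothetical helper rather than part of any library.

```python
def precision_recall(scored, truth, threshold):
    """Return (precision, recall) for candidates scoring at or above threshold.

    scored: dict mapping a candidate string to its confidence score.
    truth:  set of candidates that really are sensitive.
    """
    predicted = {span for span, score in scored.items() if score >= threshold}
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall


# Made-up detector output and labels, for illustration only.
scored = {
    "alice@example.com": 0.95,
    "10.0.0.1": 0.60,
    "555-0100": 0.40,
    "v1.2.3.4": 0.35,   # a version string, not PII
    "12-34": 0.20,      # an order number, not PII
}
truth = {"alice@example.com", "10.0.0.1", "555-0100"}

for threshold in (0.5, 0.3, 0.1):
    p, r = precision_recall(scored, truth, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

With this toy data, dropping the threshold from 0.5 to 0.3 recovers the missed phone number at the price of one false positive. Dropping it further to 0.1 adds another false positive without finding anything new, which is the point where extra recall stops paying for itself.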