Core Concepts
Detection, validation, privacy transformation modes, overlap resolution, and pattern extensibility in FastPII.
Core Concepts
Detection
Use detect() to scan text for supported entities.
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
result = guard.detect("Email: jan.novak@example.cz, RČ: 8001011238")detect() returns a DetectionResult with a list[Finding] in result.findings.
Each Finding contains:
typevaluestartendconfidenceregionmetadata
Validation
Use validate() when you already know the detector type and need to confirm whether a single value is structurally valid.
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
result = guard.validate("8001011238", "rodne_cislo")validate() returns a ValidationResult with:
is_validmetadata
For checksum-backed identifiers, validation uses official structural rules such as Mod 11 and weighted checksum calculations.
Anonymization
Use anonymize() to replace every detected value with the same placeholder.
guard.anonymize("Email: jan.novak@example.cz, RČ: 8001011238")Output:
Email: [REDACTED], RČ: [REDACTED]You can also pass a custom replacement string:
guard.anonymize("Email: jan.novak@example.cz", replacement="[PII]")Redaction
Use redact() to replace each finding with a type-specific label.
guard.redact("Email: jan.novak@example.cz, RČ: 8001011238")Output:
Email: [EMAIL], RČ: [RODNE_CISLO]Masking
Use mask() to replace each finding with asterisks while preserving original length.
guard.mask("Email: jan.novak@example.cz, RČ: 8001011238")Output:
Email: ********************, RČ: **********Removal
Use remove() to delete detected values entirely.
guard.remove("Email: jan.novak@example.cz, RČ: 8001011238")Output:
Email: , RČ: Overlap resolution
When multiple findings overlap, FastPII deduplicates them before returning the final DetectionResult.
Resolution order is based on:
- confidence
- type priority
- span length
The built-in priority logic favors checksum-backed identifiers over broader personal or contact entities. In the current implementation, the priority order includes:
- checksum-backed identifiers first:
rodne_cislo,ico,dic,bank_account - then broader entities such as
address,email,date_of_birth,name - then lower-priority spans such as
postal_code,vehicle_plate,phone
This reduces false positives when one detector captures a substring that belongs to a stronger match.
Pattern registry
FastPII uses a pattern registry and detector registry internally. PrivacyGuard registers built-in detectors for region cz, and custom detectors can also be registered programmatically.
This architecture is what makes regional extension possible without changing the PrivacyGuard public API.