Awesome-PII is a collection of tools for detecting, extracting, and removing PII (personally identifiable information) from data.
The regexes.yaml schema is not stable and may change in future versions.
Regexes are tested with Ruby 2.5.9 on Rubular.
Regexes assume lowercase input, so downcase text before matching.
Regexes are incomplete. Pull requests welcome.
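
A minimal sketch of the lowercase convention noted above, in Ruby since that is where the regexes are tested. The `SNAPCHAT_HINT` pattern and the `find_snapchat_handle` helper are hypothetical names made up for illustration; they are not taken from regexes.yaml, whose schema may differ.

```ruby
# Hypothetical pattern for illustration only; it is NOT taken from regexes.yaml.
# It is written in lowercase on purpose, so it only matches downcased input.
SNAPCHAT_HINT = /\bsnapchat[\s:\-]+([a-z][a-z0-9._-]{2,14})/

# Downcase the input first, then match against the lowercase-only pattern.
# Returns the captured handle, or nil when nothing matches.
def find_snapchat_handle(text)
  match = SNAPCHAT_HINT.match(text.downcase)
  match && match[1]
end

puts find_snapchat_handle("Add me on SnapChat: Jane_Doe99")  # prints: jane_doe99
```

Without the `downcase` call, the mixed-case input above would not match at all, which is the main pitfall this convention creates.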
- Phone
  - US
- Socials
  - TikTok
  - Snapchat
  - Telegram
- ID card
  - Social Security Number
- Internet
  - IP Address
  - Domain Name
- Personal
  - Date of Birth (US)
  - Race
  - Religion
- Regexes
  - Implement regex groups to extract parts of a match
  - (ongoing) Add support for obfuscation (e.g. "s.c." means Snapchat)
  - Regexes with PII
  - Remove PII from text (with ChatGPT)
- Images
  - Image PII (with OCR)
  - Image PII removal (with OCR and Stable Diffusion)
- [ ] Multi-language support
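
Two of the regex items above, capture groups and obfuscation support, could be sketched roughly as follows. This is only one possible approach, not the project's implementation: `PLATFORM_ALIASES`, `HANDLE_REGEX`, and `extract_handle` are hypothetical names, and only the "s.c." alias comes from the list above (the "snap" alias is an assumption).

```ruby
# Hypothetical alias table: obfuscated spellings mapped to a canonical name.
# Only "s.c." comes from the roadmap; "snap" is an illustrative assumption.
PLATFORM_ALIASES = {
  "s.c." => "snapchat",
  "snap" => "snapchat",
  "snapchat" => "snapchat"
}.freeze

# Build the alternation with longer aliases first so "snapchat" is tried before "snap".
ALIAS_PATTERN = PLATFORM_ALIASES.keys
                                .sort_by { |key| -key.length }
                                .map { |key| Regexp.escape(key) }
                                .join("|")

# Named groups (?<platform>...) and (?<handle>...) expose the parts of a match.
# At least one separator (whitespace, ":" or "-") is required before the handle.
HANDLE_REGEX = /(?<![a-z0-9])(?<platform>#{ALIAS_PATTERN})[\s:\-]+(?<handle>[a-z][a-z0-9._-]{2,14})/

# Downcase, match, and normalize the platform alias to its canonical name.
def extract_handle(text)
  m = HANDLE_REGEX.match(text.downcase)
  return nil unless m

  { platform: PLATFORM_ALIASES[m[:platform]], handle: m[:handle] }
end

result = extract_handle("hmu on S.C. jane_doe99")
puts "#{result[:platform]} -> #{result[:handle]}"  # prints: snapchat -> jane_doe99
```

Keeping aliases in a lookup table rather than inside the pattern means new obfuscated spellings can be added without touching the regex itself.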
- Regular Expressions to Match Social Media Profiles
- Spark Start
- Author
- Everyone who stars this repo