maiwending/email-extractor

 
 


Extract emails from rtf, txt, text, doc, docx and pdf files

  • Install Python 3 and pip3
  • pip3 install -r requirements.txt
  • python extract_emails.py --help

Note: If your file has a doc extension, you must also have:

  • On Windows: pypiwin32
  • On Linux or Mac: LibreOffice

pypiwin32 is a Windows-only Python module, so ignore its install error on a Linux-based OS.
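The platform-specific requirement for doc files can be checked before a run. The following is a minimal sketch, not part of this repository: the helper name `doc_support_hint` is hypothetical, and it assumes that LibreOffice, when installed, exposes a `soffice` or `libreoffice` launcher on the PATH (which may not hold for every install, notably on Mac).

```python
import shutil
import sys


def doc_support_hint(platform: str = sys.platform) -> str:
    """Return a short hint about what .doc parsing needs on this platform.

    Hypothetical helper for illustration; extract_emails.py may check
    its dependencies differently, or not at all.
    """
    if platform.startswith("win"):
        return "Windows: install pypiwin32 (pip3 install pypiwin32)"
    # Assumption: LibreOffice provides a 'soffice' or 'libreoffice' launcher on PATH.
    if shutil.which("soffice") or shutil.which("libreoffice"):
        return "LibreOffice found on PATH"
    return "LibreOffice not found on PATH; install it to parse .doc files"


if __name__ == "__main__":
    print(doc_support_hint())
```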

Options

  • --dir: absolute path of the directory/folder to scan; defaults to the current folder
  • --file: scan only one file
  • --ext: restrict scanning to the given file extensions; defaults to all supported extensions
  • --dst: output file name; by default results are printed to the console

NOTE: Use a different output file for each run; otherwise the existing results will be overwritten.
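For readers who want to see how options like these are typically declared, here is a minimal `argparse` sketch. It is an illustration only and not the actual source of extract_emails.py; the function name `build_parser`, the help strings, and the choice of `None` as the "not set" default are assumptions.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the four documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Extract emails from documents")
    # Default is the current folder, as the option list describes.
    parser.add_argument("--dir", default=os.getcwd(),
                        help="absolute path of the directory to scan")
    parser.add_argument("--file", default=None,
                        help="scan only this file")
    # nargs="+" accepts one or more extensions, e.g. --ext pdf doc
    parser.add_argument("--ext", nargs="+", default=None,
                        help="restrict scanning to these extensions")
    # None here stands for "print to the console" (an assumption of this sketch).
    parser.add_argument("--dst", default=None,
                        help="output file name")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Declaring `--ext` with `nargs="+"` is what lets several extensions follow a single flag on the command line.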

Usage

Extract emails from a specific file xyz.pdf

python extract_emails.py --file=xyz.pdf --dst=emails.txt

Extract emails from all files in a folder/directory XYZ

python extract_emails.py --dir=XYZ --dst=emails.txt

While scanning a folder/directory you can also restrict the file extensions. For example, to scan only pdf files:

python extract_emails.py --dir=XYZ --dst=emails.txt --ext pdf

Scan a directory but parse only doc and pdf files

python extract_emails.py --dir=XYZ --dst=emails.txt --ext pdf doc
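The idea behind a directory scan with an extension filter can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the script's actual implementation: the names `EMAIL_RE`, `find_emails` and `iter_files` are hypothetical, the regular expression is deliberately simple and will not match every valid address, and only already-extracted text is handled (pdf, doc and docx files would first need a text-extraction step).

```python
import os
import re

# Deliberately simple pattern (assumption); real address syntax is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return unique addresses in order of first appearance (case-insensitive)."""
    seen = {}
    for match in EMAIL_RE.findall(text):
        seen.setdefault(match.lower(), match)
    return list(seen.values())


def iter_files(root, exts=None):
    """Yield file paths under root, optionally limited to the given extensions."""
    wanted = {e.lower().lstrip(".") for e in exts} if exts else None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1].lower().lstrip(".")
            if wanted is None or ext in wanted:
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Scan plain-text files in the current folder and print what is found.
    for path in iter_files(".", ["txt"]):
        with open(path, encoding="utf-8", errors="ignore") as handle:
            for email in find_emails(handle.read()):
                print(email)
```

Passing `["pdf", "doc"]` as `exts` corresponds to `--ext pdf doc` in the commands above; passing nothing corresponds to scanning every file.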
