
Takes an email inbox in MBOX format, finds image attachments, and sorts them by the difference between the time the image was taken and the time the email was sent.


tgthorley/email_and_image_analysis


Email Analysis Tool

This repo is intended to be a set of tools for basic analysis of email inboxes, for initial triage only.

Email Parser - Images

Takes an email inbox in MBOX format and outputs a list of all emails with image attachments, along with the time difference between when each email was sent and when each attached image was created.

The images are also decoded and extracted from the inbox for further analysis.
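The flow described above can be sketched with the standard-library `mailbox` and `email` modules. This is a minimal illustration, not the repo's actual implementation: the function name `image_time_gaps` and the `read_capture_time` callback are hypothetical. The callback is assumed to take the raw image bytes and return a naive UTC `datetime` (for example, parsed from EXIF metadata by an image library) or `None` when no creation time is available.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def image_time_gaps(mbox_path, read_capture_time):
    """Return (gap, subject, filename, image_bytes) tuples, smallest gap first.

    read_capture_time is a hypothetical callback (bytes -> naive UTC
    datetime or None); how the creation time is read is left to the caller.
    """
    rows = []
    for msg in mailbox.mbox(mbox_path, create=False):
        date_header = msg.get("Date")
        if date_header is None:
            continue
        try:
            sent = parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            continue  # unparseable Date header
        # Normalise the sent time to naive UTC so it can be compared
        # with the (assumed naive UTC) capture time.
        if sent.tzinfo is not None:
            sent = sent.astimezone(timezone.utc).replace(tzinfo=None)

        for part in msg.walk():
            if part.get_content_maintype() != "image":
                continue
            data = part.get_payload(decode=True)  # undoes base64 encoding
            if not data:
                continue
            taken = read_capture_time(data)
            if taken is None:
                continue
            gap = abs(sent - taken)
            rows.append((gap, msg.get("Subject", ""), part.get_filename(), data))

    rows.sort(key=lambda row: row[0])
    return rows
```

Each returned tuple also carries the decoded image bytes, so the caller can write them to disk for later analysis.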

Email Parser - Text

Takes an email inbox in MBOX format, extracts text from several MIME types (docx, pdf, plain), and outputs a list of the emails that contain text, cleaned and preprocessed.
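As a rough sketch of the plain-text case only, the following uses the standard library to collect `text/plain` parts and normalise them. The names `clean_text` and `plain_text_bodies` are hypothetical, and the cleaning steps (lowercasing, dropping non-alphanumeric characters, collapsing whitespace) are assumptions about what "cleaned and preprocessed" might involve. Handling docx and pdf attachments would need third-party parsers and is not shown.

```python
import mailbox
import re


def clean_text(text):
    """Assumed preprocessing: lowercase, keep a-z/0-9, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def plain_text_bodies(mbox_path):
    """Return (subject, cleaned_text) for every message with plain text."""
    results = []
    for msg in mailbox.mbox(mbox_path, create=False):
        chunks = []
        for part in msg.walk():
            if part.get_content_type() != "text/plain":
                continue
            raw = part.get_payload(decode=True)
            if raw is None:
                continue
            charset = part.get_content_charset() or "utf-8"
            try:
                chunks.append(raw.decode(charset, errors="replace"))
            except LookupError:  # unknown charset name in the header
                chunks.append(raw.decode("utf-8", errors="replace"))
        cleaned = clean_text(" ".join(chunks))
        if cleaned:
            results.append((msg.get("Subject", ""), cleaned))
    return results
```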

Facial Recognition

Takes a folder of images (e.g. extracted from an inbox by Email Parser - Images) and tries to identify faces in the images. Outputs two folders: one with the images it thinks contain faces and one with the images it thinks do not.

Also converts the face images into equal-length vectors (using PCA) to enable follow-on analysis (e.g. image similarity).
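The PCA step might look something like the sketch below, which uses NumPy's SVD. It assumes the face images have already been cropped, converted to grayscale, and resized to a common height and width. The function name `pca_vectors` and the choice of SVD are illustrative rather than taken from the repo.

```python
import numpy as np


def pca_vectors(images, n_components):
    """Project same-sized grayscale images onto their top principal axes.

    images: array-like of shape (n_images, height, width).
    Returns (vectors, components, mean), where vectors has shape
    (n_images, n_components).
    """
    X = np.asarray(images, dtype=float).reshape(len(images), -1)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    vectors = centered @ components.T
    return vectors, components, mean
```

Because every image ends up as a vector of length `n_components`, two faces can then be compared with an ordinary distance measure such as `np.linalg.norm(vectors[i] - vectors[j])`.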

Convert MailDir to MBox

Converts a MailDir folder (and its subfolders) to an MBox file for use in Email Parser - Images and Email Parser - Text.
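A conversion along these lines can be done with the standard-library `mailbox` module alone. The sketch below is one possible approach, not necessarily the repo's script. The function name `maildir_to_mbox` is hypothetical, and it copies the top-level folder plus the subfolders reported by `list_folders()`, without recursing further.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Copy all messages from a Maildir and its subfolders into one mbox.

    Returns the number of messages copied.
    """
    source = mailbox.Maildir(maildir_path, factory=None, create=False)
    dest = mailbox.mbox(mbox_path)
    count = 0
    try:
        # The top-level folder first, then each Maildir subfolder.
        folders = [source]
        folders += [source.get_folder(name) for name in source.list_folders()]
        for folder in folders:
            for message in folder:
                dest.add(message)
                count += 1
        dest.flush()  # write pending messages to disk
    finally:
        dest.close()
    return count
```

For example, `maildir_to_mbox("Maildir", "inbox.mbox")` would produce a single `inbox.mbox` file that the two parsers above could read (both paths here are placeholders).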
