updating a series of py scripts for SP Reputation analysis #2

Open
wants to merge 1 commit into
base: main

TorfinnOlsen
Collaborator

  • spReputationAnalysis.py

Connects to MongoDB and gives you a high-level, cursory analysis: similarities between collections, a list of the collections (including the ones unique to each database), and sample records from each so you can see at a glance what's in these DBs. It also reports the most recent date a record was added, which gives you a sense of when each collection was last updated. Run this first and you've got a good starting point for what you're looking at.
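
As a rough sketch of the collection comparison described above (not the script itself), assuming the collection names have already been pulled from each database, for example with pymongo's `list_collection_names()`. The function name `compare_collections` and the `MONGO_URI` / `DB_NAMES` placeholders are made up for illustration.

```python
from typing import Dict, Iterable


def compare_collections(collections_by_db: Dict[str, Iterable[str]]) -> Dict[str, object]:
    """Report collection names shared by every database and those unique to one."""
    names = {db: set(cols) for db, cols in collections_by_db.items()}
    # Collections present in every database (empty if no databases were given).
    shared = set.intersection(*names.values()) if names else set()
    unique = {}
    for db, cols in names.items():
        # Everything that appears in any *other* database.
        others = set().union(*(c for other, c in names.items() if other != db))
        unique[db] = cols - others
    return {"shared": shared, "unique": unique}


# With pymongo, the input could be built roughly like this (not run here;
# MONGO_URI and DB_NAMES are placeholders, not values from this PR):
#   client = pymongo.MongoClient(MONGO_URI)
#   collections_by_db = {db: client[db].list_collection_names() for db in DB_NAMES}
```

Keeping the comparison separate from the Mongo connection means it can be checked against plain dictionaries without a live database.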

  • spReputationDateAnalysis.py

Queries the live MongoDB instance and does a more in-depth analysis of the date ranges of the records, so you know what period is covered in each of the DBs you're looking at.
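
A minimal sketch of the date-range idea, assuming each record is a dict with a `datetime` value under some date field. The field name `created_at` is a guess for illustration only; the real collections may use a different name or store dates as strings.

```python
from datetime import datetime
from typing import Dict, Iterable, Mapping, Optional, Tuple

DateRange = Tuple[Optional[datetime], Optional[datetime]]


def date_range(records: Iterable[Mapping], date_field: str = "created_at") -> DateRange:
    """Return (earliest, latest) for date_field, or (None, None) if none found."""
    earliest = latest = None
    for record in records:
        value = record.get(date_field)
        if not isinstance(value, datetime):
            continue  # missing or non-datetime values are skipped in this sketch
        if earliest is None or value < earliest:
            earliest = value
        if latest is None or value > latest:
            latest = value
    return earliest, latest


def date_ranges(collections: Dict[str, Iterable[Mapping]],
                date_field: str = "created_at") -> Dict[str, DateRange]:
    """Apply date_range to each named collection of records."""
    return {name: date_range(records, date_field)
            for name, records in collections.items()}


# A pymongo cursor can be passed as `records`, e.g. (not run here):
#   date_range(db[collection_name].find({}, {"created_at": 1}))
```

The second element of each range is also the "most recent record" date, so the same helper would cover the last-updated check.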

  • spReputationExport.py

Exports all of the DBs in full to CSVs. They're pretty small (less than a gig), but that's still quite a bit to try to load into a Google Sheet.
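
One way the export step could look, using only the standard library `csv` module. This is a sketch under the assumption that documents are flat dicts; nested fields would need flattening first. The helper names `write_csv` and `to_csv_text` are hypothetical.

```python
import csv
import io
from typing import IO, Iterable, List, Mapping


def write_csv(documents: Iterable[Mapping], out: IO[str]) -> List[str]:
    """Write documents to `out` as CSV and return the header that was used."""
    docs = list(documents)  # fine for data that fits in memory
    # Documents in one collection may have different keys, so the header is
    # the union of all keys in first-seen order.
    fieldnames: List[str] = []
    for doc in docs:
        for key in doc:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(docs)  # missing keys are filled with restval
    return fieldnames


def to_csv_text(documents: Iterable[Mapping]) -> str:
    """Convenience wrapper that returns the CSV as a string."""
    buffer = io.StringIO()
    write_csv(documents, buffer)
    return buffer.getvalue()
```

When writing to disk, open the file with `newline=""` as the `csv` module documentation recommends, for example `open(path, "w", newline="")`.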

  • load_csvs.py

Assumes you've installed pandas. Loads everything into data frames so you can clean all the data as one set and start analyzing dates of occurrence, frequency, etc. I think I can bang out a handful of scripts that do this high-level analysis here.
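
A hedged sketch of the loading step, assuming the exported CSVs sit in a single directory. The function names and the `source_file` column are illustrative choices, not something taken from the actual script.

```python
from pathlib import Path
from typing import Dict

import pandas as pd


def load_csvs(directory: str) -> Dict[str, pd.DataFrame]:
    """Read every *.csv in `directory` into a DataFrame keyed by file stem."""
    frames = {}
    for path in sorted(Path(directory).glob("*.csv")):
        frames[path.stem] = pd.read_csv(path)
    return frames


def combine_frames(frames: Dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Stack all frames into one, tagging each row with the file it came from."""
    tagged = [df.assign(source_file=name) for name, df in frames.items()]
    if not tagged:
        return pd.DataFrame()
    # Columns missing from a given file are filled with NaN by concat.
    return pd.concat(tagged, ignore_index=True)
```

Tagging rows with their source keeps the combined frame traceable back to the original collection once cleaning starts.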

@willscott
Collaborator

Your PR doesn't include the contents of these scripts yet
