updating a series of py scripts for SP Reputation analysis #2

TorfinnOlsen · 2023-10-30T19:40:15Z

spReputationAnalysis.py

Queries MongoDB and gives you a high-level, cursory analysis: the list of collections in each database, which collections are shared and which are unique to one database, and sample records from each so you can see at a glance what's in these DBs. Run this first and you'll have a good starting point for understanding what you're looking at. It also reports the most recent date a record was added, to give you a sense of when each collection was last updated.
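As a rough illustration of the shared-versus-unique comparison described above (not the contents of the script itself), here is a minimal sketch. The function name `compare_collections` and the database and collection names are hypothetical. The input is assumed to be a mapping from database name to its collection names, which with pymongo could be built from `client[db_name].list_collection_names()`.

```python
def compare_collections(collections_by_db):
    """Return (shared, unique): names present in every database,
    and names that appear in only one database."""
    name_sets = {db: set(names) for db, names in collections_by_db.items()}
    shared = set.intersection(*name_sets.values()) if name_sets else set()
    unique = {}
    for db, names in name_sets.items():
        # Everything that shows up in any *other* database.
        others = set().union(*(s for other, s in name_sets.items() if other != db))
        unique[db] = names - others
    return shared, unique


# Hypothetical database and collection names, for illustration only.
example = {
    "db_a": ["providers", "deals", "scores"],
    "db_b": ["providers", "scores", "retrievals"],
}
shared, unique = compare_collections(example)
print("shared:", sorted(shared))
print("unique:", {db: sorted(names) for db, names in unique.items()})
```

Keeping the comparison as a pure function over plain lists means it can be checked without a live Mongo connection.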

spReputationDateAnalysis.py

Queries the hosted Mongo instance and does a more in-depth analysis of the date ranges of the records, so you know what period is covered in each of the DBs you're looking at.
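A minimal sketch of what a date-range summary could look like, assuming each record is a dict carrying a `datetime` under some field. The field name `created_at` and the helper `date_range` are assumptions for illustration; the real schema may differ.

```python
from datetime import datetime


def date_range(docs, field="created_at"):
    """Return (earliest, latest, count) over records that have a datetime
    in `field`, or None if no record does."""
    dates = [doc[field] for doc in docs if isinstance(doc.get(field), datetime)]
    if not dates:
        return None
    return min(dates), max(dates), len(dates)


# Made-up records; the second one lacks the date field and is skipped.
records = [
    {"_id": 1, "created_at": datetime(2023, 3, 1)},
    {"_id": 2},
    {"_id": 3, "created_at": datetime(2023, 10, 15)},
]
summary = date_range(records)
print(summary)
```

With pymongo, the `docs` argument could be the cursor returned by `collection.find()`, since a cursor is iterable.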

spReputationExport.py

This exports all of the DBs in full to CSVs. They're pretty small (less than a gig), but that's still too much to comfortably load into a Google Sheet.
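One way an export like this could be sketched with only the standard library, assuming records arrive as flat dicts whose keys may vary from record to record. The helper `export_docs_to_csv` and the sample records are hypothetical, not taken from the script.

```python
import csv
import io


def export_docs_to_csv(docs, out):
    """Write dict records to the file-like `out` as CSV; return the row count.
    The header is the union of all keys, in first-seen order."""
    docs = list(docs)
    fieldnames = []
    for doc in docs:
        for key in doc:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in columns that a given record does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(docs)
    return len(docs)


# Made-up records with differing keys.
buf = io.StringIO()
count = export_docs_to_csv(
    [{"_id": 1, "provider": "f01"}, {"_id": 2, "score": 0.5}],
    buf,
)
lines = buf.getvalue().splitlines()
print(lines)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends. Nested documents would need flattening first, which this sketch does not attempt.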

load_csvs.py

This assumes you've installed pandas. It loads every exported CSV into a DataFrame so you can run one cleaning pass across all the data as a single set and start analyzing dates of occurrences, frequency, etc. I think I can bang out a handful of scripts that do this high-level analysis here.
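A minimal sketch of the loading step, assuming the exported CSVs sit together in one directory. The function names `load_csvs` and `combine`, the added `source` column, and the sample files are all illustrative assumptions rather than the script's actual behavior.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csvs(directory):
    """Read every *.csv in `directory` into a DataFrame, keyed by file stem."""
    frames = {}
    for path in sorted(Path(directory).glob("*.csv")):
        frames[path.stem] = pd.read_csv(path)
    return frames


def combine(frames):
    """Stack all frames into one, tagging each row with the file it came from."""
    tagged = [df.assign(source=name) for name, df in frames.items()]
    return pd.concat(tagged, ignore_index=True) if tagged else pd.DataFrame()


# Demo with two made-up CSV files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("provider,score\nf01,0.5\n")
    Path(tmp, "b.csv").write_text("provider,score\nf02,0.9\nf03,0.1\n")
    frames = load_csvs(tmp)
    combined = combine(frames)

print(combined)
```

Tagging rows with their source before concatenating keeps it possible to trace a record back to its collection after cleaning the data as one set.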

willscott · 2023-10-31T15:28:54Z

Your PR doesn't include the contents of these scripts yet.

updating a series of py scripts for SP Reputation analysis

234e2fe
