The results from the initial major df-analyze run, which used only the LightGBM model, are located in results/traffic_results_subset.
Data should be downloaded from the Montgomery County website and saved in the project root as traffic_violations_complete.csv (see clean_data.py).
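Before running the cleaning step, it can help to confirm that the raw file is where it is expected. The following is a minimal sketch, not part of clean_data.py: the helper name find_raw_data is hypothetical, and the only detail taken from this README is the filename traffic_violations_complete.csv in the project root.

```python
from pathlib import Path

# Filename stated in this README; the raw file is expected in the project root.
RAW_NAME = "traffic_violations_complete.csv"


def find_raw_data(root: Path) -> Path:
    """Return the path to the raw CSV under `root`, or raise if it is missing.

    Hypothetical helper for illustration only; clean_data.py may locate
    the file differently.
    """
    path = root / RAW_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected raw data at {path}; download it and save it under that name."
        )
    return path
```

Calling find_raw_data(Path(".")) from the project root either returns the path to the CSV or fails early with a message naming the expected location.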
clean_data.py: cleans the raw data and places the outputs in traffic_data
summary.py: summarizes the results of the full job (see "Job Scripts" below) into various tables (some printed to stdout, some saved)
Contains the actual scripts for running on Compute Canada. Will require setting up a container to run df-analyze.