Class: Practical Data Science (IDS 720), Fall 2023
Professor: Nick Eubank
Team Members: Meixiang Du, Revanth Chowdary Ganga, Titus Robin Arun, Suim Park
- 01_Data: Contains the data files used in this project.
  - The 01_Raw subfolder has the raw data obtained from external sources.
  - The 02_Processed subfolder has the intermediate and final datasets generated by the code in this project.
- 02_Codes: Contains all the code used in this project. In general, the .ipynb notebooks were used for EDA and analysis, while the .py scripts are used for the final generation of data and results.
  - The 01_Population subfolder contains the code to process the population dataset.
  - The 02_Mortality subfolder contains the code to process the mortality data and to generate the plots.
  - The 03_Shipment subfolder contains the code to process the shipment data and to generate the plots.
  - The 04_Other subfolder contains old and redundant files that are not required for the project but have been kept for reference.
- 03_Plots: Contains the output plots generated by the code, organised under the relevant subfolders.
- 04_Documents: Contains reference documents and reports submitted during the project.
- Clone the repository to your local machine
- Download the raw shipment data from the following Link
- The rest of the required raw data is already present in the repository
- To generate the outputs, run the .py scripts folder by folder, following the numeric order of the folders and of the files within each folder, i.e. execute the scripts in 01_Population first, then 02_Mortality, and so on
- The output data will be saved in the "02_Processed" subfolder under the "01_Data" folder, and the plots will be saved under the relevant subfolders in "03_Plots"
Note: All scripts except 01_shipment_combine.py can be run directly, since their source data is present in the repository and the relative paths are already set in the code. Before executing 01_shipment_combine.py, update the data path in that script to the local path of the raw shipment data you downloaded.
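
The run order described above can be automated with a small helper. The following is a minimal sketch and is not part of the repository: the function names `ordered_scripts` and `run_all` are hypothetical, and it assumes that every script carries a zero-padded numeric prefix (as 01_shipment_combine.py does), that the scripts expect to be run from their own folder, and that 04_Other can be skipped because its files are not required.

```python
import subprocess
import sys
from pathlib import Path


def ordered_scripts(codes_dir, skip=("04_Other",)):
    """Return the .py scripts sorted by subfolder name, then by file name.

    Assumption: folder and file names start with zero-padded numbers
    (01_, 02_, ...), so a plain alphabetical sort matches the run order.
    """
    codes_dir = Path(codes_dir)
    scripts = []
    for folder in sorted(p for p in codes_dir.iterdir() if p.is_dir()):
        if folder.name in skip:
            continue  # old and redundant files, not needed for the outputs
        scripts.extend(sorted(folder.glob("*.py")))
    return scripts


def run_all(codes_dir="02_Codes"):
    """Run each script in order, stopping at the first failure."""
    for script in ordered_scripts(codes_dir):
        print(f"Running {script}")
        # Assumption: the relative paths inside each script are resolved
        # from the script's own folder, so that folder is used as the cwd.
        subprocess.run([sys.executable, script.name], cwd=script.parent, check=True)
```

Calling `run_all()` from the repository root would then execute the scripts in sequence. The data path in 01_shipment_combine.py still has to be updated by hand beforehand, and if the scripts turn out to expect a different working directory, the `cwd` argument is the one thing to adjust.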