Master To Do #2

Open
5 of 12 tasks
egmavis opened this issue Apr 11, 2022 · 1 comment

egmavis commented Apr 11, 2022

Data Exploration

  • show initial balance of gender and race of raw data
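
A minimal sketch of the balance check, assuming the raw labels live in a pandas dataframe. The column names `gender` and `race` and the toy rows are placeholders, not the project's real data.

```python
import pandas as pd

# Toy stand-in for the raw labels; the column names "gender" and "race"
# are assumptions, not taken from the project.
raw = pd.DataFrame({
    "gender": ["male", "female", "male", "male", "female", "male"],
    "race": ["white", "black", "asian", "indian", "white", "other"],
})


def class_balance(df, column):
    """Return the share of rows in each category of `column`."""
    return df[column].value_counts(normalize=True)


gender_balance = class_balance(raw, "gender")
race_balance = class_balance(raw, "race")

# Joint counts show whether gender is balanced within each race.
joint_counts = pd.crosstab(raw["gender"], raw["race"])

print(gender_balance, race_balance, joint_counts, sep="\n\n")
```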

Pre-processing

  • create dataframes
  • create different balanced/imbalanced data sets
  • combine indian and asian labels into one "asian" label, adjust "other" label accordingly
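
The label merge could look roughly like this, assuming string labels in a `race` column (both assumptions). How "other" needs adjusting depends on the encoding, which isn't stated here; with string labels it is left untouched, while integer codes would need renumbering.

```python
import pandas as pd


def merge_asian_labels(df, column="race"):
    """Return a copy of `df` where "indian" is folded into "asian"."""
    out = df.copy()
    out[column] = out[column].replace({"indian": "asian"})
    return out


# Hypothetical label values for illustration only.
labels = pd.DataFrame({"race": ["white", "indian", "asian", "other", "black"]})
merged = merge_asian_labels(labels)

print(merged["race"].value_counts())
```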

Modeling (on all data, split into train/val/test)

  • Logistic Regression (adjusting regularization hyperparameter)
  • Random Forest (more advanced baseline model, start with default parameters)
  • Tensorflow NN (additional workflow required)
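
A sketch of the logistic regression step with scikit-learn, tuning the regularization strength `C` on the validation split and touching the test split only once. The synthetic data, the 60/20/20 proportions, and the `C` grid are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real features and labels.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# 60/20/20 train/val/test split (the proportions are an assumption).
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0, stratify=y_tmp
)

# Smaller C means stronger regularization in scikit-learn.
val_scores = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train, y_train)
    val_scores[C] = model.score(X_val, y_val)

# Pick C on validation accuracy, then report on the test split once.
best_C = max(val_scores, key=val_scores.get)
final = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
test_accuracy = final.score(X_test, y_test)

print(best_C, test_accuracy)
```

Swapping `LogisticRegression` for `RandomForestClassifier()` with default parameters would reuse the same three splits.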

Select the best-performing model for the following experiments (will probably be the NN):

Experiments

[Image: IMG_5163]

  • Leave-one-out racial subgroups (white, black, asian, other)
      • final chosen model (if too slow, random forest)
  • Skewed gender balance (80% male / 20% female, and 20% male / 80% female)
      • final chosen model (if too slow, random forest)
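
One way the two experiment data sets could be built with pandas. This reads "leave-one-out" as training on the other groups and evaluating on the held-out group, which is an assumption; the helper names, column names, and sample sizes are hypothetical.

```python
import numpy as np
import pandas as pd


def leave_one_group_out(df, group_col="race"):
    """Yield (held_out, train_df, test_df) for each group in `group_col`."""
    for group in sorted(df[group_col].unique()):
        mask = df[group_col] == group
        yield group, df[~mask], df[mask]


def skew_gender(df, male_frac, n, gender_col="gender", seed=0):
    """Sample n rows (without replacement) with the requested male share."""
    n_male = int(round(n * male_frac))
    males = df[df[gender_col] == "male"].sample(n=n_male, random_state=seed)
    females = df[df[gender_col] == "female"].sample(
        n=n - n_male, random_state=seed
    )
    # Shuffle so the rows are not ordered by gender.
    return pd.concat([males, females]).sample(frac=1, random_state=seed)


# Toy labels: 200 male and 200 female rows with random race labels.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "race": rng.choice(["white", "black", "asian", "other"], size=400),
    "gender": np.repeat(["male", "female"], 200),
})

splits = list(leave_one_group_out(toy))
mostly_male = skew_gender(toy, male_frac=0.8, n=200)
mostly_female = skew_gender(toy, male_frac=0.2, n=200)

for held_out, train_df, test_df in splits:
    print(held_out, len(train_df), len(test_df))
```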

Visualizations

  • ROC/PR curves for each model (data as a whole and for subgroup experiments)
  • F1 scores
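
A sketch of how the curve points and F1 could be computed with `sklearn.metrics`, for the data as a whole and per subgroup. The scores, the group labels `a`/`b`, and the 0.5 threshold are made up for illustration.

```python
import numpy as np
from sklearn.metrics import auc, f1_score, precision_recall_curve, roc_curve


def curve_summary(y_true, y_score, threshold=0.5):
    """Collect ROC/PR curve points, ROC AUC, and F1 at `threshold`."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    precision, recall, _ = precision_recall_curve(y_true, y_score)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    return {
        "roc": (fpr, tpr),          # x, y points for the ROC curve
        "pr": (recall, precision),  # x, y points for the PR curve
        "roc_auc": auc(fpr, tpr),
        "f1": f1_score(y_true, y_pred),
    }


# Made-up labels, predicted scores, and subgroup membership.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

overall = curve_summary(y_true, y_score)
per_group = {
    g: curve_summary(y_true[groups == g], y_score[groups == g])
    for g in np.unique(groups)
}

print(overall["roc_auc"], overall["f1"])
```

The `roc` and `pr` pairs can be passed straight to a plotting call such as matplotlib's `plt.plot(x, y)`.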

Final Reporting

  • Follow rubric to confirm all expectations are met

Presentation Slides

Notes

  • we will use all the data instead of a random sample
  • the full modeling process must be redone from scratch, both overall and for each experiment, to avoid data snooping

nickeubank commented

Look at you using issues and checklists for Kyle's class too! Yay!!
