
Added Performance and Fairness Evaluations to Jupyter Notebook #54

Merged · 5 commits · Nov 13, 2023

Conversation

@ssaloos (Contributor) commented Nov 9, 2023

Performance Evaluation
I performed a performance evaluation on the test dataset as follows; a code sketch of these steps appears after the list.

  1. Calculating accuracy, precision, recall, and then generating the classification report and confusion matrix.
  2. Calculating true positives, true negatives, false positives, and false negatives.
  3. Plotting the distribution of the fields (except Gender) for each gender to analyze the data.
  4. Splitting the actual and predicted labels by gender, then checking the accuracy score, confusion matrix, and True Positive Rate (recall) for each group.
  5. Comparing predicted vs. actual counts of Good Candidates for each gender.
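As a reference for steps 1, 2, and 4, here is a minimal sketch, assuming `y_test` and `y_pred` are NumPy arrays of binary labels and `test_df` is the test DataFrame with a `Gender` column (these names are placeholders, not the notebook's actual variables):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             classification_report, confusion_matrix)

# Step 1: overall metrics on the test set
print("Accuracy: ", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:   ", recall_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

# Step 2: raw counts; ravel() flattens the 2x2 matrix to (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")

# Step 4: per-gender accuracy and TPR (recall)
for gender in test_df["Gender"].unique():
    mask = (test_df["Gender"] == gender).to_numpy()
    g_tn, g_fp, g_fn, g_tp = confusion_matrix(y_test[mask], y_pred[mask]).ravel()
    print(f"{gender}: accuracy={accuracy_score(y_test[mask], y_pred[mask]):.3f}, "
          f"TPR={g_tp / (g_tp + g_fn):.3f}")
```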

Fairness Evaluation
I performed a fairness evaluation on the test dataset as follows; a sketch of the standard metric definitions appears after the list.

  1. Calculating the predicted and actual Disparate Impact, Statistical Parity Difference, Equal Opportunity Difference, and Equalized Odds Difference.
  • I believe there is still a bug in how I am calculating the Statistical Parity Difference, so reviewer @a-khaldi, please check that. Thank you!
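For context, these four metrics follow standard definitions, which can be sketched as below using the same placeholder variables as above, plus hypothetical boolean masks `mask_women` and `mask_men`; the notebook's actual implementation may differ in detail:

```python
from sklearn.metrics import confusion_matrix

def group_rates(y_true, y_pred):
    # Selection rate, TPR, and FPR for one group's labels and predictions.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {"selection": (tp + fp) / (tp + fp + tn + fn),
            "tpr": tp / (tp + fn),
            "fpr": fp / (fp + tn)}

women = group_rates(y_test[mask_women], y_pred[mask_women])
men = group_rates(y_test[mask_men], y_pred[mask_men])

disparate_impact        = women["selection"] / men["selection"]
statistical_parity_diff = women["selection"] - men["selection"]
equal_opportunity_diff  = women["tpr"] - men["tpr"]
equalized_odds_diff     = max(abs(women["tpr"] - men["tpr"]),
                              abs(women["fpr"] - men["fpr"]))
```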

@ssaloos added the "help wanted" label Nov 9, 2023
@a-khaldi (Contributor) commented Nov 10, 2023

Thank you, Sara, for completing the performance and fairness evaluations! As requested, I looked at the Statistical Parity Difference calculation in the final section of the notebook and noticed some errors in the formulas. I made the following changes:

To calculate SPD_predicted, we first need each group's share of the predicted positive outcomes:

  • pr_women = prediction_women / (prediction_women + prediction_men)
  • pr_men = prediction_men / (prediction_women + prediction_men)

Similarly, SPD_actual uses each group's share of the actual positive outcomes:

  • ac_women = actual_women / (actual_women + actual_men)
  • ac_men = actual_men / (actual_women + actual_men)

Finally, we can calculate SPD_predicted and SPD_actual as the difference between the two shares (a worked example follows the formulas):

  • SPD_predicted = pr_women - pr_men
  • SPD_actual = ac_women - ac_men
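To make the fix concrete, here is a self-contained worked example of the formulas above; the counts are made up for illustration:

```python
# Hypothetical counts of positive (Good Candidate) outcomes per group
prediction_women, prediction_men = 35, 65   # predicted positives
actual_women, actual_men = 45, 55           # actual positives

pr_women = prediction_women / (prediction_women + prediction_men)   # 0.35
pr_men = prediction_men / (prediction_women + prediction_men)       # 0.65
SPD_predicted = pr_women - pr_men                                   # -0.30

ac_women = actual_women / (actual_women + actual_men)               # 0.45
ac_men = actual_men / (actual_women + actual_men)                   # 0.55
SPD_actual = ac_women - ac_men                                      # -0.10
```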

I have reviewed the rest of the notebook and fixed minor errors and bugs, but overall the work is excellent!

@ssaloos changed the title from "Added Fairness and Performance Evaluations to Jupyter Notebook" to "Added Performance and Fairness Evaluations to Jupyter Notebook" Nov 10, 2023
@Malika1109 (Contributor) left a comment


Thanks for the work! LGTM!

@mahaaaln left a comment


I fixed some bugs related to recall and confusion matrix results. I think we're ready to merge!

@Malika1109 Malika1109 merged commit 6b4a0e6 into main Nov 13, 2023
16 checks passed