Add chisquare_test.py to assess area weights generated with different targets #267

Merged: 20 commits, Oct 30, 2024


martinholmer (Collaborator) commented Oct 28, 2024

With the default number of income tax bins (200), the chi-square test results show that, for each of the ten areas, adding 27 targets (nine all-filing-status counts by AGI category, nine wage income dollar targets by AGI category, and nine business income dollar targets by AGI category) does not produce a distribution of area weights across income tax categories that differs significantly from the distribution produced without those 27 extra targets.
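For readers who want to see the shape of such a comparison, here is a minimal sketch of a chi-square test on two weight vectors binned by income tax amount. It is not the code in chisquare_test.py: the function name `weights_chisquare`, the quantile-based bin edges, and the rescaling of both weighted distributions to the number of records are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import chi2


def weights_chisquare(itax, w_base, w_alt, n_bins=200):
    """Chi-square comparison of two weight vectors over income tax bins.

    Hypothetical helper: itax is the per-record income tax amount, w_base
    the weights from the smaller target set, and w_alt the weights from
    the larger target set. Returns (statistic, degrees_of_freedom, p_value).
    """
    itax = np.asarray(itax, dtype=float)
    w_base = np.asarray(w_base, dtype=float)
    w_alt = np.asarray(w_alt, dtype=float)

    # Assumption: bin edges are quantiles of the income tax amounts.
    edges = np.unique(np.quantile(itax, np.linspace(0.0, 1.0, n_bins + 1)))
    n_actual_bins = len(edges) - 1

    # Search only the interior edges so indices run 0..n_actual_bins-1.
    bin_idx = np.searchsorted(edges[1:-1], itax, side="right")

    # Sum each set of weights within every bin.
    base = np.bincount(bin_idx, weights=w_base, minlength=n_actual_bins)
    alt = np.bincount(bin_idx, weights=w_alt, minlength=n_actual_bins)

    # Assumption: rescale both distributions to the record count so the
    # statistic reflects the sample size, not the weighted population.
    n_records = itax.size
    expected = base / base.sum() * n_records
    observed = alt / alt.sum() * n_records

    # Drop bins that contain no records (expected count of zero).
    keep = expected > 0
    stat = float(np.sum((observed[keep] - expected[keep]) ** 2 / expected[keep]))
    dof = int(keep.sum()) - 1
    return stat, dof, float(chi2.sf(stat, dof))
```

A large p-value from a comparison like this would be read as no significant difference between the two weight distributions, which is the pattern the results above describe for the 27 extra targets.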

Removing the 27 redundant targets substantially reduces create_area_weights.py execution time. The speedup is typically in the range of 4X to 6X.

So, assessing whether added targets generate a different distribution of weights is not only good statistical practice; it can also lead to much faster generation of area weights, which matters because roughly five hundred areas are being processed.

martinholmer marked this pull request as draft October 29, 2024 16:36
martinholmer marked this pull request as ready for review October 30, 2024 16:17
martinholmer merged commit a2cf733 into master Oct 30, 2024
2 checks passed
martinholmer deleted the chisquare-test branch October 30, 2024 16:23