Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed fix for reporting and missing column unexpected behavior #150

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ssciolist
Copy link
Contributor

Issue 149
This code could be merged on its own to make the 'reported' and 'missing' columns refer to what was reported and missing for UCLA, rather than the current behavior. The current behavior counts a state as reporting if there is either data for UCLA or data from the Marshall Project from June. However...

If we move forward with removing references to the Marshall Project, the desired behavior (reported and missing both refer to UCLA's access to data) will happen on its own. OR if we move the global cutoff date to December 2021, then the behavior goes away on its own. BUT both options here will lower the cumulative counts substantially. I have to redo my comparison to be specific, but one thing I remember is that deaths go down by 233, for instance.

I think the least disruptive change to our data users would be to have the max ever cumulative figures available before either changing the cutoff date or removing the Marshall Project. But that's totally different work whose time requirement we don't know. So I think in the end, it may be best to merge this patch, but there's a good argument to be made for just dealing with the hit to cumulative figures too. Food for thought.

# Aggregate by Val, including MP, then add the reporting and missing
out_agg_df <- agg_df %>%
summarize(Count = sum_na_rm(Val)) %>%
merge(UCLA_reporting_and_missing)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny suggestion -- if you use left_join here rather than merge, the object created is a tibble which makes things print more nicely in the terminal! Not necessary but might be nice to have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it!

@hjohns12
Copy link
Contributor

This looks great! The only change I noticed and didn't expect to see was Residents.Confirmed has Wyoming as newly missing. Is this expected behavior? I thought we were collected that measure for Wyoming but I could be totally mis-remembering!

@ssciolist
Copy link
Contributor Author

Thanks for your feedback Hope! I incorporated the left_join and I checked on Wyoming, and we do collect Residents.Active in the manual sheets, but we don't collect confirmed/cumulative so I believe this is right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants