General Discussion/Conversation #11

Closed
AaryaDesai1 opened this issue Nov 10, 2023 · 11 comments
Labels: documentation, General Discussion

Comments

@AaryaDesai1
Contributor

Write all ideas and to-dos here.

@AaryaDesai1 added the documentation and General Discussion labels Nov 10, 2023
@AaryaDesai1
Contributor Author

Here are the columns I think we should keep from the opioids dataset:

  1. BUYER_STATE
  2. BUYER_ZIP
  3. BUYER_COUNTY
  4. DRUG_NAME (in case we want to do extra EDA and other analysis to see the differences or weights).
  5. TRANSACTION_DATE
  6. CALC_BASE_WT_IN_GM
  7. DOSAGE_UNIT
  8. MME
  9. MME_Conversion_Factor

Please confirm whether this looks okay, or suggest anything we need to add.

@osama-shawir
Contributor

Looks good. We can parse transaction_date into a date-type variable and then filter out data points that fall outside our timespan of interest.
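
A minimal sketch of that parsing and filtering step with pandas. It assumes TRANSACTION_DATE is stored as an MMDDYYYY string (or an integer that lost its leading zero); the helper name `filter_by_year`, the sample rows, and the year bounds are hypothetical and only for illustration.

```python
import pandas as pd


def filter_by_year(df, start_year, end_year, date_format="%m%d%Y"):
    """Parse TRANSACTION_DATE and keep rows whose year is in [start_year, end_year]."""
    out = df.copy()
    # Assumption: dates look like MMDDYYYY; zfill restores a dropped leading zero.
    out["TRANSACTION_DATE"] = pd.to_datetime(
        out["TRANSACTION_DATE"].astype(str).str.zfill(8), format=date_format
    )
    years = out["TRANSACTION_DATE"].dt.year
    # between() is inclusive on both ends by default.
    return out[years.between(start_year, end_year)].reset_index(drop=True)


# Made-up rows, only to show the call.
sample = pd.DataFrame(
    {
        "BUYER_STATE": ["TX", "TX", "FL"],
        "TRANSACTION_DATE": ["01152005", "06302008", "12012019"],
    }
)
filtered = filter_by_year(sample, 2006, 2017)
```

If the real column uses a different layout, only `date_format` should need to change.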

@jeremymtan
Contributor

That's fine with me. We just need to make sure Texas has transaction_dates by month, identify control states to use with Texas, and get those states' transaction_dates by month as well.
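
One way to get per-state monthly totals, sketched under two assumptions: TRANSACTION_DATE has already been parsed to a datetime column, and MME is the quantity we want to sum. The helper name `monthly_totals` and the sample values are hypothetical.

```python
import pandas as pd


def monthly_totals(df, value_col="MME"):
    """Sum value_col per state and calendar month."""
    out = df.copy()
    # Collapse each date to its month, e.g. 2010-01-20 -> 2010-01.
    out["year_month"] = out["TRANSACTION_DATE"].dt.to_period("M")
    return out.groupby(["BUYER_STATE", "year_month"], as_index=False)[value_col].sum()


# Made-up rows, only to show the shape of the result.
sample = pd.DataFrame(
    {
        "BUYER_STATE": ["TX", "TX", "TX", "AZ"],
        "TRANSACTION_DATE": pd.to_datetime(
            ["2010-01-05", "2010-01-20", "2010-02-01", "2010-01-10"]
        ),
        "MME": [10.0, 5.0, 7.0, 3.0],
    }
)
monthly = monthly_totals(sample)
```

The same function can be run on Texas and on each candidate control state, so every state ends up on the same monthly grid.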

@AaryaDesai1
Contributor Author

> That's fine with me. We just need to make sure Texas has transaction_dates by month, identify control states to use with Texas, and have those states' transaction_dates by month

Good call.

For the year range, I think we should take a window that starts a few years before the earliest policy implementation and ends a few years after the latest one (irrespective of the state). That comes to roughly 2002 through 2017-18.

Also I needed to ask how you would like to pick the control states. There are a few approaches we could take:

  1. Geographical location. For this, we could choose California and Oregon for Washington, Arizona and Colorado for Texas, and Georgia and North Carolina for Florida.
  2. Demographics. For this, we would have to look into the census data for each state and then see which states are the closest matches for the states of interest.
  3. Trends. We could look at the opioid consumption and mortality trends up to the year of policy implementation and then choose the states that are most similar to the states we've chosen (this is based on the graph in Nick's pdf).
  4. Comparison of policy implementation. This will take some time to research, but we could look into which states did NOT implement opioid control policies and then use them as controls. This would also need the above implementation.

@AaryaDesai1 pinned this issue Nov 10, 2023
@AaryaDesai1
Contributor Author

Hey guys, I'm working on the mortality dataset right now, and it's a combination of alcohol- and drug-induced deaths. The project PDF summary doesn't say anything about removing alcohol-related deaths and keeping only drug-induced deaths, but I think we should subset it to only the relevant data (i.e., drug-induced deaths). What do you guys think?
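
A small sketch of that subsetting step. The column name `cause`, the label strings in `DRUG_CAUSES`, and the sample rows are all hypothetical; the real mortality file will have its own column name and category labels, which should be substituted in. An explicit whitelist is used rather than a substring match, so that a label which merely mentions drugs (for example a "non-drug" category) is not kept by accident.

```python
import pandas as pd

# Hypothetical labels: replace with the drug-induced categories in the real file.
DRUG_CAUSES = {"Drug poisonings (overdose)", "All other drug-induced causes"}


def keep_drug_deaths(df, cause_col="cause", drug_causes=DRUG_CAUSES):
    """Keep only rows whose cause label is in the drug-induced whitelist."""
    return df[df[cause_col].isin(drug_causes)].reset_index(drop=True)


# Made-up rows, only to show the call.
sample = pd.DataFrame(
    {
        "county": ["A", "A", "B"],
        "cause": [
            "Drug poisonings (overdose)",
            "Alcohol-induced causes",
            "All other drug-induced causes",
        ],
        "deaths": [12, 30, 4],
    }
)
drug_only = keep_drug_deaths(sample)
```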

@jeremymtan
Contributor

The naive approach of finding the number of years from each parquet file doesn't work, because every state's parquet file covers 2006-2017. We need another approach, which I believe might work:

  • figure out how we calculate opioid shipments
  • calculate the population for each county for that year (this needs Osama's data, so we need to merge)
  • then calculate opioid shipments per capita
  • find states with similar slopes before the policy implementation; those can be our control states
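
The steps above could be sketched roughly as follows. This is a simplified illustration, not the project's code: it works at the state-year level rather than per county, treats summed MME as the shipment measure, and uses hypothetical column names (`state`, `year`, `population`), hypothetical helper names, and made-up numbers. The pre-policy slope is a plain least-squares line fit via `numpy.polyfit`.

```python
import numpy as np
import pandas as pd


def pre_policy_slopes(shipments, population, policy_year, value_col="MME"):
    """Least-squares slope of per-capita shipments per state, before policy_year."""
    yearly = shipments.groupby(["state", "year"], as_index=False)[value_col].sum()
    merged = yearly.merge(
        population, on=["state", "year"], how="inner", validate="one_to_one"
    )
    merged["per_capita"] = merged[value_col] / merged["population"]
    pre = merged[merged["year"] < policy_year]
    slopes = {
        state: np.polyfit(grp["year"], grp["per_capita"], 1)[0]
        for state, grp in pre.groupby("state")
    }
    return pd.Series(slopes, name="slope")


def rank_controls(slopes, treated):
    """Order the other states by how close their slope is to the treated state's."""
    return (slopes.drop(treated) - slopes[treated]).abs().sort_values()


# Made-up numbers, only to show the mechanics.
years = [2004, 2005, 2006]
shipments = pd.DataFrame(
    {
        "state": ["TX"] * 3 + ["AZ"] * 3 + ["CO"] * 3,
        "year": years * 3,
        "MME": [100, 200, 300, 100, 210, 320, 100, 400, 700],
    }
)
population = pd.DataFrame(
    {"state": ["TX"] * 3 + ["AZ"] * 3 + ["CO"] * 3, "year": years * 3, "population": 100}
)
slopes = pre_policy_slopes(shipments, population, policy_year=2007)
ranked = rank_controls(slopes, "TX")
```

States at the top of `ranked` have the most similar pre-policy trend to the treated state and are the natural control candidates under this criterion.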

@AaryaDesai1
Contributor Author

So we would have to do the slope calculation on each parquet file (and by extension, we'd have to do 50 merges)?

@jeremymtan
Contributor

Unless I can find a better way to do this, there would be 50 merges, but the datasets are small enough that they should run quickly.
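
A rough sketch of running the same merge once per state. It assumes each state's shipments are already loaded into a dict of DataFrames (in practice, one per parquet file) and that the population table is keyed by state, county, and year. The `year` and `population` columns, the helper names, and the sample rows are hypothetical.

```python
import pandas as pd

KEYS = ["BUYER_STATE", "BUYER_COUNTY", "year"]


def merge_state(shipments, population):
    """Attach county population to one state's shipment rows."""
    # many_to_one: many shipment rows may share one county-year population row.
    return shipments.merge(population, on=KEYS, how="left", validate="many_to_one")


def merge_all_states(shipments_by_state, population):
    """Run the same merge for every state and stack the results."""
    frames = []
    for state, shipments in shipments_by_state.items():
        pop_state = population[population["BUYER_STATE"] == state]
        frames.append(merge_state(shipments, pop_state))
    return pd.concat(frames, ignore_index=True)


# Made-up rows, only to show the call.
shipments_by_state = {
    "FL": pd.DataFrame(
        {
            "BUYER_STATE": ["FL", "FL"],
            "BUYER_COUNTY": ["ALACHUA", "ALACHUA"],
            "year": [2008, 2009],
            "MME": [10.0, 20.0],
        }
    ),
    "TX": pd.DataFrame(
        {"BUYER_STATE": ["TX"], "BUYER_COUNTY": ["TRAVIS"], "year": [2008], "MME": [5.0]}
    ),
}
population = pd.DataFrame(
    {
        "BUYER_STATE": ["FL", "FL", "TX"],
        "BUYER_COUNTY": ["ALACHUA", "ALACHUA", "TRAVIS"],
        "year": [2008, 2009, 2008],
        "population": [1000, 1100, 2000],
    }
)
merged = merge_all_states(shipments_by_state, population)
```

Checking `merged["population"].isna().sum()` afterwards is a quick way to spot county names that failed to match.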

@AaryaDesai1
Contributor Author

Yeah, sounds good. I'm working on the Florida merge rn, so we could just iterate that code over the rest of the states. I'm just smoothing it out right now cuz there are a few problems with it :))

@AaryaDesai1
Contributor Author

By the way, this is a study I found about policy implementation (along with a timeline of when states implemented policies), in case we just wanna choose control states based on literature and not based on metrics:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9914561/

@osama-shawir
Contributor

Florida merge completed. We can replicate the process for the other treatment states and proceed.

3 participants