How best to store and reference the required growth charts metadata from CDC and WHO in our package? #13
Replies: 5 comments 8 replies
-
For the referencing point, after an email discussion with David, I'm thinking it is best to refer to them in our package documentation as:

For the storage of metafiles, I generally prefer recreating them in code rather than storing datasets as part of the package, but I'm interested to hear others' views.
-
If people want to use the CSV files I created, I'm fine with that. All I did was combine info from the CDC or WHO reference files. If anyone wants to re-create the reference files, the reference data for the CDC charts are at

For WHO, the reference data are at

One question with the WHO charts is whether people want to use weight-for-length or BMI-for-age for infants. The official CDC guidance is to use weight-for-length, but this will result in some missing data for kids who have a length outside the range in the WHO data. There's at least one paper I know of that concludes that BMI-for-age is better, but I think the two metrics are very similar.

Both the CDC and WHO reference files are very small. The CDCref_d.csv has 1s and 2s at the end. This is because 20 years ago, I didn't know how to interpolate in SAS. The LMS values in the CDC growth charts are given for each month of age, but I wanted to interpolate in case people had more accurate information on age.
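To illustrate the interpolation idea, here is a minimal sketch in base R. The LMS numbers are invented placeholders, not actual CDC values, and `interp_lms()` and `lms_zscore()` are hypothetical helper names. It assumes plain linear interpolation between monthly rows, which may not match what the original SAS code does; the z-score uses the standard LMS formula.

```r
# Toy reference table: one row per month of age.
# NOTE: these L, M, S numbers are invented for illustration only.
ref <- data.frame(
  agemos = c(24, 25, 26),
  L = c(-2.0, -1.9, -1.8),
  M = c(16.0, 16.2, 16.4),
  S = c(0.080, 0.081, 0.082)
)

# Linearly interpolate L, M and S at a (possibly fractional) age in months.
interp_lms <- function(ref, age) {
  data.frame(
    agemos = age,
    L = approx(ref$agemos, ref$L, xout = age)$y,
    M = approx(ref$agemos, ref$M, xout = age)$y,
    S = approx(ref$agemos, ref$S, xout = age)$y
  )
}

# Standard LMS z-score:
#   ((x / M)^L - 1) / (L * S)  when L != 0
#   log(x / M) / S             when L == 0
lms_zscore <- function(x, L, M, S) {
  ifelse(L == 0, log(x / M) / S, ((x / M)^L - 1) / (L * S))
}

# A child aged 24.5 months whose measurement equals the interpolated median.
lms <- interp_lms(ref, age = 24.5)
z <- lms_zscore(16.1, lms$L, lms$M, lms$S)
```

With interpolation done at lookup time like this, the reference table would only need one row per month of age.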
-
Sorry - I should have chimed in on the question about recreating the data files or storing them in the package. While it's true that the reference data won't change, their URLs (particularly for WHO) and the names of the files change every few years (when someone tries to fix the poor organization of the WHO or CDC websites). I'm not sure if this would affect 'recreating in code'.
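If 'recreating in code' meant downloading from the source sites, a moved URL would break it. One possible guard, sketched below in base R, is to fall back to a copy shipped with the package. `load_reference()` is a hypothetical name, and the path and fallback table here are placeholders, not real locations or data.

```r
# Hypothetical loader: try to read the reference CSV from `location`
# (a URL or a file path); if that fails, return the bundled fallback copy.
load_reference <- function(location, fallback) {
  tryCatch(
    suppressWarnings(read.csv(location)),
    error = function(e) fallback
  )
}

# Placeholder fallback table standing in for a dataset stored in the package.
bundled <- data.frame(agemos = c(24, 25), M = c(16.0, 16.2))

# A location that does not exist, so the fallback is returned.
ref <- load_reference("/no/such/dir/reference.csv", bundled)
```

If the tables are instead typed out in the package source, no URL is involved at run time and the concern mostly disappears.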
-
Based on the comments above, there seem to be many different datasets that could be used (correct me if there aren't more than a handful). That may warrant deploying datasets as code, as Ross said, but also having a function that dispatches datasets based on a convention, like a string, using
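One possible shape for such a dispatcher, as a rough sketch in base R: every function name and string key below is hypothetical, and the builders return empty placeholder tables rather than real reference data.

```r
# Hypothetical builders; each would recreate one reference table in code.
# They return empty placeholder tables here, not real reference data.
build_cdc_bmi_for_age <- function() {
  data.frame(agemos = numeric(0), L = numeric(0), M = numeric(0), S = numeric(0))
}
build_who_weight_for_length <- function() {
  data.frame(length_cm = numeric(0), L = numeric(0), M = numeric(0), S = numeric(0))
}

# Dispatch on a string key and fail clearly on an unknown key.
get_growth_ref <- function(name) {
  builders <- list(
    cdc_bmi_for_age = build_cdc_bmi_for_age,
    who_weight_for_length = build_who_weight_for_length
  )
  if (!name %in% names(builders)) {
    stop(
      "Unknown reference '", name, "'. Available: ",
      paste(names(builders), collapse = ", ")
    )
  }
  builders[[name]]()
}

ref <- get_growth_ref("who_weight_for_length")
```

Keeping the lookup in a single named list means adding a dataset is one new builder plus one new entry.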
-
e.g. We could store them directly in the package as R datasets, or instead offer functions that recreate the metadata as a dataset via tibble (users would then run these first, before using the result as input to their ADaM)?
We would need to make clear in our documentation which versions of the input metadata were used, so that we can update in future releases if these source files ever evolve, as happened before with: "2022 CDC Extended BMI-for-Age Growth Charts for Children and Adolescents With Very High BMIs"
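As a sketch of how a recreating function could carry its source version along with the data, assuming attributes are an acceptable place to record it: `make_growth_ref()` is a hypothetical name, the rows are placeholders, and the version string is an example label rather than an official identifier. Base `data.frame` is used to keep the sketch dependency-free; a tibble would work the same way.

```r
# Hypothetical constructor: tags a reference table with the source and
# version it was derived from, so documentation and later updates can
# refer to them.
make_growth_ref <- function(source, version, data) {
  stopifnot(is.data.frame(data))
  attr(data, "ref_source") <- source
  attr(data, "ref_version") <- version
  data
}

# Placeholder rows and an example version label (not real CDC values).
bmi_ref <- make_growth_ref(
  source = "CDC",
  version = "example-2022",
  data = data.frame(
    agemos = c(24, 25),
    L = c(-2.0, -1.9),
    M = c(16.0, 16.2),
    S = c(0.080, 0.081)
  )
)

# Users (or package checks) can then ask which version they are working with.
attr(bmi_ref, "ref_version")
```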