Scope of {admiraldata} #1980
Replies: 8 comments 49 replies
-
From a conversation with @rossfarrugia: Perhaps the best avenue would be to simply provide both SDTMs and ADaMs in {admiraldata}. The ADaMs would be generated by the templates stored in {admiral} and then saved in {admiraldata}'s data folder. It would also be nice for users to be able to load test ADaMs as well as test SDTMs simply by loading {admiraldata}. A couple of considerations:
@pharmaverse/admiral @pharmaverse/admiraldata what are your thoughts?
-
I think if we include some ADaMs in {admiraldata}, then we should include all ADaMs; if we need multiple copies, they can be prefixed with admiral_, admiralonco_, etc.
-
Maybe the {admiraldata} package maintainer could do a quarterly "rerun" of all template scripts prior to release, so that you capture any updates made during that development increment?
-
I don't understand the rationale for wanting to store ADaM datasets. If we want to keep the new repo {admiraldata} just for SDTMs, we could store ADSL in admiral {core}, as it's required for the admiral vignettes (is it just the vignettes?).
-
It seems that we have three or four use cases for the test data:
I wonder if it makes sense to cover all use cases with a single set of datasets. For admiral vignettes, about five patients should be sufficient. In this case most values can be created explicitly. For NEST I would expect that 20-50 patients are required. Most likely, values for these patients are generated randomly. Creating values randomly such that they are realistic and show the desired scenario is not easy. Why should I spend a lot of time generating data for 50 patients if I need only five for writing a vignette? Furthermore, how do we ensure that updates for one use case do not spoil another one? For example, an update of the data for NEST could change admiral vignettes.
-
Hi team, I just added an extra use case: it would be great to have our Pharmaverse examples (https://github.com/pharmaverse/examples) work off this test data, so we have end-to-end examples of different pharmaverse packages working together over the same test data. I'm really keen that we take this opportunity to think wider than only admiral and approach it with a "one pharmaverse" mindset. After all, if it wasn't for the NEST team we never would have had @cicdguy and his team, who have contributed so much to admiral, so this is a great chance for us to give back and benefit the wider set of packages. Having followed all of the discussions above, I suggest we create the following:
The good thing about using {admiral.test} as the starting point is that it uses a real anonymised study that CDISC made available, so we know the test data is realistic, and I think it has sufficient patients and data for all our needs. @manciniedoardo & @bms63 I'd suggest you meet 1:1 to conclude on all the great discussions above and set the strategic direction for our team now. (And I won't be offended if you ignore my suggestions!)
-
Decision from the standup meeting discussion: go with @rossfarrugia's suggestion, with the modification that we store vignette datasets, which are smaller than the test datasets, within the admiral package. This avoids reverse/circular dependencies. We would, however, make sure not to export the vignette datasets, so that end users don't see them and turn only to pharmaversesdtm/pharmaverseadam for their data needs.
-
Hmm yeah, you're right. In that case we probably can't completely hide the data from users, so I would suggest storing the data in the
-
The initial idea for {admiraldata} was for it to contain/collect all the SDTM test data from the various {admiral} packages. However, a good point was raised in a standup that this doesn't quite work in practice: for instance, the programs that generate some of the {admiralonco} test data require ADSL. Thus, for them to be run within {admiraldata}, a copy of ADSL is needed in the package too.
How should {admiraldata} approach this problem? Should we have some ADaM datasets in the package as well? Should we have all ADaM datasets, i.e. should all the data for all the admiral packages live within {admiraldata} only? Or would this make the package too large?
@pharmaverse/admiral thoughts?