-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add butcher package #128
base: main
Are you sure you want to change the base?
Add butcher package #128
Conversation
Added with: just add-package [email protected] https://packagemanager.posit.co/cran/2021-06-29 Fixes #127
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not sure about supporting and using different repos to get packages - it's a policy change for the R image. We've previously rejected requests for packages because they're not on CRAN.
It may be a reasonable policy change to make, but if we do, then what repos are allowed? Is github allowed as a source?
I think the posit repo is maybe just a frozen cran mirror, so maybe its ok? I am wary of date based pinning of older versions. I actually think the first suggestion of adding dependencies manually is cleaner from a maintenance pov |
Using snapshot repos is the easiest way to add a new package without updating its dependencies that I know of. If a new package had a lot of dependencies, working out the order and version in which to include those could take some figuring out. So for me this looks good. That's right, the RStudio/Posit CRAN snapshots are CRAN at a certain date, like the Microsoft CRAN snapshot repos we used to add dtwclust and randomForest. Lines 106 to 107 in 2f2498a
Sadly, soon Microsoft are shutting down their CRAN snapshotting service, the RStudio/Posit snapshot repos are the only public large scale snapshotting service left that I know of. I gave the URL to the Posit source package snapshot repo rather than the URL to their Focal binary package snapshot repo because there's a chance they might have used different system libraries. As we chatted about before the most recent syntax additions to R came in R 4.1.0, so if you were to do a new image, say with R 4.1.3 that could include updated versions of the packages. My guess is that for probably all the packages their current CRAN version will install into R 4.1.3. If there are any that require say R 4.2.0 then you'd have to go back a version or two for those packages. |
|
||
export PACKAGE="$1" | ||
if [ "$2" = "NULL" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just specify the default value of repos as NULL
e.g. remove line 4, and replace 7-9 with:
REPOS=${2:-NULL}
And rather than add escape quotes in bash, I would just add them in the R fragment, as per suggested change. Then we're consisently handling package and repos quoting in the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to add a default but I suspect that the original non-escaping of the quotes around $REPOS
is the better way to go because in R NULL
is null, whereas "NULL"
is not null, i.e.
is.null(NULL)
#> [1] TRUE
is.null("NULL")
#> [1] FALSE
# install the package using the cache | ||
RUN --mount=type=cache,target=/cache,id=/cache-2004 bash -c "R -e 'renv::activate(); renv::install(\"$PACKAGE\"); renv::snapshot(type=\"all\")'" | ||
RUN --mount=type=cache,target=/cache,id=/cache-2004 bash -c "R -e 'renv::activate(); renv::install(\"$PACKAGE\", repos = $REPOS); renv::snapshot(type=\"all\")'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RUN --mount=type=cache,target=/cache,id=/cache-2004 bash -c "R -e 'renv::activate(); renv::install(\"$PACKAGE\", repos = $REPOS); renv::snapshot(type=\"all\")'" | |
RUN --mount=type=cache,target=/cache,id=/cache-2004 bash -c "R -e 'renv::activate(); renv::install(\"$PACKAGE\", repos = \"$REPOS\"); renv::snapshot(type=\"all\")'" |
"Package": "lobstr", | ||
"Version": "1.1.1", | ||
"Source": "Repository", | ||
"Repository": "RSPM", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confuses me a bit. Presumably you passing in https://packagemanager.posit.co/cran/2021-06-29
as repo when adding butcher, as per Tom's issue. But what we end up with is just "RSPM". We seem to have lost the "https://packagemanager.posit.co/cran/2021-06-29" url somehow. So I suspect this won't build on another system that doesn't have the build cached.
Naively, I would have expected a new "RSPM" entry in the "Repositories" section of renv.lock, but this doesn't seem to be in this commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great spot that there isn't an entry for RSPM in "Repositories" in renv.lock (RSPM = RStudio Package Manager a.k.a. Posit Package Manager). This turns out to be ok because CRAN is listed in renv.lock and renv::restore()
falls back to that, so it will restore when lobstr not in the cache.
Here's a screenshot of the output from my test example, which is here
Ok, I yield to your in depth R ecosystem knowledge :)
Yep, my vague plan had been to install all the same packages as this image, but at whatever version is current when we initially publish the image. And then sunset this image, freezing the package additions, and only supporting new packages on the newer image. |
See #127 for details.