Commit

Merge pull request #127 from nmfs-opensci/dev
remove binder, return to apt2.txt to load pkgs after R installation
eeholmes authored Oct 30, 2024
2 parents d535f8b + 94fc9f0 commit acea3f6
Showing 5 changed files with 13 additions and 43 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -41,4 +41,4 @@ Then add `environment.yml`, `apt.txt` or `install.R` to the repo and those will

The original [py-rocket 1.0](https://github.com/NASA-Openscapes/py-rocket) was developed by Luis Lopez and was built off a Rocker base image. Carl Boettiger and Eli Holmes later altered the image (py-rocket 2.0) so that the Python environment matched the Pangeo image structure but the image was still built off a Rocker image. Subsequently, Carl Boettiger developed [repo2docker-r](https://github.com/boettiger-lab/repo2docker-r) that creates a JupyterHub-compatible image that uses a [Jupyter docker stack image](https://jupyter-docker-stacks.readthedocs.io/en/latest/) as base. For py-rocker 3.0, Eli Holmes used Carl's ideas but used [repo2docker](https://repo2docker.readthedocs.io/en/latest/) and [repo2docker-action](https://github.com/jupyterhub/repo2docker-action) to build the base image. To do this, the [CryoCloud hub image](https://github.com/CryoInTheCloud/hub-image) repo was used for the basic structure and approach. Eli added the `rocker.sh` script and `appendix` modifications to install R and RStudio via the Rocker scripts (rather than using a Rocker image as base). Yuvi Panda (repo2docker) gave input throughout the process as snags were hit.

-**Why Rocker for the R/RStudio environment?** The Rocker images are the standard for R/RStudio contanier images. They are heavily tested and regularly updated. There is a large developer community that fixes problems and bugs. The stack has gone through major revisions to improve modularity and they constantly innovating (integration for machine-learning, CUDA, BLAS, spatial, etc., etc.). py-rocker is building off that work without using the images directly. Instead it uses the Dockerfile code and the installation scripts. There are many other approaches to adding R and RStudio to images that work in JupyterHubs. See [repo2docker-r](https://github.com/boettiger-lab/repo2docker-r) that Carl developed and [r-conda](https://github.com/binder-examples/r-conda) for a conda native approach using repo2docker. py-rocket is not intended to create small images; it is intended to create images that emulate Rocker in the `/rstudio` environment on a JupyterHub.
+**Why Rocker for the R/RStudio environment?** The Rocker images are the standard for R/RStudio contanier images. They are heavily tested and regularly updated. There is a large developer community that fixes problems and bugs. The stack has gone through major revisions to improve modularity and they constantly innovating (integration for machine-learning, CUDA, BLAS, spatial, etc., etc.). py-rocket is building off that work without using the images directly. Instead it uses the Docker file code and the installation scripts. There are many other approaches to adding R and RStudio to images that work in JupyterHubs. See [repo2docker-r](https://github.com/boettiger-lab/repo2docker-r) that Carl developed and [r-conda](https://github.com/binder-examples/r-conda) for a conda native approach using repo2docker. py-rocket is not intended to create small images; it is intended to create images that emulate Rocker in the `/rstudio` environment on a JupyterHub.
38 changes: 9 additions & 29 deletions appendix
@@ -2,6 +2,8 @@ USER root

# repo2docker does not set this. This is the default env in repo2docker type images
ENV CONDA_ENV=notebook
+# Tell applications where to open desktop apps - this allows notebooks to pop open GUIs
+ENV DISPLAY=":1.0"

# Install R, RStudio via Rocker scripts
ENV R_VERSION="4.4.1"
@@ -13,9 +15,11 @@ RUN PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && \
chmod +x ${REPO_DIR}/rocker.sh && \
${REPO_DIR}/rocker.sh

-# Install any missing build packages. Most are installed when R installed but get any missing packages
-RUN apt-get update && \
-apt-get install --yes --no-install-recommends build-essential gfortran && \
+# Install linux packages after R installation since the R install scripts get rid of packages
+# The package_list part is reading the file and doing clean-up to just have the list of packages
+RUN package_list=$(grep -v '^\s*#' ${REPO_DIR}/apt2.txt | grep -v '^\s*$' | sed 's/\r//g; s/#.*//; s/^[[:space:]]*//; s/[[:space:]]*$//' | awk '{$1=$1};1') && \
+apt-get update && \
+apt-get install --yes --no-install-recommends $package_list && \
apt-get autoremove --purge && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
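
The `package_list` clean-up added in this hunk can be tried outside a Docker build. The following is only a sketch: the pipeline is copied from the `RUN` line above, while the `clean_package_list` wrapper and the sample file contents are hypothetical and exist purely for illustration. It assumes GNU `grep` and `sed`, as found in the Ubuntu-based image.

```shell
# Hypothetical helper: wraps the grep/sed/awk pipeline used in the appendix.
clean_package_list() {
  grep -v '^\s*#' "$1" | grep -v '^\s*$' \
    | sed 's/\r//g; s/#.*//; s/^[[:space:]]*//; s/[[:space:]]*$//' \
    | awk '{$1=$1};1'
}

# Illustrative input: a comment line, a blank line, and a trailing comment.
sample=$(mktemp)
printf '%s\n' '# build tools' 'build-essential' '' '  gfortran   # compiler' > "$sample"

package_list=$(clean_package_list "$sample")

# Left unquoted on purpose so the shell splits the list into separate words,
# the same way apt-get install receives $package_list in the RUN line.
echo $package_list   # prints: build-essential gfortran

rm -f "$sample"
```

Full-line comments and blank lines are dropped by the two `grep -v` calls; `sed` then strips carriage returns, trailing comments, and surrounding whitespace, so only package names remain.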
@@ -44,29 +48,17 @@ ENV NB_USER=${NB_USER}

## ONBUILD section. These commands are run in child Dockerfiles. These are run right after the FROM image is loaded

+ONBUILD RUN echo "py-rocket-base: looking for configuration files to use"
+
ONBUILD USER ${NB_USER}

# ${REPO_DIR} is owned by ${NB_USER}
ONBUILD COPY --chown=${NB_USER}:${NB_USER} . ${REPO_DIR}/childimage

-# repo2docker will load files from a .binder or binder directory if
-# present. We check if those directories exist.
-ONBUILD RUN echo "Checking for 'binder' or '.binder' subfolder" \
-; cd "${REPO_DIR}/childimage/" \
-; if [ -d binder ] ; then \
-echo "Using 'binder/' build context" \
-; elif [ -d .binder ] ; then \
-echo "Using '.binder/' build context" \
-; else \
-echo "Using './' build context" \
-; fi
-
# Copy Desktop files into ${REPO_DIR}/Desktop if they exist. start will copy to Application dir and Desktop
# Will not fail if Desktop dir exists but is empty
ONBUILD RUN echo "Checking for 'Desktop directory'..." \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder \
-; [ -d .binder ] && cd .binder \
; if test -d Desktop ; then \
mkdir -p "${REPO_DIR}/Desktop" && \
[ "$(ls -A Desktop 2>/dev/null)" ] && cp -r Desktop/* "${REPO_DIR}/Desktop/"; \
@@ -77,8 +69,6 @@ ONBUILD RUN echo "Checking for 'Desktop directory'..." \
ONBUILD USER root
ONBUILD RUN echo "Checking for 'apt.txt'..." \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder \
-; [ -d .binder ] && cd .binder \
; if test -f "apt.txt" ; then \
package_list=$(grep -v '^\s*#' apt.txt | grep -v '^\s*$' | sed 's/\r//g; s/#.*//; s/^[[:space:]]*//; s/[[:space:]]*$//' | awk '{$1=$1};1') \
&& apt-get update --fix-missing > /dev/null \
@@ -93,8 +83,6 @@ ONBUILD USER ${NB_USER}
# sometimes package solving will get rid of pip installed packages. Make sure jupyter-remote-desktop-proxy does not disappear
ONBUILD RUN echo "Checking for 'conda-lock.yml' or 'environment.yml'..." \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder \
-; [ -d .binder ] && cd .binder \
; if test -f "conda-lock.yml" ; then echo "Using conda-lock.yml" & \
conda-lock install --name ${CONDA_ENV} \
&& pip install --no-deps jupyter-remote-desktop-proxy \
@@ -114,17 +102,13 @@ ONBUILD RUN echo "Checking for 'conda-lock.yml' or 'environment.yml'..." \
# to avoid wasting space.
ONBUILD RUN echo "Checking for pip 'requirements.txt'..." \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder \
-; [ -d .binder ] && cd .binder \
; if test -f "requirements.txt" ; then \
${NB_PYTHON_PREFIX}/bin/pip install --no-cache -r requirements.txt \
; fi

# Add the r packages
ONBUILD RUN echo "Checking for 'install.R" \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder \
-; [ -d .binder ] && cd .binder \
; if test -f "install.R" ; then echo "Using install.R" & \
Rscript install.R \
; fi
@@ -133,8 +117,6 @@ ONBUILD RUN echo "Checking for 'install.R" \
# This is sourced in ${REPO_DIR}/start
ONBUILD RUN echo "Checking for 'start'..." \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder && cp start ${REPO_DIR}/childimage/start \
-; [ -d .binder ] && cd .binder && cp start ${REPO_DIR}/childimage/start \
; if test -f "start" ; then \
chmod +x start \
; fi
@@ -145,8 +127,6 @@ ONBUILD RUN echo "Checking for 'start'..." \
# leave behind.
ONBUILD RUN echo "Checking for 'postBuild'..." \
; cd "${REPO_DIR}/childimage/" \
-; [ -d binder ] && cd binder \
-; [ -d .binder ] && cd .binder \
; if test -f "postBuild" ; then \
chmod +x postBuild \
&& ./postBuild \
3 changes: 2 additions & 1 deletion apt.txt → apt2.txt
@@ -28,7 +28,8 @@ mc

# Regular build tools for compiling common stuff
# most installed with R installation
-# need to install after R so these added to appendix
+build-essential
+gfortran

# Dependencies for nbconvert and myst
# texlive is installed by R installation
9 changes: 0 additions & 9 deletions postBuild

This file was deleted.

4 changes: 1 addition & 3 deletions start
@@ -4,13 +4,11 @@ set -euo pipefail
# Start - Set any environment variables here
# These are inherited by all processes, *except* RStudio
# USE export <parname>=value
-# Tell applications where to open desktop apps - this allows notebooks to pop open GUIs
-export DISPLAY=":1.0"
# source this file to get the variables defined in the rocker Dockerfile
# this is a series of export cmds
source ${REPO_DIR}/env.txt
# End - Set any environment variables here


# The for loops will fail if they return null (no files). Set shell option nullglob
shopt -s nullglob
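
The effect of `nullglob` noted in the comment above can be seen in isolation. This is a sketch under assumptions: the `*.sh` pattern and the temporary directory are hypothetical and are not taken from the `start` script. Each loop runs in its own `bash -c` so the shell option does not leak between the two cases.

```shell
empty_dir=$(mktemp -d)

# Default globbing: an unmatched pattern stays literal, so the loop body
# runs once with the unexpanded string "<dir>/*.sh".
without=$(bash -c 'n=0; for f in "$1"/*.sh; do n=$((n+1)); done; echo "$n"' _ "$empty_dir")

# With nullglob: an unmatched pattern expands to nothing, so the loop body
# never runs.
with=$(bash -c 'shopt -s nullglob; n=0; for f in "$1"/*.sh; do n=$((n+1)); done; echo "$n"' _ "$empty_dir")

echo "without nullglob: $without iteration(s); with nullglob: $with iteration(s)"
# prints: without nullglob: 1 iteration(s); with nullglob: 0 iteration(s)

rmdir "$empty_dir"
```

Under `set -euo pipefail`, a loop body that tries to `source` or copy the literal, non-existent path would abort the script, which is presumably why the option is set before the loops.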

