Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update DSCI 100 R and Python Docker Images #38

Merged
merged 6 commits into from
Jan 3, 2024
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions docker/py-dsci-100/.gitconfig
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[pull]
rebase = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we change the pull strategy? I wouldn't normally go for rebase -- any particular reason?

Copy link
Contributor Author

@briank-git briank-git Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find the original slack thread where it was suggested, but no, there's no particular reason. We can use any other strategy. Maybe ff only?

It's just to suppress this error which prompts students to choose a pull strategy (have to use terminal):

Hint: You have divergent branches and need to specify how to reconcile them.
Hint: You can do so by running one of the following commands sometime before
Hint: your next pull:
Hint: 
Hint:   git config pull.rebase false  # merge
Hint:   git config pull.rebase true   # rebase
Hint:   git config pull.ff only       # fast-forward only

Copy link
Contributor

@trevorcampbell trevorcampbell Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see! It's not for this repo, it's for copying into the docker image later on to avoid students getting confusing error messages. Let's use the default strategy which is pull.rebase false (so that what they see in class is the common default in the real world)

25 changes: 17 additions & 8 deletions docker/py-dsci-100/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,6 @@ WORKDIR "${HOME}/work"
# https://discourse.jupyter.org/t/debugger-warning-it-seems-that-frozen-modules-are-being-used-python-3-11-0/16544/12
ENV PYDEVD_DISABLE_FILE_VALIDATION=1

# update scikit-learn to the current @main (July 20, 2023)
# to get this merged PR: https://github.com/scikit-learn/scikit-learn/pull/26772
# which fixes this issue: https://github.com/scikit-learn/scikit-learn/issues/26768
RUN pip install --upgrade git+https://github.com/scikit-learn/scikit-learn@7de59b2017f39048880a3858d32c8caef9308954
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What installs scikit-learn now that this is gone?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and does the latest release include the two fixes listed in the comments?)

Copy link
Contributor Author

@briank-git briank-git Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scikit-learn 1.3.1 is installed by the base image jupyter/scipy-notebook. Here's a link to the relevant patch notes for 1.3.1, fixed by PR #26772.


# Install Python packages
RUN mamba install --quiet --yes \
'numpy' \
Expand All @@ -29,15 +24,16 @@ RUN mamba install --quiet --yes \
'ibis-framework' \
'nodejs' \
'psycopg2' \
'jupyter-resource-usage' \
&& mamba clean --all -f -y \
&& fix-permissions "${CONDA_DIR}" \
&& fix-permissions "/home/${NB_USER}"

# Install some recent python package that might not be on conda yet
RUN pip install pandas"<2.1" altair"==5.0.1" "vegafusion[embed]" vl-convert-python

# Install nbgitpuller, jlab-git
RUN pip install nbgitpuller jupyterlab-git \
# Install nbgitpuller, jlab-git, newest pexpect 4.9.0
RUN pip install nbgitpuller jupyterlab-git==0.50.0 pexpect==4.9.0 \
&& jupyter lab build

# Disable the cell toolbar (which ignores metadata and students often accidentally click + delete grading cells)
Expand All @@ -46,7 +42,20 @@ RUN jupyter labextension disable @jupyterlab/cell-toolbar-extension
# Install the extension to remove the shift+M merge shortcut
COPY shortcuts.jupyterlab-settings /home/${NB_USER}/.jupyter/lab/user-settings/\@jupyterlab/shortcuts-extension/shortcuts.jupyterlab-settings

# Make sure everything in the home folder is owned by NB_USER
# Copy jupyter_server_config.py which allows students to see and delete hidden files
COPY jupyter_server_config.py /home/${NB_USER}/.jupyter

# Copy gitconfig that sets global default pull strategy to rebase
COPY .gitconfig /home/${NB_USER}/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@briank-git please edit the commented out line on 48 above to reflect the default merge strat


USER root
# Install zip package
RUN apt install zip

# Make sure everything in the home folder is owned by NB_USER
RUN chown -R ${NB_USER} /home/${NB_USER}

USER ${NB_UID}

#Disable healthcheck for performance reasons
HEALTHCHECK NONE
49 changes: 49 additions & 0 deletions docker/py-dsci-100/jupyter_server_config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Configuration file for jupyter-server.

c = get_config() #noqa

#------------------------------------------------------------------------------
# ContentsManager(LoggingConfigurable) configuration
#------------------------------------------------------------------------------
## Base class for serving files and directories.
#
# This serves any text or binary file,
# as well as directories,
# with special handling for JSON notebook documents.
#
# Most APIs take a path argument,
# which is always an API-style unicode path,
# and always refers to a directory.
#
# - unicode, not url-escaped
# - '/'-separated
# - leading and trailing '/' will be stripped
# - if unspecified, path defaults to '',
# indicating the root path.

## Allow access to hidden files
# Default: False
c.ContentsManager.allow_hidden = True

#------------------------------------------------------------------------------
# FileContentsManager(FileManagerMixin, ContentsManager) configuration
#------------------------------------------------------------------------------
## Allow access to hidden files
# See also: ContentsManager.allow_hidden
c.FileContentsManager.allow_hidden = True

#------------------------------------------------------------------------------
# AsyncContentsManager(ContentsManager) configuration
#------------------------------------------------------------------------------
## Base class for serving files and directories asynchronously.

## Allow access to hidden files
# See also: ContentsManager.allow_hidden
c.AsyncContentsManager.allow_hidden = True

#------------------------------------------------------------------------------
# AsyncFileContentsManager(FileContentsManager, AsyncFileManagerMixin, AsyncContentsManager) configuration
#------------------------------------------------------------------------------
## Allow access to hidden files
# See also: ContentsManager.allow_hidden
c.AsyncFileContentsManager.allow_hidden = True
2 changes: 2 additions & 0 deletions docker/r-dsci-100/.gitconfig
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[pull]
rebase = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same Q -- why changing the repo default to rebase?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled by above discussion

13 changes: 11 additions & 2 deletions docker/r-dsci-100/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ RUN mamba install --quiet --yes \
'r-kknn' \
'r-testthat' \
'r-rpostgres' \
'jupyter-resource-usage' \
&& mamba clean --all -f -y \
&& fix-permissions "${CONDA_DIR}" \
&& fix-permissions "/home/${NB_USER}" \
Expand All @@ -32,8 +33,8 @@ RUN mamba install --quiet --yes \
&& Rscript -e "install.packages('tidyclust', repos='http://cran.us.r-project.org')" \
&& Rscript -e "install.packages('janitor', repos='http://cran.us.r-project.org')"

# Install nbgitpuller, jlab-git
RUN pip install nbgitpuller jupyterlab-git \
# Install nbgitpuller, jlab-git, newest pexpect 4.9.0
RUN pip install nbgitpuller jupyterlab-git pexpect==4.9.0 \
&& jupyter lab build

# Disable the cell toolbar (which ignores metadata and students often accidentally click + delete grading cells)
Expand All @@ -42,8 +43,16 @@ RUN jupyter labextension disable @jupyterlab/cell-toolbar-extension
# Install the extension to remove the shift+M merge shortcut
COPY shortcuts.jupyterlab-settings /home/${NB_USER}/.jupyter/lab/user-settings/\@jupyterlab/shortcuts-extension/shortcuts.jupyterlab-settings

# Copy jupyter_server_config.py which allows students to see and delete hidden files
COPY jupyter_server_config.py /home/${NB_USER}/.jupyter

# Copy gitconfig that sets global default pull strategy to rebase
COPY .gitconfig /home/${NB_USER}/

# Make sure everything in the home folder is owned by NB_USER
USER root
RUN chown -R ${NB_USER} /home/${NB_USER}
USER ${NB_UID}

# Disable healthcheck for performance reasons
HEALTHCHECK NONE
49 changes: 49 additions & 0 deletions docker/r-dsci-100/jupyter_server_config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Configuration file for jupyter-server.

c = get_config() #noqa

#------------------------------------------------------------------------------
# ContentsManager(LoggingConfigurable) configuration
#------------------------------------------------------------------------------
## Base class for serving files and directories.
#
# This serves any text or binary file,
# as well as directories,
# with special handling for JSON notebook documents.
#
# Most APIs take a path argument,
# which is always an API-style unicode path,
# and always refers to a directory.
#
# - unicode, not url-escaped
# - '/'-separated
# - leading and trailing '/' will be stripped
# - if unspecified, path defaults to '',
# indicating the root path.

## Allow access to hidden files
# Default: False
c.ContentsManager.allow_hidden = True

#------------------------------------------------------------------------------
# FileContentsManager(FileManagerMixin, ContentsManager) configuration
#------------------------------------------------------------------------------
## Allow access to hidden files
# See also: ContentsManager.allow_hidden
c.FileContentsManager.allow_hidden = True

#------------------------------------------------------------------------------
# AsyncContentsManager(ContentsManager) configuration
#------------------------------------------------------------------------------
## Base class for serving files and directories asynchronously.

## Allow access to hidden files
# See also: ContentsManager.allow_hidden
c.AsyncContentsManager.allow_hidden = True

#------------------------------------------------------------------------------
# AsyncFileContentsManager(FileContentsManager, AsyncFileManagerMixin, AsyncContentsManager) configuration
#------------------------------------------------------------------------------
## Allow access to hidden files
# See also: ContentsManager.allow_hidden
c.AsyncFileContentsManager.allow_hidden = True