Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge delphi_web_python Docker image from operations repo #1043

Merged
merged 8 commits into from
Dec 6, 2022

Conversation

dshemetov
Copy link
Contributor

@dshemetov dshemetov commented Dec 5, 2022

This is part of #965.

Having our Dockerfiles be partially compiled from the operations repo and partially here introduces unnecessary complexity, with no clear benefits (that I can see).

In the JIT work, I need to bump the pandas version, so moving the Python requirements.txt file from the operations repo here lets me do that in just one place.

I believe this is the last of our three Docker images to depend on the operations repo:

  • delphi_database_epidata is already fully built from dev/docker/database/epidata/Dockerfile in this repo
  • delphi_web_epidata is also fully built from devops/Dockerfile in this repo

This doesn't fully decouple us from operations as the setup.sh script still copies many old repos to the delphi_web_python image. We can consider excising those in a different PR.

Prerequisites:

  • Unless it is a documentation hotfix it should be merged against the dev branch
  • Branch is up-to-date with the branch to be merged with, i.e. dev
  • Build is successful
  • Code is cleaned up and formatted

Summary

This PR:

  • merged the Dockerfile for the delphi_python image into the Dockerfile for the delphi_web_python image in this repo
  • copied the Python requirements.txt file from that repo and merged it in with requirements.dev.txt
  • copied setup.sh file from that repo to dev/docker/python in this repo
  • renamed requirements.txt to requirements.api.txt and sorted dependencies
  • deduplicated packages between requirements.api.txt and requirements.dev.txt and removed some unused ones (like scikit-learn from requirements.dev.txt)
  • pinned package versions in requirements.dev.txt with the latest working versions

@dshemetov dshemetov force-pushed the ds/merge-docker-web-python branch from 609745f to 6d8c123 Compare December 5, 2022 20:05
@melange396
Copy link
Collaborator

nice! you can then remove these lines as well:

@docker build -t delphi_python \
-f repos/delphi/operations/dev/docker/python/Dockerfile .

@melange396
Copy link
Collaborator

melange396 commented Dec 5, 2022

and looks like we couldve/shouldve removed these lines already too:

@docker images delphi_database | grep delphi || \
docker build -t delphi_database -f repos/delphi/operations/dev/docker/database/Dockerfile .

@dshemetov
Copy link
Contributor Author

Good call. Did that and removed those from the CI build too.

@melange396
Copy link
Collaborator

with this PR, we have 3 different "requirements.txt" files in this repo: (ignoring ./src/server/covidcast_issues_migration/requirements.txt from an old one-off thing)

  • ./dev/docker/python/requirements.txt (which you just added/copied here)
  • ./requirements.txt
  • ./requirements.dev.txt

all 3 are used in dev/docker/python/Dockerfile to create delphi_web_python, which is only really used for tests.

only ./requirements.txt (in this repo's root dir) is used to create delphi_web_epidata which runs the api server.

i have a PR (#1046) against this branch. my changes sort the 3 requirements files, and remove duplicated entries from the new one. we should prune down that file some more if we can, maybe even remove it entirely by moving its entries to the other two files.

a caveat is that .github/workflows/update_gdocs_data.yml also uses requirements.dev.txt, and im not sure what it does -- the meat of it seems to be in its line 26 "run: inv update-gdoc" but i cant figure out what that does

@dshemetov
Copy link
Contributor Author

dshemetov commented Dec 5, 2022

update_gdocs_data.yml updates the local db_sources.csv and db_signals.csv files (that the server uses to understand what signals are available) from this Google sheet (and another one elsewhere). see here.

dshemetov added a commit that referenced this pull request Dec 5, 2022
* sorted requirements.txt files
* removed duplicated requirements from ./dev/docker/python/requirements.txt
* reduce runs of "pip install" when creating "delphi_web_python" docker image
* renamed requirements.txt to requirements.api.txt
* merge dev/docker/python/requirements.txt with requirements.dev.txt
* deduplicate packages in requirements.api.txt and requirements.dev.txt
* pinned packages in requirements.dev.txt and removed unused

Co-authored-by: Dmitry Shemetov <[email protected]>
dshemetov and others added 4 commits December 5, 2022 15:37
- merge operations repo delphi_python Dockerfile into delphi_web_python
- copy Python requirements file to this directory
- copy setup.sh to this directory
dshemetov added a commit that referenced this pull request Dec 5, 2022
* sorted requirements.txt files
* removed duplicated requirements from ./dev/docker/python/requirements.txt
* reduce runs of "pip install" when creating "delphi_web_python" docker image
* renamed requirements.txt to requirements.api.txt
* merge dev/docker/python/requirements.txt with requirements.dev.txt
* deduplicate packages in requirements.api.txt and requirements.dev.txt
* pinned packages in requirements.dev.txt and removed unused

Co-authored-by: Dmitry Shemetov <[email protected]>
@dshemetov dshemetov force-pushed the ds/merge-docker-web-python branch from ef63e3b to 08b4589 Compare December 5, 2022 23:39
* sorted requirements.txt files
* removed duplicated requirements from ./dev/docker/python/requirements.txt
* reduce runs of "pip install" when creating "delphi_web_python" docker image
* renamed requirements.txt to requirements.api.txt
* merge dev/docker/python/requirements.txt with requirements.dev.txt
* deduplicate packages in requirements.api.txt and requirements.dev.txt
* pinned packages in requirements.dev.txt and removed unused

Co-authored-by: Dmitry Shemetov <[email protected]>
@dshemetov dshemetov force-pushed the ds/merge-docker-web-python branch from 08b4589 to 8da9b0f Compare December 5, 2022 23:47
@melange396
Copy link
Collaborator

looks pretty good to me:

  • removes a cross-repo dependency
  • should speed up our build process (including ci)
  • generally reduces complexity

my concerns that remain:

  • looks like xlrd and matplotlib can come out of the requirements file(s)?
  • do we wanna add scipy back into the one requirements file? our one reference to it is probably being taken care of by a chained dependency, so its not a huge deal
  • im not sure we need or want to pin so many of those version numbers, but that can stand for the moment as i am soon going to create a github issue about version pinning
  • we may want to "mark for deletion" a number of the things in the operations repo that we dont need anymore (Dockerfile, setup.sh, requirements.txt, etc...)

melange396
melange396 previously approved these changes Dec 6, 2022
@melange396
Copy link
Collaborator

those last two commits by me are intended to improve docker image builds by reducing "layers" which will hopefully shorten build times and shrink image footprints. search for "layer" in https://docs.docker.com/develop/develop-images/dockerfile_best-practices/ for more info.

@dshemetov dshemetov merged commit 97090ca into dev Dec 6, 2022
@dshemetov dshemetov deleted the ds/merge-docker-web-python branch December 6, 2022 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants