Skip to content

Commit

Permalink
fIX handling removal of dependencies (apache#42967)
Browse files Browse the repository at this point in the history
There was a problem with CI image builds when cache has been
disabled - they still used "pre-cached" main version (unlike
PROD builds). This PR synchronizes the behaviour between CI
and PROD builds.

For removing dependencies, you need to set both:

* `disable docker cache` label
* incerase DEPENDENCIES_EPOCH_NUMBER in `Dockerfile.ci`

Comments and documentation in both places has been updated to
reflect it.

Since documentation for labels has been updated,
Part of this PR is to improve the description of possible
labels that could be used during the build.
The description grew from a small number of labels to a
"wall of text" that was difficult to read. This PR reformats it in the
form of table that makes it far easier to see actions that the
maintainer can do and what labels should be set for each of them.
  • Loading branch information
potiuk authored Oct 13, 2024
1 parent 8b1c57a commit 51f4302
Show file tree
Hide file tree
Showing 23 changed files with 179 additions and 114 deletions.
7 changes: 7 additions & 0 deletions .github/workflows/additional-ci-image-checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
canary-run:
description: "Whether this is a canary run (true/false)"
required: true
Expand Down Expand Up @@ -112,6 +116,7 @@ jobs:
use-uv: "true"
include-success-outputs: ${{ inputs.include-success-outputs }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
if: inputs.branch == 'main'

# Check that after earlier cache push, breeze command will build quickly
Expand Down Expand Up @@ -168,3 +173,5 @@ jobs:
# use-uv: "true"
# upgrade-to-newer-dependencies: ${{ inputs.upgrade-to-newer-dependencies }}
# docker-cache: ${{ inputs.docker-cache }}
# disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
#
6 changes: 6 additions & 0 deletions .github/workflows/additional-prod-image-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
canary-run:
description: "Whether to run the canary run (true/false)"
required: true
Expand All @@ -72,6 +76,7 @@ jobs:
chicken-egg-providers: ${{ inputs.chicken-egg-providers }}
constraints-branch: ${{ inputs.constraints-branch }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
if: inputs.default-branch == 'main' && inputs.canary-run == 'true'

prod-image-extra-checks-release-branch:
Expand All @@ -89,6 +94,7 @@ jobs:
chicken-egg-providers: ${{ inputs.chicken-egg-providers }}
constraints-branch: ${{ inputs.constraints-branch }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
if: inputs.default-branch != 'main' && inputs.canary-run == 'true'

test-examples-of-prod-image-building:
Expand Down
4 changes: 4 additions & 0 deletions .github/workflows/build-images.yml
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@ jobs:
prod-image-build: ${{ steps.selective-checks.outputs.prod-image-build }}
docker-cache: ${{ steps.selective-checks.outputs.docker-cache }}
default-branch: ${{ steps.selective-checks.outputs.default-branch }}
disable-airflow-repo-cache: ${{ steps.selective-checks.outputs.disable-airflow-repo-cache }}
constraints-branch: ${{ steps.selective-checks.outputs.default-constraints-branch }}
runs-on-as-json-default: ${{ steps.selective-checks.outputs.runs-on-as-json-default }}
runs-on-as-json-public: ${{ steps.selective-checks.outputs.runs-on-as-json-public }}
Expand Down Expand Up @@ -210,6 +211,8 @@ jobs:
constraints-branch: ${{ needs.build-info.outputs.constraints-branch }}
upgrade-to-newer-dependencies: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}


generate-constraints:
name: Generate constraints
Expand Down Expand Up @@ -256,3 +259,4 @@ jobs:
upgrade-to-newer-dependencies: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
chicken-egg-providers: ${{ needs.build-info.outputs.chicken-egg-providers }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}
5 changes: 5 additions & 0 deletions .github/workflows/ci-image-build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
jobs:
build-ci-images:
strategy:
Expand Down Expand Up @@ -171,6 +175,7 @@ ${{ inputs.do-build == 'true' && inputs.image-tag || '' }}"
--python "${{ matrix.python-version }}" --platform "${{ inputs.platform }}"
env:
DOCKER_CACHE: ${{ inputs.docker-cache }}
DISABLE_AIRFLOW_REPO_CACHE: ${{ inputs.disable-airflow-repo-cache }}
INSTALL_MYSQL_CLIENT_TYPE: ${{ inputs.install-mysql-client-type }}
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ inputs.upgrade-to-newer-dependencies }}
# You can override CONSTRAINTS_GITHUB_REPOSITORY by setting secret in your repo but by default the
Expand Down
6 changes: 6 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ jobs:
outputs:
image-tag: ${{ github.event.pull_request.head.sha || github.sha }}
docker-cache: ${{ steps.selective-checks.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ steps.selective-checks.outputs.disable-airflow-repo-cache }}
affected-providers-list-as-string: >-
${{ steps.selective-checks.outputs.affected-providers-list-as-string }}
upgrade-to-newer-dependencies: ${{ steps.selective-checks.outputs.upgrade-to-newer-dependencies }}
Expand Down Expand Up @@ -208,6 +209,7 @@ jobs:
upgrade-to-newer-dependencies: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
constraints-branch: ${{ needs.build-info.outputs.default-constraints-branch }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}

wait-for-ci-images:
timeout-minutes: 120
Expand Down Expand Up @@ -264,6 +266,7 @@ jobs:
upgrade-to-newer-dependencies: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
skip-pre-commits: ${{ needs.build-info.outputs.skip-pre-commits }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}
canary-run: ${{ needs.build-info.outputs.canary-run }}
latest-versions-only: ${{ needs.build-info.outputs.latest-versions-only }}
include-success-outputs: ${{ needs.build-info.outputs.include-success-outputs }}
Expand Down Expand Up @@ -559,6 +562,7 @@ jobs:
chicken-egg-providers: ${{ needs.build-info.outputs.chicken-egg-providers }}
constraints-branch: ${{ needs.build-info.outputs.default-constraints-branch }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}

wait-for-prod-images:
timeout-minutes: 80
Expand Down Expand Up @@ -615,6 +619,7 @@ jobs:
upgrade-to-newer-dependencies: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
chicken-egg-providers: ${{ needs.build-info.outputs.chicken-egg-providers }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}
default-python-version: ${{ needs.build-info.outputs.default-python-version }}
canary-run: ${{ needs.build-info.outputs.canary-run }}
if: needs.build-info.outputs.prod-image-build == 'true'
Expand Down Expand Up @@ -670,6 +675,7 @@ jobs:
upgrade-to-newer-dependencies: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
include-success-outputs: ${{ needs.build-info.outputs.include-success-outputs }}
docker-cache: ${{ needs.build-info.outputs.docker-cache }}
disable-airflow-repo-cache: ${{ needs.build-info.outputs.disable-airflow-repo-cache }}
canary-run: ${{ needs.build-info.outputs.canary-run }}

notify-slack-failure:
Expand Down
5 changes: 5 additions & 0 deletions .github/workflows/finalize-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
include-success-outputs:
description: "Whether to include success outputs (true/false)"
required: true
Expand Down Expand Up @@ -148,6 +152,7 @@ jobs:
use-uv: "true"
include-success-outputs: ${{ inputs.include-success-outputs }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
if: inputs.canary-run == 'true'

# push-buildx-cache-to-github-registry-arm:
Expand Down
6 changes: 6 additions & 0 deletions .github/workflows/prod-image-build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
jobs:

build-prod-packages:
Expand Down Expand Up @@ -276,6 +280,7 @@ ${{ inputs.do-build == 'true' && inputs.image-tag || '' }}"
env:
PUSH: ${{ inputs.push-image }}
DOCKER_CACHE: ${{ inputs.docker-cache }}
DISABLE_AIRFLOW_REPO_CACHE: ${{ inputs.disable-airflow-repo-cache }}
DEBIAN_VERSION: ${{ inputs.debian-version }}
INSTALL_MYSQL_CLIENT_TYPE: ${{ inputs.install-mysql-client-type }}
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ inputs.upgrade-to-newer-dependencies }}
Expand All @@ -291,6 +296,7 @@ ${{ inputs.do-build == 'true' && inputs.image-tag || '' }}"
env:
PUSH: ${{ inputs.push-image }}
DOCKER_CACHE: ${{ inputs.docker-cache }}
DISABLE_AIRFLOW_REPO_CACHE: ${{ inputs.disable-airflow-repo-cache }}
DEBIAN_VERSION: ${{ inputs.debian-version }}
INSTALL_MYSQL_CLIENT_TYPE: ${{ inputs.install-mysql-client-type }}
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ inputs.upgrade-to-newer-dependencies }}
Expand Down
6 changes: 6 additions & 0 deletions .github/workflows/prod-image-extra-checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
jobs:
myssql-client-image:
uses: ./.github/workflows/prod-image-build.yml
Expand All @@ -84,6 +88,7 @@ jobs:
chicken-egg-providers: ${{ inputs.chicken-egg-providers }}
constraints-branch: ${{ inputs.constraints-branch }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}

pip-image:
uses: ./.github/workflows/prod-image-build.yml
Expand All @@ -107,3 +112,4 @@ jobs:
chicken-egg-providers: ${{ inputs.chicken-egg-providers }}
constraints-branch: ${{ inputs.constraints-branch }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
6 changes: 6 additions & 0 deletions .github/workflows/push-image-cache.yml
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,10 @@ on: # yamllint disable-line rule:truthy
description: "Docker cache specification to build the image (registry, local, disabled)."
required: true
type: string
disable-airflow-repo-cache:
description: "Disable airflow repo cache read from main."
required: true
type: string
jobs:
push-ci-image-cache:
name: "Push CI ${{ inputs.cache-type }}:${{ matrix.python }} image cache "
Expand All @@ -100,6 +104,7 @@ jobs:
DEFAULT_BRANCH: ${{ inputs.branch }}
DEFAULT_CONSTRAINTS_BRANCH: ${{ inputs.constraints-branch }}
DOCKER_CACHE: ${{ inputs.docker-cache }}
DISABLE_AIRFLOW_REPO_CACHE: ${{ inputs.disable-airflow-repo-cache }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_USERNAME: ${{ github.actor }}
Expand Down Expand Up @@ -162,6 +167,7 @@ jobs:
DEFAULT_BRANCH: ${{ inputs.branch }}
DEFAULT_CONSTRAINTS_BRANCH: ${{ inputs.constraints-branch }}
DOCKER_CACHE: ${{ inputs.docker-cache }}
DISABLE_AIRFLOW_REPO_CACHE: ${{ inputs.disable-airflow-repo-cache }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_USERNAME: ${{ github.actor }}
Expand Down
5 changes: 4 additions & 1 deletion Dockerfile.ci
Original file line number Diff line number Diff line change
Expand Up @@ -1180,7 +1180,10 @@ SHELL ["/bin/bash", "-o", "pipefail", "-o", "errexit", "-o", "nounset", "-o", "n
ARG PYTHON_BASE_IMAGE
ARG AIRFLOW_IMAGE_REPOSITORY="https://github.com/apache/airflow"

# By increasing this number we can do force build of all dependencies
# By increasing this number we can do force build of all dependencies.
# NOTE! When you want to make sure dependencies are installed from scratch in your PR after removing
# some dependencies, you also need to set "disable image cache" in your PR to make sure the image is
# not built using the "main" version of those dependencies.
ARG DEPENDENCIES_EPOCH_NUMBER="11"

# Make sure noninteractive debian install is used and language variables set
Expand Down
67 changes: 23 additions & 44 deletions dev/breeze/doc/ci/07_debugging.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,11 +21,11 @@
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Debugging CI Jobs in Github Actions](#debugging-ci-jobs-in-github-actions)
- [Debugging CI Jobs in Github Actions and changing their behaviour](#debugging-ci-jobs-in-github-actions-and-changing-their-behaviour)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# Debugging CI Jobs in Github Actions
# Debugging CI Jobs in Github Actions and changing their behaviour

The CI jobs are notoriously difficult to test, because you can only
really see results of it when you run them in CI environment, and the
Expand All @@ -39,49 +39,28 @@ difficulty is that `Build Images` workflow is `pull-request-target`
type, which means that it will always run using the `main` version - no
matter what is in your Pull Request.

There are several ways how you can debug the CI jobs when you are
maintainer.
There are several ways how you can debug the CI jobs and modify their
behaviour when you are maintainer.

When you create the PR you can set one of the labels below, also
in some cases, you need to run the PR as coming from the "apache"
repository rather than from your fork.

You can also apply the label later and rebase the PR or close/reopen
the PR to apply the label to the PR.

| Action to perform | Label to set | PR from "apache" repo |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------|:---------------------:|
| Run the build with all combinations of all<br>python, backends, kubernetes etc on PR, <br>and run all types of tests for all test<br>groups. | full tests needed | |
| Force to use public runners for the build | use public runners | |
| Debug resources used during the build for <br>parallel jobs | debug ci resources | |
| Force running PR on latest versions of<br>python, backends, kubernetes etc. when you<br>want to save resources and test only latest<br>versions | latest versions only | |
| Force running PR on minimal (default) <br>versions of python, backends, kubernetes etc.<br>in order to save resources and run tests only<br>for minimum versions | default versions only | |
| Make sure to clean dependency cache<br>usually when removing dependencies<br>You also need to increase<br> `DEPENDENCIES_EPOCH_NUMBER` in `Dockerfile.ci` | disable image cache | |
| Change build images workflows, breeze code or<br>scripts that are used during image build<br>so that the scripts can be modified by PR<br> | | Yes |
| Treat your build as "canary" build - including<br>updating constraints and pushing "main"<br>documentation. | | Yes |
| Remove any behaviour specific for the committers<br>such as using different runners by default. | non committer build | |

- When you want to tests the build with all combinations of all python,
backends etc on regular PR, add `full tests needed` label to the PR.
- When you want to test maintainer PR using public runners, add
`public runners` label to the PR
- When you want to see resources used by the run, add
`debug ci resources` label to the PR
- When you want to test changes to breeze that include changes to how
images are build you should push your PR to `apache` repository not to
your fork. This will run the images as part of the `CI` workflow
rather than using `Build images` workflow and use the same breeze
version for building image and testing
- When you want to test changes to workflows and CI scripts you can set
`all versions` label to the PR or `latest versions only`.
This will make the PR run using "all" versions of
Python, Kubernetes and the DBS. By default - unless you also change
dependencies in `pyproject.toml` or `generated/provider_dependencies.json`
such PRs will only use "default" versions of Python, Kubernetes and
DBs. This is useful when you want to test changes to the CI scripts
are not affected by the versions of Python, Kubernetes and DBs.
- Even if you change dependencies in `pyproject.toml`, or
`generated/provider_dependencies.json`, when you want to test changes to workflows
and CI scripts you can set `default versions only` label to the
This will make the PR run using the default (or latest) versions of
Python and Kubernetes and DBs. This is useful when you want to test
changes to the CI scripts and workflows and you want to use far
less resources than the full tests.
- When you want to test changes to `build-images.yml` workflow you
should push your branch as `main` branch in your local fork. This will
run changed `build-images.yml` workflow as it will be in `main` branch
of your fork
- When you are a committer and you change build images workflow, together
with build scripts, your build might fail because your scripts are used
in `build-images.yml` workflow, but the workflow is run using the `main`
version. Setting `non committer build` label will make your PR run using
the main version of the scripts and the workflow
- When you are a committer want to test how changes in your workflow affect
`canary` run, as maintainer, you should push your PR to `apache` repository
not to your fork and set `canary` label to the PR
- When you are a committer and want to test if the tests are passing if the
image is freshly built without cache, you can set `disable image cache` label.

-----

Expand Down
Loading

0 comments on commit 51f4302

Please sign in to comment.