Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from opendatahub-io:main #502

Open
wants to merge 69 commits into
base: main
Choose a base branch
from

Conversation

pull[bot]
Copy link

@pull pull bot commented Jan 30, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

jiridanek and others added 30 commits January 8, 2025 17:42
…the test, which can be run with ci testing
[Digest Updater Action] Update Notebook Images
…notebooks

Our testing framework had a fundamental issue in that it would, for certain images, naively run the test suite of parent images for a child image.  In the case of `tensorflow` images, this caused problems because the (child) `tensorflow` image would **purposefully** _downgraded_ a package version given compatability issues. Furthermore, when attempting to verify the changes to address this core issue, numerous other bugs were encountered that required to be fixed.

Significant changes introduced in this commit:
- logic of the `test-%` Makefile target was moved into a shell script named `test_jupyter_with_papermill.sh`
- test resources required to be copied to pod are now retrieved via the locally checked out repo
    - previously there were pulled from a remote branch via `wget` (defaulting to `main` branch)
    - this ensure our PR checks are now always leveraging any updated files
- test_notebook.ipynb files now expect an `expected_versions.json` file to exist within the same directory.
    - `expected_versions.json` file is derived from the relevant `...-notebook-imagestream.yaml` manifest and leveraged when asserting on dependency versions
		- admittedly the duplication of various helper functions across all the notebook files is annoying - but helps to keep the `.ipynb` files self-contained
- `...-notebook-imagestream.yaml` manifest had annotations updated to include any package that had a unit test in `test_notebook.ipynb` asserting versions
- CUDA tensorflow unit test that converts to ONNX was updated to be actually functional

Minor changes introduced in this commit:
- use `ifdef OS` in Makefile to avoid warnings about undefined variable
- all `test_notebook.ipynb` files:
    - have an `id` attribute defined in metadata
		- specify the `nbformat` as `4.5`
- the more compute-intensive notebooks had `readinessProbe` and `livenessProbe` settings updated to be less aggressive
    - was observing liveness checks sporadically failing while the notebook was being tested - and this update seemed to help
- `trustyai` notebook now runs the minimal and datascience papermill tests (similar to `tensorflow` and `pytorch`) instead of including the test code within its own `test_noteook.ipynb` file
- various "quality of life" improvements where introduced into `test_jupyter_with_papermill.sh`
    - properly invoke tests for any valid/supported target
        - previously certain test targets required manual intervention in order to run end to end
    - improved logging (particularly when running with the `-x` flag)
		- more modular structure to hopefully improve readability
		- script now honors proper shell return code semantics (i.e. returns `0` on success)

It should also be noted that while most `intel` notebooks are now passing the papermill tests - there are issues with the `intel` `tensorflow` unit tests still.  More detail is captured in the JIRA ticket related to this commit.  However, we are imminently removing `intel` logic from the `notebooks` repo entirely... so I didn't wanna burn any more time trying to get that last notebook to pass as it will be removed shortly!

Further details on `expected_versions.json`:
- `yq` (which is installed via the `Makefile` is used to:
    - query the relevant imagestream manifest to parse out the `["opendatahub.io/notebook-software"]` and `["opendatahub.io/notebook-python-dependencies"]` annotations from the first element of the `spec.tags` attribute
		- inject name/version elements for `nbdime` and `nbgitpuller` (which are asserted against in `minimal` notebook tests)
		- convert this `yaml` list to a JSON object of the form: `{<package>: <version>}`
- this JSON object is then copied into the running notebook workload in the same directly that `test_notebook.ipynb` resides
- each `test_notebook.ipynb` has a couple helper functions defined to then interact with this file:
    - `def load_expected_versions() -> dict:`
		- `def get_expected_version(dependency_name: str) -> str:
- The argument provided to the `get_expected_version` function should match the `name` attribute of the JSON structure defined in the imagestream manifest

Related-to: https://issues.redhat.com/browse/RHOAIENG-16587
RHOAIENG-16587: fix(test): ensure papermill tests run successfully for all supported notebooks
RHOAIENG-17695: chore(ci): create a test for calling `oc version` in the test, which can be run with ci testing
…n't have too many debug logs there from passing tests

This way full logs are only printed for failing tests, see https://docs.pytest.org/en/stable/how-to/logging.html
RHOAIENG-17695: chore(ci): capture logs in pytest tests so that we don't have too many debug logs there from passing tests
…ted on OCP-CI so broken intel notebooks should be deleted (#843)

* RHOAIENG-8388: rm(intel): Intel tensorflow notebook failed to get tested on OCP-CI

The Intel workbench images are broken in multiple ways and since we don't ship them, let's not even carry them on the books.

* fixup, remove intel references from test_jupyter_with_papermill.sh
RHOAIENG-17305: chore(tests): IPv4 compatibility test(s), checking that we did not break the single-stack IPv4 case
…at's now causing the script to fail

```
./scripts/test_jupyter_with_papermill.sh: line 345: jupyter_ml_notebook_id: unbound variable
```
RHOAIENG-8388: rm(intel): fixup to remove one forgotten intel case that's now causing the script to fail
RHOAIENG-17257: chore(tests): add allure dependency so that we can have fancy html report
This is useful in some tests when it's not necessary to actually start the IDE to test something.
RHOAIENG-17257: chore(tests): make the wait for readiness optional
RHOAIENG-17257: chore(tests): externalize the default workbench-starting arguments
…596842

[Codeflare Action] Update notebook's pipfile to sync with Codeflare-SDK release 0.26.0
…c-updater-12886596842

Update manifests after odh-sync-updater-12886596842 CodeFlare GitHub action ran
@pull pull bot added ⤵️ pull merge-conflict Resolve conflicts manually labels Jan 30, 2025
@openshift-ci openshift-ci bot requested review from atheo89 and jiridanek January 30, 2025 14:57
Copy link

openshift-ci bot commented Jan 30, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jiridanek for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot
Copy link
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link

openshift-ci bot commented Jan 30, 2025

Hi @pull[bot]. Thanks for your PR.

I'm waiting for a red-hat-data-services member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

RHOAIENG-9707: chore(tests/containers): check shared objects with ldd
@openshift-merge-robot
Copy link
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

jiridanek and others added 2 commits January 31, 2025 11:00
RHOAIENG-9707: chore(tests/containers): try to install the cowsay package to check the python venv is writable
@jiridanek
Copy link
Member

jiridanek commented Jan 31, 2025

@atheo89 hi, this sync bot does not work well because of these conflicts.

I tried to run the sync github action, but that failed too as it encountered conflicts it can't resolve

so I created a sync pr by hand here

@atheo89
Copy link
Member

atheo89 commented Jan 31, 2025

I tried to run the sync github action, but that failed too as it encountered conflicts it can't resolve

Yeah, because these GHA work on resolving the known conflicts, Now it found unexpected conflicts and it broke

Unresolved conflicts detected in the following files:
100644 90fed6cbc77c0da38a9ce625183577a4bff189e2 1	.github/workflows/build-notebooks.yaml
100644 053db9adcb3e11909be2a96465ddafc33337dfa5 2	.github/workflows/build-notebooks.yaml
100644 565a483577d997bb87711bb3a9e2f64b45a300ee 3	.github/workflows/build-notebooks.yaml
100644 88d4fcd770929bfb0d9c74c721662858f899be74 1	poetry.lock
100644 13776e3e034ee76399e36dbd0765dea2f176867d 2	poetry.lock
100644 a4924bf16b024534222d6aa[81](https://github.com/red-hat-data-services/notebooks/actions/runs/13072161052/job/36475973945#step:6:82)934de7e548688dd 3	poetry.lock

Also, I removed the Pull app from the repo so it will be quite from now on..

jiridanek and others added 10 commits January 31, 2025 20:09
This adds a new check for the content of the workbench startup logs
searching for some keywords that may indicate some problem or issue.
The TrustyAI package has wrong version of transformers package in the
ImageStream manifest file compared to what is actually installed on the
image.

This fixes https://issues.redhat.com/browse/RHOAIENG-19388.
…less podman machine

```
base_image_test.py:146: in test_oc_command_runs_fake_fips
    assert ecode == 0, output.decode()
E   AssertionError: assertion failed [!result.is_error]: Unable to open /proc/sys/vm/mmap_min_addr
E     (VMAllocationTracker.cpp:317 init)
E
E   assert 137 == 0
```

```
lima cat /proc/sys/vm/mmap_min_addr
65536
```

```
podman machine ssh cat /proc/sys/vm/mmap_min_addr
65536
```

```
podman run --entrypoint /bin/bash --rm -it ghcr.io/jiridanek/notebooks/workbench-images:base-ubi9-python-3.11-jd_ubi_base_1e8dd3140d980ff573d56d3ae746959f31825d8a
WARNING: image platform (linux/amd64) does not match the expected platform (linux/arm64)
bash-5.1$ cat /proc/sys/vm/mmap_min_addr
65536
```
RHOAIENG-18979: chore(test/containers): check workbench startup logs
RHOAIENG-19388: fix the transformers version in manifests
NO-ISSUE: chore(tests/containers): fix fake fips tests for macOS rootless podman machine
* RHOAIENG-18930: update Jupyterlab package to 4.2.7

Main motivation for this upgrade is the fix of the [1] in the 4.2.7 [2]
release. This should fix the [3] then and the Extension manager can work
as expected again now.

* [1] jupyterlab/jupyterlab#17113
* [2] https://github.com/jupyterlab/jupyterlab/releases/tag/v4.2.7
* [3] https://issues.redhat.com/browse/RHOAIENG-18930

* Update Pipfile.lock files by piplock-renewal.yaml action

---------

Co-authored-by: GitHub Actions <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants