[Data] [1/n] Fix dependency related Ray Data tests for Python 3.12 #46545

Closed
scottjlee wants to merge 13 commits

Conversation

scottjlee
Contributor

@scottjlee scottjlee commented Jul 10, 2024

Why are these changes needed?

Fixes the following tests:

  • python/ray/data:test_mars

    • Fix: mark as manual, since mars library doesn't support Python 3.11+
  • linux://python/ray/data:test_formats

    • Fix: pin tensorflow-datasets==4.9.6
  • linux://python/ray/data:test_execution_optimizer

  • linux://doc/source/data/examples:stablediffusion_batch_prediction

    • A fix was attempted by changing the torch version installed in the notebook from torch<2 to torch==2.3.0, but this causes an OOM/timeout in the CI docs test for an unknown reason.
  • linux://python/ray/data:test_tf

    • test_tf.py::TestToTF::test_element_spec_shape_with_ragged_tensors
    • test_tf.py::TestToTF::test_training: Still broken by breaking changes in the Keras 2 -> 3 upgrade, which the TF 2.16.1 upgrade forces. We will need to file an issue with the Keras team to investigate and resolve this.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Scott Lee <[email protected]>
@scottjlee scottjlee added the "go" label (add ONLY when ready to merge, run all tests) Jul 10, 2024
Collaborator

@can-anyscale can-anyscale left a comment

fixing a lot of tests w00h00

# 4.9.6 fixes the following error with Python 3.12:
# `RecursionError: maximum recursion depth exceeded`
# See: https://github.com/tensorflow/datasets/issues/4666#issuecomment-2149200103
tensorflow-datasets==4.9.6
Collaborator

w00t, you would need to update here too https://github.com/ray-project/ray/blob/ray-py312/python/requirements_compiled.txt#L1894 otherwise pip will complain during installation

@can-anyscale
Collaborator

@scottjlee
Contributor Author

> might need to update https://github.com/ray-project/ray/blob/ray-py312/python/requirements_compiled.txt#L1894 as well

thanks, i was hoping to use the output generated from build: pip-compile dependencies. or is it acceptable to just directly modify the corresponding line in requirements_compiled.txt? i was thinking that this may potentially impact other dependencies, so wanted to be sure.

@can-anyscale
Collaborator

ah got you, you're doing it right, forget what i was saying
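
The concern in this thread is that a pin in a requirements file and the matching line in requirements_compiled.txt must agree, otherwise pip reports a conflict at install time. A minimal sketch of such a consistency check follows. The helper names `parse_pins` and `find_conflicts` are hypothetical, the input strings are illustrative, and the older version number in the compiled text is made up for the example; none of this is part of the Ray CI tooling.

```python
import re
from typing import Dict, List, Tuple

# Matches lines of the form `name==version`; comment lines do not match.
_PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def parse_pins(text: str) -> Dict[str, str]:
    """Collect `name==version` pins, ignoring comments and unpinned lines."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        match = _PIN.match(line)
        if match:
            # Normalize names so `Foo_Bar` and `foo-bar` compare equal.
            name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
            pins[name] = match.group(2)
    return pins


def find_conflicts(requirements: str, compiled: str) -> List[Tuple[str, str, str]]:
    """Return (name, requested, compiled) for every pin that disagrees."""
    wanted = parse_pins(requirements)
    locked = parse_pins(compiled)
    return [
        (name, version, locked[name])
        for name, version in sorted(wanted.items())
        if name in locked and locked[name] != version
    ]


# Illustrative inputs only; the compiled version below is a made-up stale pin.
requirements_txt = """
# 4.9.6 fixes a RecursionError with Python 3.12
tensorflow-datasets==4.9.6
"""
compiled_txt = """
absl-py==1.4.0
    # via tensorflow-metadata
tensorflow-datasets==4.9.3
"""

conflicts = find_conflicts(requirements_txt, compiled_txt)
print(conflicts)
```

Regenerating the compiled file from CI, as done in this PR, avoids hand-editing a single line and keeps transitive dependencies consistent; a check like this only flags the mismatch.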

scottjlee and others added 6 commits July 10, 2024 16:00
Signed-off-by: Scott Lee <[email protected]>
## Why are these changes needed?
To upgrade to py312 we need to upgrade to `pandas>=2.0.0`. This upgrade
introduced a breaking change in the syntax of some of our code / tests:
> Changed behavior in setting values with df.loc[:, foo] = bar or
df.iloc[:, foo] = bar, these now always attempt to set values inplace
before falling back to casting ([GH
45333](pandas-dev/pandas#45333))

As a result we needed to update these uses of loc to do direct
assignment.
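
A minimal sketch of the two patterns, assuming a toy DataFrame rather than the actual Ray Data code or tests touched by this commit. The column name and values are made up for illustration. The dtype left behind by the `.loc` write depends on the installed pandas version, so the script prints it instead of assuming it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})      # column "a" starts as int64
new_values = np.array([1.0, 2.0, 3.0])   # float64 values

# Full-slice .loc write: pandas >= 2.0 first tries to write into the existing
# column in place, so the column's dtype may be kept rather than replaced.
via_loc = df.copy()
via_loc.loc[:, "a"] = new_values

# Direct assignment: the column is replaced, so it takes the dtype of new_values.
via_assign = df.copy()
via_assign["a"] = new_values

print(pd.__version__, via_loc["a"].dtype, via_assign["a"].dtype)
```

Direct assignment gives the same resulting dtype across pandas versions, which is why it is the safer choice when the intent is to replace a whole column.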

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a method in Tune, I've added it in `doc/source/tune/api/` under the
corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Matthew Owen <[email protected]>
Signed-off-by: Scott Lee <[email protected]>
@@ -14,15 +14,15 @@ absl-py==1.4.0
# tensorflow-metadata
# tensorflow-probability
accelerate==0.28.0
-# via -r /home/ubuntu/ray/ci/../python/requirements/ml/core-requirements.txt
+# via -r /ray/ci/../python/requirements/ml/core-requirements.txt
Contributor Author

hmm, i used the output from here, but looks like it updated all these paths to /ray/ci? is this the correct way @can-anyscale ?

Collaborator

hi yes this is ok; the /home/ubuntu was because i generated this file from my local machine and not using the one from CI

Collaborator

@can-anyscale can-anyscale left a comment

Contributor

@omatthew98 omatthew98 left a comment

lgtm from the data side

@scottjlee
Contributor Author

Closing in favor of #46730

@scottjlee scottjlee closed this Jul 23, 2024