Update to NumPy v2 and bump versions for several dependent packages #4925

Merged
merged 115 commits into from
Dec 19, 2024

Conversation

agriyakhetarpal
Copy link
Member

@agriyakhetarpal agriyakhetarpal commented Jul 10, 2024

Description

This PR bumps the NumPy version in-tree to version 2.0.0, released on June 16, 2024. A lot of packages here are expected to fail with their builds because there has been an ABI change in NumPy, and many packages are being tracked in numpy/numpy#26191 – let us hope there won't be too many of them here. I would suggest that this goes into v0.27 rather than 0.26.2, not that there is a lot of time in 0.26.2 anyway... I shall try my best to update as many failing packages as I can in this PR, or open separate PRs for each of them if that would be better. I should be able to get an estimate once the CI runs. Versions of packages that are compatible with v2.0.0rc1 should be ABI-compatible with NumPy v2.

Checklists

@agriyakhetarpal
Copy link
Member Author

First failure is with building NumPy itself

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpn7pa0ha3/numpy-2.0.0/numpy/core/include/numpy/numpyconfig.h'

which is simply because the header now lives in _core/ rather than core/. Similarly, _numpyconfig.h will be created there.

@agriyakhetarpal
Copy link
Member Author

The new error shows that the static libs are placed in newer paths as well.

From https://github.com/numpy/numpy/blob/2176571e1c7e5040df7a3cdfd9f7d859f4feb014/numpy/_core/npymath.ini.in#L18 I see that it moved to _core/ as well, but libnpyrandom is at the same location: https://github.com/numpy/numpy/blob/2176571e1c7e5040df7a3cdfd9f7d859f4feb014/numpy/random/meson.build#L16
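The layout change described in the last two comments can be sketched as a small lookup that tries the NumPy 2 location first and falls back to the old one. This is only an illustration, not the change made in this PR: the `locate` helper and the `CANDIDATES` table are hypothetical, and the `include/numpy` and `lib` subdirectory names are assumptions based on the paths quoted above.

```python
from pathlib import Path

# Relative directories inside a NumPy tree, newest layout first.
# Assumption: headers sit under "include/numpy" and static libs under "lib".
CANDIDATES = {
    "numpyconfig.h": ["_core/include/numpy", "core/include/numpy"],
    "libnpymath.a": ["_core/lib", "core/lib"],
    # Per the comment above, libnpyrandom did not move.
    "libnpyrandom.a": ["random/lib"],
}


def locate(numpy_root, filename):
    """Return the first existing path for `filename`, or None if not found."""
    for relative_dir in CANDIDATES.get(filename, []):
        candidate = Path(numpy_root) / relative_dir / filename
        if candidate.is_file():
            return candidate
    return None
```

Calling `locate("/path/to/numpy", "numpyconfig.h")` on a NumPy 2 tree would resolve to the `_core/` copy, while the same call on a 1.x tree would fall through to `core/`.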

@agriyakhetarpal
Copy link
Member Author

I got NumPy to build with not a lot of effort, which is great. Next, I'm going to the NumPy dependents. SciPy v1.13.0 is the first version to support NumPy v2 (#4719), which I need to return to after #4920 gets merged. I'll explore the rest of the dependents right now.

@ryanking13
Copy link
Member

> I got NumPy to build with not a lot of effort, which is great. Next, I'm going to the NumPy dependents. SciPy v1.13.0 is the first version to support NumPy v2 (#4719), which I need to return to after #4920 gets merged. I'll explore the rest of the dependents right now.

Cooool! That's awesome. I agree that this should go to 0.27. While we try our best to update packages as much as possible to support Numpy 2.0, I think it is also an option to disable some of them temporarily if they do not support Numpy 2.0.

@agriyakhetarpal
Copy link
Member Author

agriyakhetarpal commented Jul 11, 2024

> Cooool! That's awesome. I agree that this should go to 0.27. While we try our best to update packages as much as possible to support Numpy 2.0, I think it is also an option to disable some of them temporarily if they do not support Numpy 2.0.

@ryanking13, thanks for your input. I can look into disabling packages once I have made further progress. Would you know of a way to reliably run the Selenium code in test_numpy.py inside the provided Docker container? I seem to run into errors with pyodide_dist_dir every time, even when I spin up a fresh container. Running locally could help me debug faster.

@ryanking13
Copy link
Member

Could you share the error you get? Some of the package versions in our Docker image (e.g. pytest) are outdated, so they may cause some errors, but I am not sure.

@agriyakhetarpal
Copy link
Member Author

Ah, I see it's a very standard error:

$ python packages/numpy/test_numpy.py

which gets me

Traceback (most recent call last):
  File "/src/packages/numpy/test_numpy.py", line 363, in <module>
    @run_in_pyodide(packages=["numpy"])
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/.docker_home/.local/lib/python3.12/site-packages/pytest_pyodide/decorator.py", line 408, in __init__
    pytest_assert_rewrites and package_is_built("pytest")
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/.docker_home/.local/lib/python3.12/site-packages/pytest_pyodide/decorator.py", line 22, in package_is_built
    return _package_is_built(package_name, pytest.pyodide_dist_dir)
                                           ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'pytest' has no attribute 'pyodide_dist_dir'

I upgraded pytest and pytest-pyodide but didn't get anywhere with that. I'll spend some time debugging the JS interface error I just received.

@agriyakhetarpal
Copy link
Member Author

The NumPy test suite is passing (I'm not sure if this is the best way to handle the buffer), but I'll move on to the rest now.

@agriyakhetarpal
Copy link
Member Author

scikit-learn v1.4.2 was the first one to include support for NumPy v2.0.0, but we already have v1.5.0 in the main branch. Should I bump it to the latest release here, i.e., v1.5.1?

@agriyakhetarpal
Copy link
Member Author

agriyakhetarpal commented Jul 12, 2024

Here, I have updated OpenCV (opencv-python) and yt. As mentioned above, scikit-learn is already at a compatible version, so it seems that bumping SciPy to v1.13.0 (#4719) is the only issue holding up this PR, which is waiting on #4920. I assume that the versions of the rest of the packages we have right now will have been built against NumPy v2, but if they haven't been, I'll take care of that.
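As a rough way to triage the dependents, the minimum versions mentioned so far in this thread can be put in a small table and compared against the versions in the recipes. This is a sketch only: `MIN_NUMPY2_VERSIONS`, `parse_version`, and `supports_numpy2` are hypothetical names, the table contains just the two packages whose first NumPy v2 compatible release is stated above, and the parser assumes plain numeric versions such as `1.5.1`.

```python
# First releases reported in this thread as supporting NumPy v2.
MIN_NUMPY2_VERSIONS = {
    "scipy": (1, 13, 0),
    "scikit-learn": (1, 4, 2),
}


def parse_version(text):
    """Turn '1.5.1' into (1, 5, 1). Handles plain numeric releases only."""
    return tuple(int(part) for part in text.split("."))


def supports_numpy2(package, version):
    """Return True or False when a minimum is known, otherwise None."""
    minimum = MIN_NUMPY2_VERSIONS.get(package)
    if minimum is None:
        return None  # no minimum stated in this thread
    return parse_version(version) >= minimum
```

For example, `supports_numpy2("scikit-learn", "1.5.0")` is true, while a SciPy version below 1.13.0 is flagged as needing a bump.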

@ryanking13
Copy link
Member

> python packages/numpy/test_numpy.py

I think you need to call pytest not python.

pytest packages/numpy/test_numpy.py

FYI: I generally prefer verbose mode + chrome browser when testing locally.

pytest -v --rt=chrome packages/numpy/test_numpy.py
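A plausible explanation for the `AttributeError` in the traceback is that `pyodide_dist_dir` is attached to the `pytest` module by the pytest-pyodide plugin while pytest starts up, so it does not exist when the file is executed with plain `python`. The following is a simplified model of that failure mode, not pytest-pyodide's actual code: `fake_pytest`, `plugin_configure`, and this `package_is_built` are stand-ins invented for illustration.

```python
import types

# Stand-in for the real `pytest` module; nothing here imports pytest.
fake_pytest = types.ModuleType("fake_pytest")


def plugin_configure(module, dist_dir):
    """Hypothetical startup hook: what a plugin might do when the runner starts."""
    module.pyodide_dist_dir = dist_dir


def package_is_built(module, name):
    """Simplified lookup that reads the attribute the hook is expected to set."""
    return f"{module.pyodide_dist_dir}/{name}"


# Executing the test file directly: the hook never ran, so the lookup fails.
try:
    package_is_built(fake_pytest, "pytest")
    direct_run_error = None
except AttributeError as exc:
    direct_run_error = str(exc)

# Going through the runner: the hook runs first, then the lookup succeeds.
plugin_configure(fake_pytest, "dist")
via_runner = package_is_built(fake_pytest, "pytest")
```

Under this model, decorators that are evaluated at import time (such as `@run_in_pyodide` in the traceback) only work once the runner has configured its plugins, which is consistent with invoking `pytest` rather than `python`.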

@ryanking13
Copy link
Member

> scikit-learn v1.4.2 was the first one to include support for NumPy v2.0.0, but we already have v1.5.0 in the main branch. Should I bump it to the latest release here, i.e., v1.5.1?

Updating the package version is not required, but updates are always welcome. If you're having trouble updating, you can leave it for later.

> it seems that bumping SciPy to v1.13.0 (#4719) is the only issue holding up this PR

Awesome! Thank you for updating these packages!

@agriyakhetarpal
Copy link
Member Author

Updated scikit-learn to version 1.5.1 in 4ebc115 – tagging the recipe and package maintainer @lesteve for visibility

@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review December 18, 2024 13:38
@agriyakhetarpal
Copy link
Member Author

Looks like we are good to merge now. Surfacing @ryanking13's suggestion from above:

> I would suggest 1) disable pyarrow in this PR and merge, 2) build pyarrow with latest xbuildenv (we publish it to http://pyodide-cache.s3-website-us-east-1.amazonaws.com/xbuildenv/dev/xbuildenv.tar.bz2). 3) update pyarrow in a separate PR.

I'll be doing this in follow-up PRs for PyArrow and LightGBM, and we don't need to wait for them for this PR. Thanks!

@agriyakhetarpal
Copy link
Member Author

I confirmed that all tests for all packages are passing by temporarily removing the --skip-passed flag we have in pytest_wrapper.py so that every test runs. I've reverted that commit now. If there are any reports of packages that don't work correctly, we always have the provision for a 0.27.1 release.

@agriyakhetarpal agriyakhetarpal changed the title NumPy v2 Update to NumPy v2 and bump versions for several dependents Dec 18, 2024
Copy link
Member

@hoodmane hoodmane left a comment

Thanks so much @agriyakhetarpal for all your hard work on this!

@agriyakhetarpal
Copy link
Member Author

Actually, PyArrow requires a small patch to build it with NumPy 2.0, which I am working on upstreaming.

@agriyakhetarpal
Copy link
Member Author

agriyakhetarpal commented Dec 19, 2024

I started https://github.com/pyodide/pyodide-numpy-2.0-rebuilds as a repository with my own account to build LightGBM and PyArrow separately, and upon transfer to the Pyodide organisation, GitHub decided to revoke my access to the repository's settings completely for some reason, just before I was going to finalise and release the built wheels (it's much better for the community for the wheels to point to a Pyodide repository rather than mine). We can archive the repository a bit later when we have a permalink to release artifacts.

We also have the option of using the "dev" xbuildenv, as @ryanking13 had mentioned in #4925 (comment). The only reason I didn't pursue this is that it, too, requires the CI job on the main branch to pass fully, and LightGBM is currently failing there again. We therefore cannot rely on it, and we can't keep re-triggering the job until it does pass.

@hoodmane, could you please make me an administrator for that repository (or grant the "core" team administrator access to repositories, whichever is simpler) so that I can complete the rest of the work? I will never understand teams on GitHub 😅

@ryanking13
Copy link
Member

@agriyakhetarpal I gave you admin role for pyodide-numpy-2.0-rebuilds repository!

@agriyakhetarpal
Copy link
Member Author

> @agriyakhetarpal I gave you admin role for pyodide-numpy-2.0-rebuilds repository!

Thanks! I went ahead with it, and a release is now up: https://github.com/pyodide/pyodide-numpy-2.0-rebuilds/releases/tag/v1.0.0. Attestations to verify the build provenance of the wheels are also included. I guess we can go ahead with merging this PR because all tests are passing, and focus on adding these two wheels in follow-ups (#5265 and #5266).

@agriyakhetarpal agriyakhetarpal changed the title Update to NumPy v2 and bump versions for several dependents Update to NumPy v2 and bump versions for several dependent packages Dec 19, 2024
@agriyakhetarpal agriyakhetarpal merged commit e6d4bae into pyodide:main Dec 19, 2024
39 of 40 checks passed
@agriyakhetarpal
Copy link
Member Author

Thank you to everyone involved with the code review and guidance, and especially to the recipe maintainers for their relentless assistance with package updates.

@agriyakhetarpal agriyakhetarpal deleted the update/numpy-v2 branch December 19, 2024 13:56
@agriyakhetarpal agriyakhetarpal mentioned this pull request Dec 19, 2024
3 tasks
kou pushed a commit to apache/arrow that referenced this pull request Dec 20, 2024
…pten, and update Pyodide-related documentation (#45072)

### Rationale for this change

This change would allow building PyArrow correctly with NumPy 1.X and NumPy 2.X, since we are trying to do the latter for pyodide/pyodide#4925. This PR closes gh-45071.

### What changes are included in this PR?

This PR
- issues a correction for the NumPy header files when building under Emscripten
- updates Pyodide-related build instructions

### Are these changes tested?

Yes, working here: https://github.com/agriyakhetarpal/pyodide-numpy-2.0-rebuilds/actions/runs/12399351376/job/34619554658#step:8:4547 via agriyakhetarpal/pyodide@b651698 that applies a subset of the changes as a patch (the CI job is failing for unrelated reasons, please ignore).

### Are there any user-facing changes?

Yes, users trying to build a WASM wheel via Pyodide are now requested to use newer Pyodide and Emscripten versions, and the latest stable version of `pyodide-build` available.

* GitHub Issue: #45071

Authored-by: Agriya Khetarpal <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
@hoodmane hoodmane mentioned this pull request Jan 6, 2025