CI: parametrize over Array API modules #298

Open
ev-br wants to merge 1 commit into master from api_modules_on_ci

Conversation

ev-br (Contributor) commented Nov 14, 2024

towards #235

ev-br force-pushed the api_modules_on_ci branch 2 times, most recently from f742ae2 to 429b32a on November 14, 2024 09:38
ev-br changed the title from "CI: parametrize over the Array API modules" to "CI: parametrize over Array API modules" on Nov 14, 2024
ev-br force-pushed the api_modules_on_ci branch 2 times, most recently from 21bed6b to cf262c8 on November 14, 2024 10:02
Keep testing with array-api-strict, add numpy, torch and jax.numpy
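In practical terms, parametrizing CI over Array API modules mostly means letting each job pick the namespace under test. Below is a minimal sketch of one way to do that, selecting the module from an environment variable; the ARRAY_API_TESTS_MODULE name and the array_api_strict fallback are assumptions based on how this suite is commonly configured, not necessarily what this PR implements.

```python
# Sketch: resolve the array namespace under test from an environment variable,
# so a CI matrix can vary one setting per job instead of hard-coding a library.
# ARRAY_API_TESTS_MODULE and the fallback default are assumptions.
import importlib
import os

def get_array_module():
    """Import the array namespace named by ARRAY_API_TESTS_MODULE (assumed name)."""
    name = os.environ.get("ARRAY_API_TESTS_MODULE", "array_api_strict")
    return importlib.import_module(name)

xp = get_array_module()  # e.g. array_api_strict, numpy, torch, or jax.numpy
```

Each CI job would then only need to set that variable (for example to jax.numpy) before invoking pytest.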
ev-br (Contributor, Author) commented Nov 14, 2024

This seems to be conceptually wrong. Now, what do we actually want to be testing here:

  • a pedantic implementation of the spec? then we only need to run array-api-strict, as before
  • whether specific array libraries implement the standard? They don't, as the failures show, and isn't that what array-api-compat is for?
  • whether the array-api-compat wrappers are compatible with the spec? then we need to add -compat as a dependency (see the sketch after this list).
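To make the -compat option above concrete, here is a rough sketch of what testing through the wrappers rather than the bare libraries would mean; it assumes array-api-compat's array_namespace helper and wrapped namespaces, so treat the exact imports as something to double-check against the compat documentation.

```python
# Rough sketch: route spec-level calls through array-api-compat's wrapped
# namespace instead of the bare library's own namespace.
import numpy as np
from array_api_compat import array_namespace

x = np.asarray([1.0, 2.0, 3.0])
xp = array_namespace(x)   # compat-wrapped namespace corresponding to x
y = xp.mean(x)            # call goes through the wrapper, not bare numpy
```

Testing at this level would exercise the wrappers' conformance rather than the underlying library's, which is exactly the distinction the list above is drawing.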

I'm sure this was discussed before, so I'm going to stop for now :-).

EDIT: or maybe something like this could be useful for reporting, if we ever want to autogenerate compatibility reports for bare Array API namespaces.

asmeurer (Member) commented

The thing we really want to test here is that the tests themselves don't break whenever we update them or write new ones. This is especially important if we want to do some more serious refactoring here. The test suite is a little odd in that we will often "break" upstream by adding a test that fails because the upstream library is not conformant in a way that simply had never been tested before. But we don't want to "break" upstream by adding a test that is actually wrong, or that tests something in a way that isn't actually required by the standard (for example, a floating-point closeness test that is too strict).

So when it comes down to it, what should go on CI here is really a matter of developer convenience. The test suite will eventually be run upstream anyway, by the upstreams themselves (and compat is probably the primary upstream, since it tests four different libraries). On the one hand, testing more libraries here will catch more things. On the other hand, to do that effectively we would have to maintain a set of skips/xfails here, because many failures are expected; this is the case even for compat. We could reuse the xfail files from the compat repo, although that would effectively amount to duplicating the compat repo's CI here. The benefit would be catching failures faster.
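As a concrete illustration of what maintaining such a set of skips/xfails can look like, here is a hypothetical conftest.py hook that reads a per-library list of failing test IDs and marks them xfail at collection time; the file name, its format, and the hook itself are illustrative assumptions, not this suite's actual skip machinery.

```python
# Hypothetical conftest.py: mark known-failing tests as xfail based on a
# plain-text list of pytest node IDs maintained per array library.
import pathlib
import pytest

XFAILS_FILE = pathlib.Path("ci/numpy-xfails.txt")  # hypothetical path

def _load_xfails():
    """Return the set of node IDs listed in the xfails file, ignoring comments."""
    if not XFAILS_FILE.exists():
        return set()
    entries = set()
    for raw in XFAILS_FILE.read_text().splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            entries.add(line)
    return entries

def pytest_collection_modifyitems(config, items):
    """Attach an xfail marker to every collected test that appears in the list."""
    xfails = _load_xfails()
    for item in items:
        if item.nodeid in xfails:
            item.add_marker(pytest.mark.xfail(reason="known upstream failure"))
```

Whether such lists live here or in the compat repo is exactly the duplication trade-off described above.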

But also consider that this repo doesn't have releases. If something breaks, you can very easily revert it.

The strict library is nice because it has almost no xfails. Plain NumPy 2.1 is like that as well, although that could change whenever new functions get added to the standard.

For now, I've made heavy use of local testing against various libraries when doing development here. If you feel that having that on CI would be better, then you should set it up; it really comes down to what's most convenient. That would mean either maintaining a set of xfails, or ignoring failures entirely and just using CI as a vehicle to run the tests. Either way, you should definitely check any updated test against the major libraries.

asmeurer (Member) commented

Regarding reporting, there was a lot of work on this previously (see reporting.py), and there was much discussion about whether the tests should be run here or in the upstream projects. The idea at the time was to run the tests in the upstream projects and have them upload reports.

That work will probably start up again at some point. For what it's worth, I now feel that the approach of running the reporting tests in the upstream repos was getting too complicated, so I would suggest that running them here, or in some other data-apis-controlled repo, would be a much better approach.
