Strided vector functions #2210

Open
fredrik-johansson opened this issue Jan 29, 2025 · 1 comment


fredrik-johansson commented Jan 29, 2025

It would be useful to have strided versions of a subset of the _gr_vec functions, allowing efficient application to columns and diagonals of matrices, reversed vectors, odd-even subvectors, etc. For example, one could have an assignment function like this:

_gr_vec_set_strided(res, x, xstride, y, ystride, len, ctx)
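As a rough illustration of the access patterns mentioned above, here is a minimal sketch on plain `long` entries rather than generic ring elements. The name `vec_set_strided`, its parameter layout (one stride per operand, no `ctx`) and the three helpers are hypothetical; they are not FLINT functions and do not exactly match the signature proposed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided copy: res[i * rstride] = x[i * xstride].
   Strides are counted in elements and may be negative. */
static void
vec_set_strided(long *res, ptrdiff_t rstride,
                const long *x, ptrdiff_t xstride, long len)
{
    long i;
    assert(len >= 0);
    for (i = 0; i < len; i++)
        res[i * rstride] = x[i * xstride];
}

/* Copy column j of a row-major r x c matrix into out (length r):
   consecutive column entries are c elements apart. */
static void
mat_get_column(long *out, const long *mat, long r, long c, long j)
{
    vec_set_strided(out, 1, mat + j, c, r);
}

/* Copy the main diagonal of a row-major n x n matrix into out:
   consecutive diagonal entries are n + 1 elements apart. */
static void
mat_get_diagonal(long *out, const long *mat, long n)
{
    vec_set_strided(out, 1, mat, n + 1, n);
}

/* Write x reversed into out: start at the last entry, step by -1. */
static void
vec_reverse(long *out, const long *x, long len)
{
    if (len > 0)
        vec_set_strided(out, 1, x + (len - 1), -1, len);
}
```

An odd-even subvector would be handled the same way, with a stride of 2 and a starting offset of 0 or 1.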

Potentially, this could replace a number of specialized vector functions, such as those mixing vectors and scalars. For example, instead of

_gr_vec_add_scalar(res, x, len, y, ctx)

one could do

_gr_vec_add_strided(res, x, 1, y, 0, len, ctx)
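The stride-0 trick can be sketched on plain `long` entries as follows. `vec_add_strided` and `vec_add_scalar` here are hypothetical stand-ins that follow the argument order used above (minus `ctx`); they are not the actual `_gr_vec` functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided addition:
   res[i] = x[i * xstride] + y[i * ystride]. */
static void
vec_add_strided(long *res, const long *x, ptrdiff_t xstride,
                const long *y, ptrdiff_t ystride, long len)
{
    long i;
    assert(len >= 0);
    for (i = 0; i < len; i++)
        res[i] = x[i * xstride] + y[i * ystride];
}

/* Vector plus scalar: with stride 0 the index i * 0 is always 0,
   so the same entry *c is read on every iteration. */
static void
vec_add_scalar(long *res, const long *x, long len, const long *c)
{
    vec_add_strided(res, x, 1, c, 0, len);
}
```

With this layout the scalar is passed by pointer, so a scalar-by-value variant would still need a thin wrapper.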

Likewise for _gr_vec_dot / _gr_vec_dot_rev, which a single _gr_vec_dot_strided could cover. Note that a _gr_vec_dot_strided could also be used for basecase matrix multiplication without the need for a temporary transpose (though the temporary transpose is better for cache efficiency with large matrices anyway). The same applies to other accumulating operations such as sum and max. Currently arb_dot, acb_dot and _fmpz_vec_dot_general have this functionality.
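To make the matrix multiplication remark concrete, here is a minimal sketch on `long` entries with row-major storage. `vec_dot_strided` and `mat_mul_basecase` are hypothetical names and signatures chosen for illustration; overflow is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided dot product:
   sum over i of x[i * xstride] * y[i * ystride]. */
static long
vec_dot_strided(const long *x, ptrdiff_t xstride,
                const long *y, ptrdiff_t ystride, long len)
{
    long i, s = 0;
    assert(len >= 0);
    for (i = 0; i < len; i++)
        s += x[i * xstride] * y[i * ystride];
    return s;
}

/* C = A * B for row-major A (m x n), B (n x p), C (m x p).
   Row i of A is contiguous (stride 1); column j of B is read in
   place with stride p, so no transposed copy of B is needed. */
static void
mat_mul_basecase(long *C, const long *A, const long *B,
                 long m, long n, long p)
{
    long i, j;
    for (i = 0; i < m; i++)
        for (j = 0; j < p; j++)
            C[i * p + j] = vec_dot_strided(A + i * n, 1, B + j, p, n);
}
```

A reversed dot product falls out of the same function: pointing `y` at its last entry and passing `ystride = -1` pairs `x[i]` with `y[len - 1 - i]`.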

For non-accumulating (entrywise) operations, we mostly care about this for machine types and some near-machine types like fmpz where the overhead of calling elementwise operations in a loop can be noticeable.

Unfortunately, strided functions may not be able to replace non-strided ones entirely for such types: there are more parameters to pass around, making function calls slightly slower, and some loops will run slower with a runtime offset instead of a fixed compile-time offset like 1 or -1. One can have branches for special strides, but then the branches themselves add an O(1) penalty to every call.
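The branching approach might look roughly like the sketch below, again on `long` entries. All names are hypothetical, and the choice of which strides to special-case is only an assumption; the stride checks at the top are the O(1) cost paid by every call, including those that end up in the general loop.

```c
#include <assert.h>
#include <stddef.h>

/* General loop: the step sizes are only known at run time. */
static void
add_general(long *res, const long *x, ptrdiff_t xstride,
            const long *y, ptrdiff_t ystride, long len)
{
    long i;
    for (i = 0; i < len; i++)
        res[i] = x[i * xstride] + y[i * ystride];
}

/* Unit-stride loop: contiguous access with a compile-time step of 1. */
static void
add_unit(long *res, const long *x, const long *y, long len)
{
    long i;
    for (i = 0; i < len; i++)
        res[i] = x[i] + y[i];
}

/* Hypothetical dispatcher that branches on a few special strides. */
static void
vec_add_strided_dispatch(long *res, const long *x, ptrdiff_t xstride,
                         const long *y, ptrdiff_t ystride, long len)
{
    assert(len >= 0);
    if (len == 0)
        return;

    if (xstride == 1 && ystride == 1)
    {
        add_unit(res, x, y, len);
    }
    else if (xstride == 1 && ystride == 0)
    {
        /* Scalar case: load the scalar once.
           This assumes res does not alias the scalar. */
        const long c = y[0];
        long i;
        for (i = 0; i < len; i++)
            res[i] = x[i] + c;
    }
    else
    {
        add_general(res, x, xstride, y, ystride, len);
    }
}
```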

@fredrik-johansson

Related: _nmod_vec_dot_ptr is basically obsolete after #2162 and could be supplanted by a strided version.
