perf improvement for interp: set assume_sorted automatically #9758

Open
dcherian opened this issue Nov 9, 2024 · 0 comments

What is your issue?

assume_sorted defaults to False, so for vectorized interpolation across multiple dimensions we end up lexsorting the coordinates on every call. For some reason, this can be quite slow with dask. The relevant line is:

obj = self if assume_sorted else self.sortby(list(coords))

Instead we should be able to do

obj = self
# sort by slicing if we can
for coord in set(indexers) & set(self._indexes):
    # TODO: better check for PandasIndex
    if self.indexes[coord].is_monotonic_decreasing:
        # reversing a monotonically decreasing index sorts it without a lexsort
        obj = obj.isel({coord: slice(None, None, -1)})

# TODO: make None the new default
if assume_sorted is None:
    # TODO: dims without coordinates are fine too
    # check obj (after any reversal above), not self
    assume_sorted = all(obj.indexes[coord].is_monotonic_increasing for coord in indexers)
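
To make the intended control flow concrete, here is a minimal standalone sketch of the same idea using plain Python lists in place of xarray indexes. The names prepare_for_interp, is_monotonic_increasing and is_monotonic_decreasing are hypothetical helpers written for this illustration, not xarray or pandas API; the two monotonicity helpers are assumed to follow the non-strict convention of the pandas Index properties of the same name.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def is_monotonic_increasing(values: Sequence[float]) -> bool:
    """True if values never decrease (non-strict, so ties are allowed)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_monotonic_decreasing(values: Sequence[float]) -> bool:
    """True if values never increase (non-strict, so ties are allowed)."""
    return all(a >= b for a, b in zip(values, values[1:]))


def prepare_for_interp(
    coords: Dict[str, Sequence[float]],
    interp_dims: Sequence[str],
    assume_sorted: Optional[bool] = None,
) -> Tuple[Dict[str, List[float]], bool]:
    """Hypothetical stand-in for the proposed logic.

    Reverses monotonically decreasing coordinates (a cheap slice) and, when
    assume_sorted is None, infers it from the resulting coordinates.
    """
    prepared = {name: list(vals) for name, vals in coords.items()}

    # "Sort by slicing": a decreasing coordinate becomes increasing when reversed.
    for dim in set(interp_dims) & set(prepared):
        vals = prepared[dim]
        if is_monotonic_decreasing(vals) and not is_monotonic_increasing(vals):
            prepared[dim] = vals[::-1]

    # Infer assume_sorted only when the caller did not set it explicitly.
    # Dimensions without a coordinate are skipped here (an assumption of this sketch).
    if assume_sorted is None:
        assume_sorted = all(
            is_monotonic_increasing(prepared[dim])
            for dim in interp_dims
            if dim in prepared
        )
    return prepared, assume_sorted


if __name__ == "__main__":
    prepared, inferred = prepare_for_interp(
        {"x": [3.0, 2.0, 1.0], "y": [0.0, 1.0, 2.0]}, ["x", "y"]
    )
    print(prepared["x"], inferred)
```

In the real method the reversal would also have to be applied to the data along that dimension, which is what obj.isel does; the sketch only tracks the coordinate values. A full sort would then be needed only when the inferred assume_sorted comes out False.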

I'll add a reproducible example later, but graph construction gets much faster for the problem I've been playing with:
[image]

xref #6799

cc @mpiannucci @Illviljan
