Skip to content
This repository has been archived by the owner on Mar 17, 2023. It is now read-only.

Speedy #18

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Speedy #18

wants to merge 7 commits into from

Conversation

atarashansky
Copy link

Hello!

I've upgraded the xicor implementation to be much faster. The results are identical compared to the original implementation.

A couple notes:

  1. I removed all the properties. Properties are not implicitly cached and so every time you reference a property, it is recalculated. As a result, you are unecessarily re-ranking the data many, many times.

  2. I am using numpy.unique with return_inverse and return_counts flags to quickly calculate f and g, which means I don't need to explicitly rank any of the data.

  3. The ties argument in the p-value function was inverted. I fixed the if statement.

Let me know what you think.

Comment on lines +20 to +23
_,b,c = np.unique(self.y[self.x_ordered],return_counts=True,return_inverse=True)
self.f = np.cumsum(c)[b]
_,b,c = np.unique(-self.y[self.x_ordered],return_counts=True,return_inverse=True)
self.g = np.cumsum(c)[b]
Copy link

@PetterS PetterS Dec 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite nice but I think it is possible to make it even faster with a single call to unique. E.g. something like this:

def _ranks(x):
    _, inverse, counts = np.unique(
        x, return_index=False, return_inverse=True, return_counts=True
    )
    r = np.cumsum(counts)[inverse]
    # Compute l the same way but flip the counts first.
    counts = np.flip(counts)
    inverse = len(counts) - inverse - 1
    l = np.cumsum(counts)[inverse]
    return r, l

It is what I did when I tried computing this correlation coefficient.

@zbarry
Copy link

zbarry commented Dec 30, 2021

Hey @atarashansky - cached_property https://docs.python.org/3/library/functools.html#functools.cached_property might be useful for taking care of the multiple calls to properties.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants