Incorrect missing data handling with pandas v1+ #152

Open
aputkov opened this issue Aug 3, 2021 · 0 comments

aputkov commented Aug 3, 2021

pandas v1+ introduced a new value, pd.NA, to represent missing data. ffn v0.3.6 uses NumPy's isnan function to identify missing data. The problem is that np.isnan(pd.NA) returns pd.NA, which is ambiguous in a boolean context. As a result, an exception is raised when calc_stats computes the return table from monthly returns. The monthly returns are necessarily missing a value in the first row of the DataFrame or Series. That value is passed to np.isnan on line 320 of core.py, in the _calculate function of the PerformanceStats class. With pandas v1+, the missing value passed to np.isnan is pd.NA, so the expression np.isnan(mr[idx]) evaluates to pd.NA and 'if np.isnan(mr[idx])' raises an exception.
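
A minimal sketch of the failure outside ffn, assuming pandas 1.2 or later (for the nullable Float64 dtype). The series mr here is made-up data standing in for the monthly returns, not the actual ffn code path; it is given a nullable dtype so that its first element is pd.NA rather than np.nan.

```python
import numpy as np
import pandas as pd

# Made-up monthly returns with a missing first row. The nullable "Float64"
# dtype is an assumption: it makes the missing element pd.NA, not np.nan.
mr = pd.Series([pd.NA, 0.02, -0.01], dtype="Float64")
first = mr.iloc[0]

# pd.NA propagates through NumPy ufuncs, so the result is pd.NA, not a bool.
result = np.isnan(first)
is_na_result = result is pd.NA

# Using that result in an 'if' forces bool(pd.NA), which raises TypeError.
try:
    if np.isnan(first):
        pass
    raised = False
    message = ""
except TypeError as exc:
    raised = True
    message = str(exc)

print(is_na_result, raised, message)
```

With an ordinary float64 series the first element would be np.nan and np.isnan would return a plain boolean, so the exception depends on the value actually being pd.NA.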
To make ffn work with pandas v1+, the np.isnan call on line 320 of core.py needs to be replaced with pd.isna.
In addition, and for the same reason, the expression 'if np.isnan(number)' on line 115 of utils.py needs to be replaced with 'if (np.isnan(number) | pd.isna(number))'.
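
A rough sketch of the proposed check, not a patch against ffn itself. The helper name is_missing is hypothetical, and the series is made-up data; the point is only that pd.isna returns a plain boolean for np.nan, pd.NA and None alike, so it is safe to use in an 'if'.

```python
import numpy as np
import pandas as pd


def is_missing(value):
    """Hypothetical helper: True if a scalar is np.nan, pd.NA or None."""
    # pd.isna returns a plain boolean for scalar input, so bool() is safe here.
    return bool(pd.isna(value))


# Made-up monthly returns with a missing first row, in a nullable dtype
# (an assumption) so that the missing element is pd.NA.
mr = pd.Series([pd.NA, 0.02, -0.01], dtype="Float64")

# The same loop shape as 'if np.isnan(mr[idx])', using pd.isna instead.
missing_positions = [i for i in range(len(mr)) if is_missing(mr.iloc[i])]
print(missing_positions)
```

Since pd.isna already treats np.nan as missing, it may be enough on its own in utils.py as well, but that has not been checked against the rest of the library.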
