Commit
fix: dcor cannot process dataframes with more than 20K rows
anhkhoangoho committed Jan 10, 2024
1 parent aeaebc8 commit 77458ce
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions qolmat/benchmark/metrics.py
@@ -1117,6 +1117,11 @@ def distance_anticorr(df1: pd.DataFrame, df2: pd.DataFrame, df_mask: pd.DataFram

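    df1 = df1[df_mask.any(axis=1)]
    df2 = df2[df_mask.any(axis=1)]

    # Cap the row count so that dcor stays tractable on large frames.
    if len(df1) > 30000:
        # Sample once and reuse the same rows: distance correlation is
        # computed on paired rows (assumes df1 and df2 share a unique index).
        df1 = df1.sample(20000)
        df2 = df2.loc[df1.index]

    return (1 - dcor.distance_correlation(df1.values, df2.values)) / 2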
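
Distance correlation compares the two inputs row by row, so any subsampling has to keep the same rows in both frames. Below is a minimal sketch of one way to do that, assuming both frames have the same length and row order. The helper `subsample_paired` and its parameters `max_rows` and `seed` are hypothetical names for illustration and are not part of qolmat.

```python
import numpy as np
import pandas as pd


def subsample_paired(df1, df2, max_rows=20000, seed=None):
    """Restrict both frames to the same randomly chosen row positions.

    Hypothetical helper: assumes df1 and df2 have the same length and
    that row i of df1 corresponds to row i of df2.
    """
    if len(df1) != len(df2):
        raise ValueError("df1 and df2 must have the same number of rows")
    if len(df1) <= max_rows:
        return df1, df2
    rng = np.random.default_rng(seed)
    # Draw the positions once so the pairing between rows is preserved.
    positions = rng.choice(len(df1), size=max_rows, replace=False)
    return df1.iloc[positions], df2.iloc[positions]


# Usage: df_b is an exact function of df_a, so the pairing is easy to check.
df_a = pd.DataFrame({"x": np.arange(50_000, dtype=float)})
df_b = df_a * 2.0
sub_a, sub_b = subsample_paired(df_a, df_b, max_rows=20_000, seed=0)
```

Sampling positions rather than calling `sample` on each frame separately keeps the rows aligned even when the index contains duplicates, and the optional `seed` makes the metric reproducible between runs.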

