Hi - I discussed this a bit with @phofl, as I'm aiming to build a zero-cost abstraction around Dask DataFrame in Narwhals.
One thing I'd like to know is how to check that two Dask Series have the same index or, rather, that concatenating them would not trigger any index alignment.
Patrick pointed me to dask_expr._expr.are_co_aligned, which seemed to work great for me until I tried using __getitem__. Here's an example:
In [21]: df=dd.from_pandas(pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]}))
In [22]: dask_expr._expr.are_co_aligned(df._expr, df['a'][[1,2,0]]._expr)
Out[22]: True
This isn't quite what I was expecting: if I compute df['a'][[1,2,0]], the index has indeed been reordered with respect to df.
Is this a bug in are_co_aligned? If not, is there another way to check that index alignment does not happen?
Thanks 🙏
are_co_aligned only checks that the partitions are properly aligned, not the actual values within each partition.
That said, most methods that change the index values also return False for are_co_aligned.
I don't have a good solution for this off the top of my head, unfortunately; let me think about this a little. are_co_aligned is a bit weaker than what you are looking for.
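To make the distinction concrete, here is a minimal pandas-only sketch of what happens inside a single partition. It does not call are_co_aligned or any Dask internals; the variable names are hypothetical, and comparing min/max labels is only an assumed stand-in for a partition-level check, not how Dask actually implements it.

```python
import pandas as pd

# One partition of the frame from the example, as a plain pandas object.
part = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Same labels as `part`, but in a different order.
reordered = part["a"].loc[[1, 2, 0]]

# Partition-level view (assumed stand-in): the label range is unchanged,
# so a check that only looks at partition boundaries sees no difference.
same_bounds = (part.index.min(), part.index.max()) == (
    reordered.index.min(),
    reordered.index.max(),
)

# Value-level view: the labels are no longer in the same order.
same_index = part.index.equals(reordered.index)

# Because the indexes differ, pandas aligns by label rather than by position
# when the two Series are combined.
aligned = part["a"] + reordered
```

Here same_bounds is True while same_index is False, which mirrors the situation in the report: the partitions line up, but combining the two objects would still go through index alignment.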