Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent broadcasting behavior of Py arrays #392

Closed
LilithHafner opened this issue Oct 23, 2023 · 11 comments
Closed

Inconsistent broadcasting behavior of Py arrays #392

LilithHafner opened this issue Oct 23, 2023 · 11 comments
Labels
bug Something isn't working enhancement New feature or request
Milestone

Comments

@LilithHafner
Copy link
Contributor

Affects: Both (easier to reproduce with PythonCall, but I actually hit it using JuliaCall)

Describe the bug
x .* y where x is a Py wrapping a numpy array and y is a Matrix gives strange (buggy) results. Converting to a PyArray is a workaround:

julia> pyexec("import numpy as np", Main)

julia> x = pyeval("np.array([[1,2],[3,4]])", Main)
Python:
array([[1, 2],
       [3, 4]])

julia> y = rand(2,2);

julia> x .* y # BUG
2×2 Matrix{Py}:
 array([0.58564244, 1.17128489])  array([0.33111401, 0.66222801])
 array([2.77225103, 3.69633471])  array([0.94132148, 1.25509531])

julia> pyconvert(Any, x) .* y
2×2 Matrix{Float64}:
 0.585642  0.662228
 2.77225   1.2551
@LilithHafner LilithHafner added the bug Something isn't working label Oct 23, 2023
@cjdoris
Copy link
Collaborator

cjdoris commented Oct 24, 2023

I don't think this is a bug.

Your x is a Py, which is not a AbstractArray, and is not intended to have any array-like semantics beyond anything which generic Python types give you (such as length and getindex).

If you want to broadcast it like an array, then you need to explicitly convert it to an array like you have done (or pyconvert(AbstractArray, x) if you want to be stricter).

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 24, 2023

However the fact that you get 2x2 matrix of length-2 vectors back from x .* y is presumably because iterating a numpy matrix yields its rows, so Julia sees it as a length-2 iterator, interprets that as a size (2,1) object, and broadcasts along the second dimension.

This isn't a bug, it's just that broadcasting behaviour is as-yet undefined by PythonCall. We could define that behaviour - namely always treat Py as a singleton when broadcasting.

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 24, 2023

Note also

julia> Py(1) .* rand(2,2)
ERROR: Python: TypeError: 'int' object is not iterable

so indeed Julia's broadcasting mechanism is expecting Py to be iterable for broadcasting.

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 24, 2023

Indeed, it's simple to make Py always broadcast as a singleton:

julia> Base.Broadcast.broadcastable(x::Py) = Ref(x)

julia> Py(1) .* rand(2,2)
2×2 Matrix{Py}:
 0.10215112007150162  0.26483168933920176
 0.35676829614369443  0.9558859914622307

julia> x = np.array([1 2; 3 4])
Python:
array([[1, 2],
       [3, 4]], dtype=int64)

julia> x .* rand(2,2)
2×2 Matrix{Py}:
 array([[0.86518142, 1.73036285],
       [2.59554427, 3.46072569]])  array([[0.9674522 , 1.9349044 ],
       [2.90235659, 3.86980879]])
 array([[0.97068333, 1.94136666],
       [2.91204999, 3.88273333]])  array([[0.09704627, 0.19409255],
       [0.29113882, 0.38818509]])

@cjdoris cjdoris added the enhancement New feature or request label Oct 24, 2023
@LilithHafner
Copy link
Contributor Author

Is it viable to have a semantic where Python scalars broadcast as scalars and Numpy arrays broadcast as multidimensional arrays?

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 27, 2023

You could, by trying to pyconvert(AbstractArray, x) then broadcast on that if it works. But this is the sort of magic that PythonCall tries to avoid - I aim for a simple interface that allows for generic programming. Having special cases for arrays goes against that a bit.

If you want to treat your numpy array as a Julia array, it's much clearer and explicit to do the pyconvert yourself.

@LilithHafner
Copy link
Contributor Author

Agreed, attempted pyconvert(AbstractArray, x) does not count as viable. Could we check for the presence of a size field of a python object and then broadcast with shape based on that size field if present and as a scalar if absent (seems somewhat reasonable because x.size is used across multiple python libraries) or is that also too magical for your design philosophy?

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 27, 2023

That also seems magical - size could be present on all sorts of non-array types with totally different semantics.

@LilithHafner
Copy link
Contributor Author

Yeah, okay. :(

So the best we can do without magic is make them all scalars, right?

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 28, 2023

I think so.

@cjdoris cjdoris added this to the features milestone Nov 4, 2023
@cjdoris
Copy link
Collaborator

cjdoris commented Nov 4, 2023

Scalar broadcasting implemented on 'main'.

@cjdoris cjdoris closed this as completed Nov 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants