-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow compatible mapped values in __setitem__ #7
Allow compatible mapped values in __setitem__ #7
Conversation
|
||
@staticmethod | ||
def try_from(obj: Any, *, axis_zero: int = 0) -> ScippDataArrayAdapter | None: | ||
try: | ||
import scipp | ||
|
||
if isinstance(obj, scipp.Variable): | ||
return ScippDataArrayAdapter(scipp.DataArray(obj)) | ||
return ScippDataArrayAdapter(scipp.DataArray(obj), scipp=scipp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit awkward that we have to pass the module as an arg everywhere...
Was it to avoid importing scipp
in multiple places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, making the code cleaner.
{ | ||
'a': sc.array(dims=['x'], values=[1, 2, 3]), | ||
'b': sc.array(dims=['x'], values=[11, 12, 13]), | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm a little confused because the name above is a ScippDataArrayAdapter
, so I thought it would be a sc.DataArray
, like the pandas DataFrame
above.
But I guess you just need a mapping of key to ArrayLike? Does this mean that you never use the .data
in the DataArray
?
I assume you need a structure that can be sliced/indexed. Could it be a DataGroup
instead of a DataArray
internally?
I realize this is besides the point of the current PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScippDataArrayAdapter
also handles scipp.Variable
, interpreting the latter as an data array without coords.
I assume you need a structure that can be sliced/indexed. Could it be a DataGroup instead of a DataArray internally?
Using a dict
currently, could in principle add support for Dataset
and DataGroup
. But not "instead of": The DataArray
holds the values for a single node, the dict
(or DataGroup
, ...) maps to multiple nodes, just like the columns of pandas.DataFrame
vs. its rows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, I see now
|
||
graph = cb.Graph(g) | ||
mapped = graph.map(node_values).reduce('c', name='d') | ||
mapped['x'] = mapped['d'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, the 'x'
here is unrelated to the 'x'
dimension in
{
'a': sc.array(dims=['x'], values=[1, 2, 3]),
'b': sc.array(dims=['x'], values=[11, 12, 13]),
}
above.
If so, can we use a different name here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was asking because at first it confused me, as I thought they were related.
g.add_edge('a', 'b') | ||
graph = cb.Graph(g) | ||
mapped1 = graph.map({'a': [1, 2]}).reduce('b', name='d') | ||
mapped2 = graph.map({'a': sc.array(dims=('x',), values=[1, 2])}).reduce( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand why this is different from the above? In mapped1
you have [1, 2]
and in mapped2
a Scipp Variable
. Isn't it the same as with the numpy array, that the types are different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it actually raises for different reasons, this test is to ensure that both code paths work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my question was why is it raising for a different reason, I would have expected to raise with the same reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an artifact of the usual problem of having two checks in a particular order, so depending on the exact properties of the input you get one exception or the other.
If you still want my review, can you explain what this PR is supposed to do? This is hard to tell from the diff alone. |
It allows compatible mapped values in |
@pytest.mark.parametrize( | ||
'node_values', | ||
[ | ||
{'a': [1, 2, 3], 'b': [11, 12, 13]}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that {'a': [1, 2, 3]}
is sufficient to make the test fail with the code on main. So there seems to be no need to have multiple mapped nodes to test this behaviour. Can you add a test that only maps one node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Co-authored-by: Jan-Lukas Wynen <[email protected]>
This was not implemented initially for simplicity, but downstream use showed that is it essential.