Allow compatible mapped values in __setitem__ #7

SimonHeybrock · 2024-06-06T05:45:32Z

This was not implemented initially for simplicity, but downstream use showed that it is essential.

nvaytet · 2024-06-06T08:22:02Z

src/cyclebane/node_values.py


    @staticmethod
    def try_from(obj: Any, *, axis_zero: int = 0) -> ScippDataArrayAdapter | None:
        try:
            import scipp

            if isinstance(obj, scipp.Variable):
-                return ScippDataArrayAdapter(scipp.DataArray(obj))
+                return ScippDataArrayAdapter(scipp.DataArray(obj), scipp=scipp)


A bit awkward that we have to pass the module as an arg everywhere...
Was it to avoid importing scipp in multiple places?

Yes, making the code cleaner.
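
For readers unfamiliar with the pattern, here is a minimal sketch of passing a lazily imported module into an adapter. It is not the cyclebane code: `DecimalAdapter`, its `module` keyword, and `doubled` are hypothetical names, and the standard-library `decimal` module stands in for the optional dependency so the example runs anywhere.

```python
from __future__ import annotations

from types import ModuleType
from typing import Any


class DecimalAdapter:
    """Hypothetical adapter that keeps a handle to its optional dependency."""

    def __init__(self, value: Any, *, module: ModuleType) -> None:
        self._value = value
        # Stored so that other methods need no import statement of their own.
        self._module = module

    @staticmethod
    def try_from(obj: Any) -> DecimalAdapter | None:
        try:
            # Imported lazily so that the dependency stays optional.
            import decimal
        except ModuleNotFoundError:
            return None
        if isinstance(obj, decimal.Decimal):
            return DecimalAdapter(obj, module=decimal)
        return None

    def doubled(self) -> Any:
        # Uses the stored module handle instead of importing again.
        return self._value * self._module.Decimal(2)
```

The trade-off is the one raised above: the constructor signature grows by one argument, but the import happens in a single place.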

nvaytet · 2024-06-06T08:44:35Z

tests/graph_test.py

+        {
+            'a': sc.array(dims=['x'], values=[1, 2, 3]),
+            'b': sc.array(dims=['x'], values=[11, 12, 13]),
+        },


I think I'm a little confused because the name above is a ScippDataArrayAdapter, so I thought it would be a sc.DataArray, like the pandas DataFrame above.
But I guess you just need a mapping of key to ArrayLike? Does this mean that you never use the .data in the DataArray?
I assume you need a structure that can be sliced/indexed. Could it be a DataGroup instead of a DataArray internally?

I realize this is beside the point of the current PR...

ScippDataArrayAdapter also handles scipp.Variable, interpreting the latter as a data array without coords.

> I assume you need a structure that can be sliced/indexed. Could it be a DataGroup instead of a DataArray internally?

Using a dict currently, could in principle add support for Dataset and DataGroup. But not "instead of": The DataArray holds the values for a single node, the dict (or DataGroup, ...) maps to multiple nodes, just like the columns of pandas.DataFrame vs. its rows.

Ah ok, I see now
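
The column/row analogy above can be sketched without any array library. This is only an illustration: plain lists stand in for scipp variables, and `values_at` is a hypothetical helper, not part of cyclebane.

```python
# Each key is one mapped node ("column"); each position along the
# mapped axis is one "row". Plain lists stand in for scipp variables.
node_values = {
    'a': [1, 2, 3],
    'b': [11, 12, 13],
}


def values_at(node_values: dict, index: int) -> dict:
    """Pick one value per mapped node at a given position along the axis."""
    return {node: values[index] for node, values in node_values.items()}


# All values of a single node: what a per-node array holds.
column_a = node_values['a']

# One value for every node at position 1: one "row" of the mapping.
row_1 = values_at(node_values, 1)
```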

nvaytet · 2024-06-06T08:49:58Z

tests/graph_test.py

+
+    graph = cb.Graph(g)
+    mapped = graph.map(node_values).reduce('c', name='d')
+    mapped['x'] = mapped['d']


If I understand correctly, the 'x' here is unrelated to the 'x' dimension in

{ 'a': sc.array(dims=['x'], values=[1, 2, 3]), 'b': sc.array(dims=['x'], values=[11, 12, 13]), }

above.

If so, can we use a different name here?

I was asking because at first it confused me, as I thought they were related.

nvaytet · 2024-06-06T08:53:06Z

tests/graph_test.py

+    g.add_edge('a', 'b')
+    graph = cb.Graph(g)
+    mapped1 = graph.map({'a': [1, 2]}).reduce('b', name='d')
+    mapped2 = graph.map({'a': sc.array(dims=('x',), values=[1, 2])}).reduce(


Not sure I understand why this is different from the above? In mapped1 you have [1, 2] and in mapped2 a Scipp Variable. Isn't it the same as with the numpy array, that the types are different?

Yes, but it actually raises for different reasons; this test ensures that both code paths work.

I guess my question was why it raises for a different reason; I would have expected it to raise for the same reason.

It is an artifact of the usual problem of having two checks in a particular order, so depending on the exact properties of the input you get one exception or the other.
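
A minimal sketch of how check ordering produces different exceptions for inputs that are "wrong" in similar ways. The two checks and the names `check_compatible` and `failure_reason` are hypothetical and chosen for illustration; they are not the checks cyclebane performs.

```python
def check_compatible(old, new):
    """Hypothetical validator applying two checks in a fixed order."""
    # Check 1: both containers must have the same type.
    if type(old) is not type(new):
        raise TypeError(
            f"Type mismatch: {type(old).__name__} vs {type(new).__name__}"
        )
    # Check 2: only reached if check 1 passed.
    if len(old) != len(new):
        raise ValueError(f"Length mismatch: {len(old)} vs {len(new)}")


def failure_reason(old, new):
    """Return the name of the exception raised, or None if compatible."""
    try:
        check_compatible(old, new)
    except (TypeError, ValueError) as err:
        return type(err).__name__
    return None


# Fails both checks, but only the first one is ever reported.
reason_type = failure_reason([1, 2], (1, 2, 3))
# Passes the type check, so the second check is the one that raises.
reason_length = failure_reason([1, 2], [1, 2, 3])
```

Testing one input per check, as the test under discussion does, ensures that each branch is exercised.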

jl-wynen · 2024-06-07T08:42:56Z

If you still want my review, can you explain what this PR is supposed to do? This is hard to tell from the diff alone.

SimonHeybrock · 2024-06-10T03:30:11Z

> If you still want my review, can you explain what this PR is supposed to do? This is hard to tell from the diff alone.

It allows compatible mapped values in __setitem__. Previously this was impossible if the graph being set originated, e.g., from a subgraph of the same graph and map had been applied beforehand. This shows up in practice, e.g., when mapping a reduction workflow graph over a filename, with subsequent reduce calls in several places, to be recombined into a single graph.
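
To make "compatible" concrete, here is a rough sketch under a stated assumption: mapped values count as compatible when, for every node present on both sides, the values are identical. This is an illustration only. `merge_mapped_values` is a hypothetical helper operating on plain dicts of lists, not cyclebane's implementation or API.

```python
def merge_mapped_values(existing: dict, incoming: dict) -> dict:
    """Merge two node-to-values mappings, rejecting conflicting entries.

    Assumption for this sketch: values are compatible if they are equal
    for every node that appears in both mappings.
    """
    merged = dict(existing)
    for node, values in incoming.items():
        if node in merged and list(merged[node]) != list(values):
            raise ValueError(f"Conflicting mapped values for node {node!r}")
        merged[node] = values
    return merged


# A branch derived from the same mapped graph carries the same values,
# so setting it back is accepted.
same_origin = merge_mapped_values({'a': [1, 2]}, {'a': [1, 2]})

# A branch mapped over different values is rejected.
try:
    merge_mapped_values({'a': [1, 2]}, {'a': [3, 4]})
    conflict_detected = False
except ValueError:
    conflict_detected = True
```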

jl-wynen · 2024-06-10T09:57:42Z

tests/graph_test.py

+@pytest.mark.parametrize(
+    'node_values',
+    [
+        {'a': [1, 2, 3], 'b': [11, 12, 13]},


It seems that {'a': [1, 2, 3]} is sufficient to make the test fail with the code on main. So there seems to be no need to have multiple mapped nodes to test this behaviour. Can you add a test that only maps one node?

Allow compatible values in __setitem__

c1ca804

SimonHeybrock requested a review from jl-wynen June 6, 2024 05:45

nvaytet reviewed Jun 6, 2024

nvaytet self-assigned this Jun 6, 2024

nvaytet approved these changes Jun 6, 2024

Base automatically changed from fix-ancestor-removal-logic to main June 10, 2024 03:18

jl-wynen reviewed Jun 10, 2024

SimonHeybrock and others added 2 commits June 10, 2024 12:36

Apply suggestions from code review

7326a63

Co-authored-by: Jan-Lukas Wynen

Add more test cases

ff35dfe

jl-wynen approved these changes Jun 10, 2024

SimonHeybrock merged commit 3b67e2d into main Jun 10, 2024
3 checks passed

SimonHeybrock deleted the allow-compat-indicies-and-values-insetitem branch June 10, 2024 12:09
