Add unit test to capture-stop some streams #883

Open
wants to merge 4 commits into main

Conversation

amishatishpatel
Contributor

Randomly select data streams in the XB-engine to issue a ?capture-stop request to at
some point during data processing. Verify that the stopped streams only have data
up to the capture-stop point, and that the other streams processed and transmitted
data uninterrupted.

Checklist (if not applicable, edit to add (N/A) and mark as done):

  • (n/a) If dependencies are added/removed: update setup.cfg and .pre-commit-config.yaml
  • (n/a) If modules are added/removed: use sphinx-apidoc -efo doc/ src/ to update files in doc/
  • Ensure copyright notices are present and up-to-date
  • (n/a) If qualification tests are changed: attach a sample qualification report
  • (n/a) If design has changed: ensure documentation is up to date
  • (n/a) If ICD-defined sensors have been added: update fake_servers.py in katsdpcontroller to match

Closes NGC-1155.

Also tweaked the _send_data helper and how it returns X- and B-engine
results, to accommodate this test.

Contributes to: NGC-1155.
And refine the test itself. Also fix verify_beam_sensors after the
previous _send_data update.

Contributes to: NGC-1155.
Turns out it doesn't need to be around at all for the newest xbgpu unit
test.

Contributes to: NGC-1155.
I imported it to use in the end-to-end test, figured I'd use it in the
new test as well.

Contributes to: NGC-1155.
Comment on lines +870 to +871
except spead2.Stopped:
pass
Contributor Author

I'm not overjoyed with my deadpan handling of the Ringbuffer Stopped exception. I'm going to defer to the expert on any suggestions. I'll be sure to push something if/when I think of it.

Contributor

I'm also not wild about it. My main worry is that if something causes a Stopped exception in one of the other tests, you're going to have zeros in the results and you'll get a data mismatch, which will be misleading and make the failure harder to debug.

My inclination would be to pass extra information into send_data to indicate when it should expect to run out of data. It can then directly verify that this is actually when data stops flowing: after receiving all the expected data, it could try to receive another heap inside a pytest.raises context manager (which might still need to handle descriptors).

It might also be nice to then size the results array to the amount of data actually expected, instead of returning arrays with trailing zeros. That's actually what I thought you were going to do when I saw you'd replaced a big array with a dictionary of arrays. I realise this could have knock-on implications for all the verification functions though, so if it's going to be big pain then don't bother.
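The receive-until-stopped pattern suggested here could be sketched roughly as below. This is a hypothetical stand-alone illustration, not the project's code: `FakeStream` and the local `Stopped` class stand in for the real spead2 stream and `spead2.Stopped` exception.

```python
# Sketch of the reviewer's suggestion: after receiving exactly the expected
# number of heaps, assert that the very next receive raises Stopped, so a
# premature stop can never silently leave zeros in the results.
import pytest


class Stopped(Exception):
    """Stand-in for spead2.Stopped."""


class FakeStream:
    """Yields a fixed list of heaps, then behaves as a stopped ringbuffer."""

    def __init__(self, heaps):
        self._heaps = iter(heaps)

    def get(self):
        try:
            return next(self._heaps)
        except StopIteration:
            raise Stopped from None


def receive_all(stream, n_expected):
    """Receive exactly `n_expected` heaps, then verify the stream stops."""
    results = [stream.get() for _ in range(n_expected)]
    # Directly verify that this is where data stops flowing.
    with pytest.raises(Stopped):
        stream.get()
    return results
```

With the real stream, the check after the loop might still need to tolerate descriptor heaps before asserting the stop, as noted above.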

Contributor Author

I realise this could have knock-on implications for all the verification functions though, so if it's going to be big pain then don't bother.

Your proposed suggestion is exactly what I intended to do. I stopped short of that because I didn't want my changes to be too wholesale. Still, I see it's tending that way anyway even with just an "expect data to stop here" parameter. Might as well go all in.

# lists of two different types could be added together.
all_stream_names = list(map(lambda output: output.name, chain(corrprod_outputs, beam_outputs)))
rng = np.random.default_rng(seed=1)
# Select random subset of output data products to issue `?capture-stop` to
Contributor

The last part of the comment seems to have gotten lost somewhere...

Contributor

Looks fine to me, if a little ungrammatical.

@@ -446,7 +446,7 @@ def verify_corrprod_sensors(

def verify_beam_data(
beam_outputs: list[BOutput],
beam_results: np.ndarray,
beam_results: list[np.ndarray],
Contributor

Need to update the docstring to match.

List of arrays of all GPU-generated data. One output array per
corrprod_output, where each array has shape
Dictionary of arrays of all XPipeline output. Each key is the
corrprod_output name, each value is an array with shape
Contributor

Comma splice

Beamformer output, with shape (n_beams, n_batches,
n_channels_per_substream, n_spectra_per_heap, COMPLEX).
Dictionary of arrays of all BPipeline output. Each key is the
beam_output name, each value is an array with shape
Contributor

Comma splice

Comment on lines +1493 to +1495
# Naive list concatention using '+' not used below as mypy was not happy
# lists of two different types could be added together.
all_stream_names = list(map(lambda output: output.name, chain(corrprod_outputs, beam_outputs)))
Contributor

Really? Mypy seems happy with the following for me:

Suggested change
# Naive list concatention using '+' not used below as mypy was not happy
# lists of two different types could be added together.
all_stream_names = list(map(lambda output: output.name, chain(corrprod_outputs, beam_outputs)))
# Naive list concatention using '+' not used below as mypy was not happy
# lists of two different types could be added together.
all_stream_names = [output.name for output in corrprod_outputs + beam_outputs]
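The suggested one-liner does type-check: modern typeshed overloads `list.__add__` so that concatenating lists of two element types yields a list of their union. A minimal stand-alone sketch, with hypothetical `CorrprodOutput`/`BOutput` stand-ins that only share a `.name` attribute:

```python
# Minimal stand-ins for the real output types; both expose a `.name`.
from dataclasses import dataclass


@dataclass
class CorrprodOutput:
    name: str


@dataclass
class BOutput:
    name: str


corrprod_outputs = [CorrprodOutput("baseline-correlation-products")]
beam_outputs = [BOutput("tied-array-channelised-voltage.0x")]

# mypy infers list[CorrprodOutput | BOutput] for the `+` result,
# so the comprehension type-checks without needing itertools.chain.
all_stream_names = [output.name for output in corrprod_outputs + beam_outputs]
```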


# the accumulation index where data stopped transmitting.
np.testing.assert_equal(
corrprod_results[stopped_stream][
capture_stop_chunk_id * HEAPS_PER_FENGINE_PER_CHUNK // heap_accumulation_threshold[0] :
Contributor

This line wrapping makes this whole statement hard to read. I'd suggest pulling the calculation into a separate statement (assigning it to a variable).
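The suggested refactor could look something like the sketch below. All values here are hypothetical placeholders, not the test's real configuration; only the shape of the change matters.

```python
import numpy as np

# Hypothetical values standing in for the test's real configuration.
HEAPS_PER_FENGINE_PER_CHUNK = 4
heap_accumulation_threshold = [8]
capture_stop_chunk_id = 6
# Placeholder results for one stopped stream: accumulations x channels.
stopped_results = np.zeros((8, 2))

# Naming the index keeps the assertion itself on a single readable line.
stop_accum_index = (
    capture_stop_chunk_id * HEAPS_PER_FENGINE_PER_CHUNK
    // heap_accumulation_threshold[0]
)
np.testing.assert_equal(stopped_results[stop_accum_index:], 0)
```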


for captured_stream in all_stream_names:
if captured_stream in corrprod_results.keys():
# NOTE: The `heap_factory` only sends real data. The results buffer has
Contributor

Um, it looks to me like it's sending 10+10j. Perhaps you mean that the correlations that result are purely real?

Contributor Author

Yes, apologies, I'm referring to the results. I noticed that when inspecting the corrprod_results ndarray.

heap = await stream.get()
items = ig_recv.update(heap)
assert (
len(list(items.values())) == 0
Contributor

I realise you haven't actually changed this code, just reflowed it, but I think this could just be

Suggested change
len(list(items.values())) == 0
len(items) == 0

That would probably allow it to go back to being a single line instead of awkwardly wrapped.

@@ -1201,7 +1212,7 @@ def heap_factory(batch_index: int, present: np.ndarray) -> list[spead2.send.Heap

expected_beam_saturated_low, expected_beam_saturated_high = verify_beam_data(
beam_outputs=beam_outputs,
beam_results=beam_results,
beam_results=list(beam_results.values()),
Contributor

This makes me slightly nervous: it depends on the iteration order of the beam_results dictionary matching the order of beam_outputs. That is guaranteed with the way you've written the code, but it's an implicit assumption that someone might break then spend time debugging. Some alternative ideas:

Suggested change
beam_results=list(beam_results.values()),
beam_results=[beam_results[output] for output in beam_outputs],
  1. Change the parameter in verify_beam_data to a dictionary.
  2. Add a comment where beam_results is defined to indicate that the iteration order is significant.

Same comment applies to the verify_corrprod_data call.
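The order-dependence concern can be seen in a small stand-alone sketch (hypothetical names, not the project's code): `list(beam_results.values())` silently inherits dict insertion order, while indexing by name stays correct however the dict was populated.

```python
# Hypothetical sketch of the keyed-lookup alternative.
from dataclasses import dataclass


@dataclass
class BOutput:
    name: str


beam_outputs = [BOutput("beam_0x"), BOutput("beam_0y")]

# Suppose the results dict was populated in a different order...
beam_results = {"beam_0y": [4, 5], "beam_0x": [2, 3]}

# list(beam_results.values()) would yield [[4, 5], [2, 3]] -- misaligned.
# Indexing by name still lines results up with beam_outputs:
ordered_results = [beam_results[output.name] for output in beam_outputs]
```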

Contributor Author
@amishatishpatel Nov 13, 2024

Noted all around and appreciate you flagging this. Must admit, I had similar concerns, but hadn't considered this particular case.

Regarding

  1. Change the parameter in verify_beam_data to a dictionary

I'm not quite seeing why that needs to be done if you've suggested list comprehension for the beam_results values?

  • That is, [my_dict[this_key] for this_key in my_keys] just results in a list of values (at least in how I've tested the logic in an ipython session)

EDIT: I know your suggestion is shorthand, but I have to add

  • beam_outputs is type list[BOutput], so that would be [beam_results[output.name] for output in ...]
  • If that affects your suggestion at all

Contributor

I meant do one of those things, not all of them.

