Improve handling of XGM data recording the wrong number of pulses #161

JamesWrigley · 2024-04-16T13:05:19Z

In particular, by:

- Explicitly checking for a wrong number of pulses by comparing fast/slow data and emitting a warning if they differ.
- Adding more obvious warnings to the docs.

Follow up to #153.

takluyver · 2024-04-18T15:03:36Z

src/extra/components/xgm.py

-        for information on retrieving the true number of pulses.
+        Warning:
+            This can be unreliable, see the docs for
+            [XGM.npulses()][extra.components.XGM.npulses] for information on


For someone in a hurry to get something done, and maybe not a native English speaker, I think it would be easy to read this as 'this method is bad, npulses() is better'.

I think it might be worth repeating the pointer to XrayPulses in each of these warnings, even if that's a bit more verbose, rather than pointing people to the docs of another method to find that information.

takluyver · 2024-04-18T15:04:55Z

src/extra/components/xgm.py

+            if not counts_match:
+                warn(f"Slow data pulse counts ({key}) don't match the counts from fast data (data.intensityTD), data may be invalid!")


If we know the slow data counts aren't always reliable and the difference from the fast data is a useful signal of that, does that imply the fast data are more reliable? Should we be using that instead?

JamesWrigley · 2024-04-18

Yes, in my experience the fast data is more reliable and in some ways more useful than what we get from the pulses components because it will show the real number of pulses that are getting delivered. I don't know why but sometimes the number of pulses will change for... reasons... and for whatever reason that doesn't seem to be reflected in the bunch pattern table.

As for whether we should be using that instead of the slow data, I'm not sure. I am a little hesitant to make it do something other than the obvious 'just return whatever was saved', but as you say that's probably not very useful.

philsmt · 2024-04-18

> it will show the real number of pulses that are getting delivered. I don't know why but sometimes the number of pulses will change for... reasons... and for whatever reason that doesn't seem to be reflected in the bunch pattern table.

Huh, do you have a concrete example? That should not be possible, as they are both looking at exactly the same data: the XGM DOOCS server takes the bunch pattern to slice out the correct values from the train signal for data.intensityTD. That's the entire reason I don't trust it, it's an interpretation of the same signal through a 3rd party I don't control.

JamesWrigley · 2024-04-18

Ok, so weirdly enough the XrayPulses component does report the right number but not PumpProbePulses 🐙 Example from p6156, r185 with the XGM:

And the pulses components:

My code was using PumpProbePulses so it was giving the wrong number of pulses. Is that a bug in PumpProbePulses?

philsmt · 2024-04-19

No, it's doing exactly what it is supposed to do:

PumpProbePulses creates a unified pattern from both, i.e. it looks at every possible pulse the machine could have, and counts it as a filled pulse if either FEL, PPL or both are present.

JamesWrigley · 2024-04-19

Ah ok, makes sense.
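To illustrate the unified pattern described above, here is a minimal sketch. It is not the PumpProbePulses implementation: `unified_pulse_count` and the boolean masks are hypothetical, with one entry per possible pulse slot in a train.

```python
def unified_pulse_count(fel_mask, ppl_mask):
    """Count slots where the FEL, the PPL, or both have a pulse.

    Hypothetical helper: each mask holds one boolean per possible pulse slot.
    """
    return sum(1 for fel, ppl in zip(fel_mask, ppl_mask) if fel or ppl)


# Made-up train with six possible slots.
fel = [True, False, True, False, False, False]
ppl = [True, True, False, False, True, False]

fel_only_count = sum(fel)                      # slots with an FEL pulse
unified_count = unified_pulse_count(fel, ppl)  # slots with FEL, PPL, or both
```

With these made-up masks the unified count is larger than the FEL-only count, which is the kind of difference seen when comparing PumpProbePulses with the XGM data.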

philsmt

LGTM.

We could have actually offered a PulsePattern object based on XGM data, this would have automatically given all the different methods like constant pattern, pulse counts etc.

philsmt · 2024-04-25T07:23:43Z

src/extra/components/xgm.py

+            if not np.allclose(pulse_counts[0], pulse_counts):
+                raise ValueError("Number of pulses is changing, there is no nominal number.")
+
+            self._npulses[pg] = int(pulse_counts[0])

        return self._npulses[pg]

    def pulse_counts(self, sase=None):
        """Return a 1D [DataArray][xarray.DataArray] of the number of pulses in each train.


I'm sorry I didn't notice that earlier, but it is unfortunate we're using xarray.DataArray here and pd.Series for PulsePattern-derived components. We should avoid mixing these types arbitrarily between components in the future.

Out of curiosity, any particular reason or just preference?

JamesWrigley · 2024-04-29

No reason, it's just a preference since I tend to use DataArrays/Datasets a lot.
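For reference, the nominal-count check in the `npulses` diff above amounts to something like the following. This is a simplified stand-alone sketch, not the code in xgm.py: `nominal_pulse_count` is a hypothetical name, it takes a plain list with one count per train, and it uses exact comparison rather than `np.allclose`.

```python
def nominal_pulse_count(pulse_counts):
    """Return the pulse count shared by all trains, or raise if it varies.

    Hypothetical helper: `pulse_counts` is a sequence with one count per train.
    """
    if len(pulse_counts) == 0:
        raise ValueError("No trains to take a pulse count from.")

    first = pulse_counts[0]
    if any(count != first for count in pulse_counts):
        raise ValueError("Number of pulses is changing, there is no nominal number.")

    return int(first)
```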

JamesWrigley · 2024-04-29T13:59:25Z

I think I'm happy with the state of this now. In 53d15d4 I changed pulse_counts() to return the fast data counts by default if a mismatch is detected. I'll leave this open for another round of review, but feel free to rebase and merge it if everyone agrees.
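The behaviour described here could be sketched roughly as follows. This is a stand-alone illustration, not the code from 53d15d4: `resolve_pulse_counts` and its arguments are hypothetical, and plain lists stand in for the per-train slow and fast counts.

```python
import warnings


def resolve_pulse_counts(slow_counts, fast_counts, slow_key="<slow data key>"):
    """Return per-train pulse counts, preferring fast data on a mismatch.

    Hypothetical helper: both arguments hold one pulse count per train.
    """
    if len(slow_counts) != len(fast_counts):
        raise ValueError("Slow and fast counts must cover the same trains.")

    counts_match = all(s == f for s, f in zip(slow_counts, fast_counts))
    if not counts_match:
        warnings.warn(
            f"Slow data pulse counts ({slow_key}) don't match the counts "
            "from fast data (data.intensityTD), using the fast data counts."
        )
        return list(fast_counts)

    return list(slow_counts)
```

The slow data is still returned whenever the two sources agree, so the fallback only changes the result for runs where the mismatch warning fires.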

JamesWrigley added the bug Something isn't working label Apr 16, 2024

JamesWrigley requested review from takluyver and philsmt April 16, 2024 13:05

JamesWrigley self-assigned this Apr 16, 2024

takluyver reviewed Apr 18, 2024

philsmt approved these changes Apr 25, 2024

JamesWrigley added 3 commits April 29, 2024 15:14

Improve handling of XGM data recording the wrong number of pulses

0a58000

In particular, by: - Explicitly checking for a wrong number of pulses by comparing fast/slow data and emitting a warning if they differ. - Add more obvious warnings to the docs.

fixup! Improve handling of XGM data recording the wrong number of pulses

260adcd

fixup! Improve handling of XGM data recording the wrong number of pulses

53d15d4

JamesWrigley force-pushed the xgm-pulses branch from b20a92b to 53d15d4 Compare April 29, 2024 13:53

fixup! Improve handling of XGM data recording the wrong number of pulses

abc5cff


