-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of XGM data recording the wrong number of pulses #161
base: master
Are you sure you want to change the base?
Conversation
src/extra/components/xgm.py
Outdated
for information on retrieving the true number of pulses. | ||
Warning: | ||
This can be unreliable, see the docs for | ||
[XGM.npulses()][extra.components.XGM.npulses] for information on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For someone in a hurry to get something done, and maybe not a native English speaker, I think it would be easy to read this as 'this method is bad, npulses()
is better'.
I think it might be worth repeating the pointer to XrayPulses
in each of these warnings, even if that's a bit more verbose, rather than pointing people to the docs of another method to find that information.
if not counts_match: | ||
warn(f"Slow data pulse counts ({key}) don't match the counts from fast data (data.intensityTD), data may be invalid!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we know the slow data counts aren't always reliable and the difference from the fast data is a useful signal of that, does that imply the fast data are more reliable? Should we be using that instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in my experience the fast data is more reliable and in some ways more useful than what we get from the pulses components because it will show the real number of pulses that are getting delivered. I don't know why but sometimes the number of pulses will change for... reasons... and for whatever reason that doesn't seem to be reflected in the bunch pattern table.
As for whether we should be using that instead of the slow data, I'm not sure. I am a little hesitant to make it do something other than the obvious 'just return whatever was saved', but as you say that's probably not very useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will show the real number of pulses that are getting delivered. I don't know why but sometimes the number of pulses will change for... reasons... and for whatever reason that doesn't seem to be reflected in the bunch pattern table.
Hu, do you have a concrete example? That should not be possible, as they both are looking at exactly the same data - the XGM DOOCS server takes the bunch pattern to slice out the correct values from the train signal for data.intensityTD
. That's the entire reason I don't trust it, it's an interpretation of the same signal through a 3rd party I don't control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
We could have actually offered a PulsePattern
object based on XGM data, this would have automatically given all the different methods like constant pattern, pulse counts etc.
if not np.allclose(pulse_counts[0], pulse_counts): | ||
raise ValueError("Number of pulses is changing, there is no nominal number.") | ||
|
||
self._npulses[pg] = int(pulse_counts[0]) | ||
|
||
return self._npulses[pg] | ||
|
||
def pulse_counts(self, sase=None): | ||
"""Return a 1D [DataArray][xarray.DataArray] of the number of pulses in each train. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry I didn't notice that earlier, but it is unfortunate we're using xarray.DataArray
here and pd.Series
for PulsePattern
-derived components. We should avoid mixing these types arbitrarily between components in the future.
Out of curiosity, any particular reason or just preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason, it's just a preference since I tend to use DataArrays/Datasets a lot.
In particular, by: - Explicitly checking for a wrong number of pulses by comparing fast/slow data and emitting a warning if they differ. - Add more obvious warnings to the docs.
b20a92b
to
53d15d4
Compare
I think I'm happy with the state of this now, in 53d15d4 I changed the |
In particular, by:
Follow up to #153.