Qblox driver fixes for executing on iqm5q #1143
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## qblox #1143 +/- ##
==========================================
- Coverage 46.04% 45.82% -0.23%
==========================================
Files 88 88
Lines 4131 4151 +20
==========================================
Hits 1902 1902
- Misses 2229 2249 +20
@alecandido just to let you know the current status: the error I am currently getting is RuntimeError: connection command 'in0' (index 0): connection already exists, raised when trying to connect to the sequencers used for acquisition. I am not sure how the connection can already exist. If I bypass this error using try-except, the programs are submitted, but execution hangs forever (or at least for longer than expected, so I cancel it) in
The program sent
looks okay to me though. |
If it is multiplexing, that is a bit odd: I would not have expected any issue in connecting a different sequencer to the same port. About the hanging, you may try to print which is the input of the
Actually, multiplexing may be generating the error. This also reminds me that I may be wasting some sequencers (and we may not have enough), since we are treating probe and acquisition as different channels, while in the Qblox arrangement they may be controlled by the same processor (with one input and one output channel). In this case, the only solution I see is to disallow probe channels in the Qblox driver: take their settings to tune the joint probe-acquisition sequencer, but filter them out in sequence generation, configuration, and execution (i.e. we keep them only at the platform level, and access them through the related acquisition channel).
This was the first thing I checked; however, the input seems fine (~2.5 s in my case). Also, it seems that it is reaching
5bd2a58
to
14cc331
Compare
@stavros11 I should be done with most of the sequence generation refactor. I'll keep working on the base branch, but I do not expect to touch the file this PR is working on much, if at all (i.e.
d25e1c0
to
14cc331
Compare
P.S.: apparently, I lied, I had to modify |
14cc331
to
1bf5bfd
Compare
I rebased once more, and now there are no longer probe channels around, which reduces the number of sequencers used. Moreover, in 78983fc I fixed a bug in which a variable was shadowed: the call querying the status of the acquisition (and the subsequent ones attempting to retrieve it) was actually sending requests to a random object. @stavros11 whenever you rerun the code for your previous attempt, let me know if it still blocks as before.
1bf5bfd
to
47af393
Compare
Thanks for the updates @alecandido. I retried the same example and the status is the same as before. In particular I still get RuntimeError: connection command 'in0' (index 0): connection already exists from
Even though I can see that there are no sequencers for probe anymore, it still complains that the acquisition ones are already connected. Also, if I bypass this error, the execution still hangs forever in
|
4a6e29b
to
b7c6915
Compare
     )
-    return f"{direction}{channels}"
+    return f"out{channels}"
I am not very happy about this change; however, it was the easiest way to fix the RuntimeError: connection command 'in0' (index 0): connection already exists error. If you also don't have any idea why this happens, maybe it is worth asking qblox if it is a bug with in addresses?
I just checked, but there seems to be nothing wrong with in0, since that is used in
seq.connect_sequencer(address.local_address)
and the docs of that method say the following:
*connections (str) – Zero or more connections to make, each specified using a string. The string should have the format <direction><channel> or <direction><I-channel>_<Q-channel>. <direction> must be in to make a connection between an input and the acquisition path, out to make a connection from the waveform generator to an output, or io to do both. The channels must be integer channel indices. If only one channel is specified, the sequencer operates in real mode; if two channels are specified, it operates in complex mode.
https://docs.qblox.com/en/main/api_reference/sequencer.html#:~:text=Sequencer.connect_sequencer
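As a rough illustration of the string format described in the quoted documentation (a direction, then one channel index, or two indices joined by an underscore), the sketch below builds and validates such strings. The helper name `connection_string` and its checks are hypothetical; this is not the driver's code, nor part of qblox_instruments.

```python
from typing import Sequence

# Directions listed in the quoted documentation.
VALID_DIRECTIONS = ("in", "out", "io")


def connection_string(direction: str, channels: Sequence[int]) -> str:
    """Build a connection string such as "in0" or "out0_1".

    Hypothetical helper, for illustration only.
    """
    if direction not in VALID_DIRECTIONS:
        raise ValueError(
            f"direction must be one of {VALID_DIRECTIONS}, got {direction!r}"
        )
    if len(channels) not in (1, 2):
        # One channel: real mode. Two channels: complex mode.
        raise ValueError("expected one channel (real) or two (complex)")
    return direction + "_".join(str(ch) for ch in channels)


single = connection_string("in", [0])
pair = connection_string("out", [0, 1])
print(single, pair)
```

With a helper of this kind, replacing `direction` by a hard-coded "out" (as in the diff above) would turn an "in0" request into "out0", which is why the workaround avoids the error without explaining it.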
@@ -39,6 +39,7 @@ def integration_lenghts(
         (mod_id, i): seq.integration_length_acq()
         for mod_id, mod in modules.items()
         for i, seq in enumerate(mod.sequencers)
+        if hasattr(seq, "integration_length_acq")
This seems to be needed because non-QRM sequencers do not have this attribute, and accessing it raises an AttributeError. A different potential fix is to loop over QRMs only here.
Yes, though the error is somehow upstream: I should not have passed non-QRM modules in here, so we may even just filter modules before passing it to this function (to the point that even the type hint may be changed to Union[Qrm, QrmRf], instead of Module).
In principle, we may want to filter out even sequences. But that's just useless complexity, since non-QRM sequences are returning empty seq.integration_lengths anyhow...
(In a sense, I'd like to pass down the minimal amount of information to any function, but limiting the code complexity also has value.)
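The two options discussed in this thread (guarding each sequencer with `hasattr`, or filtering the modules before the call) can be compared on stand-in objects. Everything below is a sketch: the `Fake*` classes and the `is_qrm` flag are hypothetical placeholders, not the driver's or Qblox's types.

```python
from typing import Dict, Tuple


class FakeSequencer:
    """Stand-in for a non-QRM sequencer: no acquisition parameters."""


class FakeQrmSequencer(FakeSequencer):
    """Stand-in for a QRM sequencer exposing an integration length."""

    def __init__(self, length: int):
        self._length = length

    def integration_length_acq(self) -> int:
        return self._length


class FakeModule:
    def __init__(self, sequencers, is_qrm: bool):
        self.sequencers = sequencers
        self.is_qrm = is_qrm  # hypothetical flag, for illustration only


def lengths_guarded(modules) -> Dict[Tuple[int, int], int]:
    """Option 1: skip sequencers lacking the attribute, as in the diff."""
    return {
        (mod_id, i): seq.integration_length_acq()
        for mod_id, mod in modules.items()
        for i, seq in enumerate(mod.sequencers)
        if hasattr(seq, "integration_length_acq")
    }


def lengths_prefiltered(modules) -> Dict[Tuple[int, int], int]:
    """Option 2: drop non-QRM modules first, then query without a guard."""
    qrms = {mod_id: mod for mod_id, mod in modules.items() if mod.is_qrm}
    return {
        (mod_id, i): seq.integration_length_acq()
        for mod_id, mod in qrms.items()
        for i, seq in enumerate(mod.sequencers)
    }


modules = {
    12: FakeModule([FakeSequencer()], is_qrm=False),
    19: FakeModule([FakeQrmSequencer(2000)], is_qrm=True),
}
print(lengths_guarded(modules))
print(lengths_prefiltered(modules))
```

Both variants give the same mapping here; the second keeps the inner function free of guards at the cost of an extra filtering step in the caller.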
I tried to comment out both status checks (sequencer and acquisition), in which case no error is raised, but the list returned is filled only with
In general, I'm running an extremely simple program, consisting of just the following two sequences:
# 0-drive
move 0,R0 # init bin counter
move 0,R1 # init bin reset
move 10,R2 # init shots counter
wait_sync 4
start: play 0,1,40
wait 37860 # relaxation
move 4,R3
wait3: wait 65535
loop R3,@wait3
reset_ph # phase reset
add R0,1,R0 # bin increment
jge R2,1,@shots # skip bin reset - advance both counters
move R1,R0 # shots average: reset bin counter
shots: loop R2,@start # loop over shots
stop
# 0-acquisition
move 0,R0 # init bin counter
move 0,R1 # init bin reset
move 10,R2 # init shots counter
wait_sync 4
start: wait 40
acquire 0,R0,4
play 0,1,1996
wait 37860 # relaxation
move 4,R3
wait3: wait 65535
loop R3,@wait3
reset_ph # phase reset
add R0,1,R0 # bin increment
jge R2,1,@shots # skip bin reset - advance both counters
move R1,R0 # shots average: reset bin counter
shots: loop R2,@start # loop over shots
stop
Still, it seems that the sequencers are not actually stopping, not even with a timeout of 1 minute (the relaxation time is 300 us, and I'm asking for 10 shots...). I will try to reduce the relaxation time, to avoid at least that loop, and further simplify the program. Next, I could reduce the shots to 1, and check with singleshot (which should drop the bin counter reset instruction). (To me, failing the status check means that it did not reach the
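The relaxation block in both listings can be checked with a little arithmetic: `wait 37860` followed by four iterations of `wait 65535` adds up to 37860 + 4 × 65535 = 300000, matching the 300 us relaxation time quoted above if durations are in ns. The sketch below reproduces that split. It assumes 65535 is the largest single wait (inferred from the listing, not taken from the Qblox documentation), and the function names are hypothetical.

```python
from typing import Tuple

# Largest immediate seen in the `wait` instructions of the listing (assumption).
MAX_WAIT_NS = 65535


def split_wait(duration_ns: int, max_wait_ns: int = MAX_WAIT_NS) -> Tuple[int, int]:
    """Split a long wait into (remainder, number of full-length waits)."""
    loops, remainder = divmod(duration_ns, max_wait_ns)
    return remainder, loops


def total_wait(remainder_ns: int, loops: int, max_wait_ns: int = MAX_WAIT_NS) -> int:
    """Recombine the pieces, e.g. to verify a generated program."""
    return remainder_ns + loops * max_wait_ns


remainder, loops = split_wait(300_000)
print(remainder, loops)              # pieces of the 300 us relaxation
print(total_wait(remainder, loops))  # back to the full duration
```

With 10 shots of roughly 300 us each, the whole program should take a few milliseconds, so a sequencer still running after a minute points away from the program length itself.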
Ok, I extracted a snapshot of the modules involved (using some messy script...), filtering out values which are not very relevant. I do not see anything astounding in them. The only thing I notice is the
Now, continuing with our hypothesis that the problem is in the configs (and not in the program), we should try to extract something similar for the 0.1 driver, where everything is working. I still consider it likely that the problem is in the configs, since an incomplete or faulty execution should raise a different error (as would invalid assembly, and we have seen those errors). But, of course, I'm not certain of anything...
{
'parameters': {
'present': True,
'out0_lo_freq': 4305409000.0,
'out1_lo_freq': None,
'out0_lo_en': None,
'out1_lo_en': None,
'out0_att': None,
'out1_att': None,
'out0_offset_path0': None,
'out0_offset_path1': None,
'out1_offset_path0': None,
'out1_offset_path1': None,
'marker0_inv_en': None,
'marker1_inv_en': None
},
'submodules': {
'0/drive': {
'address': PortAddress(slot=12, ports=(1, None), input=False),
'original_name': 'sequencer0',
'parameters': {
'connect_out0': None,
'connect_out1': None,
'sync_en': True,
'nco_freq': -200286028.0,
'nco_phase_offs': None,
'nco_prop_delay_comp': None,
'nco_prop_delay_comp_en': None,
'marker_ovr_en': None,
'marker_ovr_value': None,
'cont_mode_en_awg_path0': None,
'cont_mode_waveform_idx_awg_path0': None,
'upsample_rate_awg_path0': None,
'gain_awg_path0': None,
'offset_awg_path0': 0.0,
'cont_mode_en_awg_path1': None,
'cont_mode_waveform_idx_awg_path1': None,
'upsample_rate_awg_path1': None,
'gain_awg_path1': None,
'offset_awg_path1': 0.0,
'mod_en_awg': True
},
'submodules': {}
}
}
}
{
'parameters': {
'present': True,
'out0_in0_lo_freq': 5100000000.0,
'out0_in0_lo_en': None,
'in0_att': None,
'out0_att': None,
'in0_offset_path0': None,
'in0_offset_path1': None,
'out0_offset_path0': None,
'out0_offset_path1': None,
'scope_acq_avg_mode_en_path0': None,
'scope_acq_avg_mode_en_path1': None,
'scope_acq_sequencer_select': None,
'marker0_inv_en': None,
'marker1_inv_en': None
},
'submodules': {
'0/acquisition': {
'address': PortAddress(slot=19, ports=(1, None), input=True),
'original_name': 'sequencer0',
'parameters': {
'connect_out0': None,
'connect_acq': None,
'sync_en': True,
'nco_freq': 133350000.0,
'nco_phase_offs': None,
'nco_prop_delay_comp': None,
'nco_prop_delay_comp_en': None,
'marker_ovr_en': None,
'marker_ovr_value': None,
'cont_mode_en_awg_path0': None,
'cont_mode_waveform_idx_awg_path0': None,
'upsample_rate_awg_path0': None,
'gain_awg_path0': None,
'offset_awg_path0': 0.0,
'cont_mode_en_awg_path1': None,
'cont_mode_waveform_idx_awg_path1': None,
'upsample_rate_awg_path1': None,
'gain_awg_path1': None,
'offset_awg_path1': 0.0,
'mod_en_awg': True,
'demod_en_acq': True,
'integration_length_acq': 2000,
'thresholded_acq_rotation': 224.13726591596674,
'thresholded_acq_threshold': 0.0018611258806628425,
'thresholded_acq_marker_en': None,
'thresholded_acq_marker_address': None,
'thresholded_acq_marker_invert': None
},
'submodules': {}
},
'1/acquisition': {
'address': PortAddress(slot=19, ports=(1, None), input=True),
'original_name': 'sequencer1',
'parameters': {
'connect_out0': None,
'connect_acq': None,
'sync_en': True,
'nco_freq': -165508000.0,
'nco_phase_offs': None,
'nco_prop_delay_comp': None,
'nco_prop_delay_comp_en': None,
'marker_ovr_en': None,
'marker_ovr_value': None,
'cont_mode_en_awg_path0': None,
'cont_mode_waveform_idx_awg_path0': None,
'upsample_rate_awg_path0': None,
'gain_awg_path0': None,
'offset_awg_path0': 0.0,
'cont_mode_en_awg_path1': None,
'cont_mode_waveform_idx_awg_path1': None,
'upsample_rate_awg_path1': None,
'gain_awg_path1': None,
'offset_awg_path1': 0.0,
'mod_en_awg': True,
'demod_en_acq': True,
'integration_length_acq': None,
'thresholded_acq_rotation': 184.52909640628286,
'thresholded_acq_threshold': 0.0023868974750940646,
'thresholded_acq_marker_en': None,
'thresholded_acq_marker_address': None,
'thresholded_acq_marker_invert': None
},
'submodules': {}
}
}
} |
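A dump like the two above can be produced with a short recursive helper. The sketch below assumes a QCoDeS-style snapshot, i.e. a nested dict where each level has a `parameters` mapping (name to a dict holding a `value` key) and a `submodules` mapping of the same shape. The function `condense` and the sample data are hypothetical; this is not the script used for the dumps in this thread.

```python
from typing import Any, Dict


def condense(snapshot: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only parameter values, recursing into submodules."""
    return {
        "parameters": {
            name: param.get("value")
            for name, param in snapshot.get("parameters", {}).items()
        },
        "submodules": {
            name: condense(sub)
            for name, sub in snapshot.get("submodules", {}).items()
        },
    }


# Made-up sample in the assumed shape (not real instrument output).
sample = {
    "name": "module19",
    "parameters": {"present": {"value": True, "ts": "..."}},
    "submodules": {
        "sequencer0": {
            "parameters": {"sync_en": {"value": True, "unit": ""}},
            "submodules": {},
        }
    },
}
condensed = condense(sample)
print(condensed)
```

Producing the same condensed view for the 0.1 driver would make a side-by-side comparison of the two configurations straightforward.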
Some more information, collected interactively after an execution (in which I commented out the status checks in the
(Pdb++) controller.cluster.get_system_status()
SystemStatus(status=<SystemStatuses.OKAY>, flags=[], slot_flags=SystemStatusSlotFlags())
(Pdb++) controller.cluster.get_operation_events()
0
(Pdb++) controller.cluster.get_assembler_status(12)
True
(Pdb++) controller.cluster.get_assembler_log(12)
'assembler finished successfully'
(Pdb++) controller.cluster.get_assembler_status(19)
True
(Pdb++) controller.cluster.get_num_system_error()
0
(Pdb++) controller.cluster.get_operation_complete()
True
(Pdb++) controller.cluster.get_assembler_log(19)
'assembler finished successfully'
(Pdb++) controller.cluster.get_sequencer_status(12, 0)
SequencerStatus(status=<SequencerStatuses.OKAY>, state=<SequencerStates.RUNNING>, info_flags=[], warn_flags=[], err_flags=[], log=[])
(Pdb++) controller.cluster.get_sequencer_status(19, 0)
SequencerStatus(status=<SequencerStatuses.OKAY>, state=<SequencerStates.RUNNING>, info_flags=[], warn_flags=[], err_flags=[], log=[])
(Pdb++) controller.cluster.get_sequencer_status(19, 1)
SequencerStatus(status=<SequencerStatuses.OKAY>, state=<SequencerStates.RUNNING>, info_flags=[], warn_flags=[], err_flags=[], log=[])
(Pdb++) str(hash(rx[0][1]))
'-4749133595706923953'
(Pdb++) controller.cluster.get_waveforms(12, 0)
{'-4749133595706923953,0': {'index': 0, 'data': [0.01834162510931492, 0.02468947507441044, 0.03271584212779999, 0.04266487807035446, 0.05478072538971901, 0.06924650073051453, 0.08618427067995071, 0.10559404641389847, 0.1273842602968216, 0.1512802541255951, 0.17685475945472717, 0.20355845987796783, 0.2306894063949585, 0.25733208656311035, 0.28263190388679504, 0.30561235547065735, 0.3252967894077301, 0.3409222662448883, 0.35175633430480957, 0.3572801947593689, 0.3572801947593689, 0.35175633430480957, 0.3409222662448883, 0.3252967894077301, 0.30561235547065735, 0.28263190388679504, 0.25733208656311035, 0.2306894063949585, 0.20355845987796783, 0.17685475945472717, 0.1512802541255951, 0.1273842602968216, 0.10559404641389847, 0.08618427067995071, 0.06924650073051453, 0.05478072538971901, 0.04266487807035446, 0.03271584212779999, 0.02468947507441044, 0.01834162510931492]}, '-4749133595706923953,1': {'index': 1, 'data': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}}
(Pdb++) controller.cluster.get_waveforms(19, 0)
{'5296115781354797103,0': {'index': 0, 'data': [0.19989623129367828, 0.19989623129367828, 0.199896231293678287828, ... |
Actually, concerning the hanging, I now wonder whether it is entirely due to the inactive sequencers.
(Pdb++) controller.cluster.get_sequencer_status(12, 0)
SequencerStatus(status=<SequencerStatuses.OKAY>, state=<SequencerStates.RUNNING>, info_flags=[], warn_flags=[], err_flags=[], log=[])
(Pdb++) controller.cluster.get_sequencer_status(19, 0)
SequencerStatus(status=<SequencerStatuses.OKAY>, state=<SequencerStates.RUNNING>, info_flags=[], warn_flags=[], err_flags=[], log=[])
(Pdb++) controller.cluster.get_sequencer_status(19, 1)
SequencerStatus(status=<SequencerStatuses.OKAY>, state=<SequencerStates.RUNNING>, info_flags=[], warn_flags=[], err_flags=[], log=[])
(Pdb++) controller.cluster.get_acquisition_status(19, 0)
False
(Pdb++) controller.cluster.get_acquisition_status(19, 1)
False
I'll restart from filtering them as the first thing, and then investigate the other two suspicious pieces of evidence we collected:
The option is just to execute Qblox tutorials, in particular some portion of the binned acquisition one, to check if the settings in there are sufficient (or sufficient in our configuration). |
d7d9faa
to
294be90
Compare
@stavros11 in 9cffc9b8 I actually decided to preserve the configuration of all sequencers, with the idea that it may be relevant to always set things like sweetspots (and possibly only them...). Instead, I'm filtering them from execution. So:
|
I tried to run a minimal program and inspect what was happening interactively. One example is the following:
move 0,R0 # init bin counter
move 0,R1 # init bin reset
move 2,R2 # init shots counter
wait_sync 4
start: wait 10
play 0,1,40
wait 10 # relaxation
reset_ph # phase reset
add R0,1,R0 # bin increment
jge R2,1,@shots # skip bin reset - advance both counters
move R1,R0 # shots average: reset bin counter
shots: loop R2,@start # loop over shots
stop
Actually, what I observed is that, above a certain threshold in the number of instructions, the sequencers keep their state as
Thus, it seems that the program is processed correctly as long as the queue is not filled; at that point even the Q1 processor has to wait for the real-time processor to process part of the queue. Now, given the small number of instructions in the program above, there is little room for culprits.
To exclude (or confirm) 2., I will now try to filter out
Ok, I have finally been able to acquire something:
# discrimination
[Qibo 0.2.16|INFO|2025-02-13 18:57:50]: Loading platform iqm5q
[Qibo 0.2.16|INFO|2025-02-13 18:57:55]: Minimal execution time: 5.13485
results
{
129213355251152: array([0., 1., 0., 0., 1., 0., 0., 1., 1., 1., 0., 0., 1., 1., 1., 1., 1.,
1., 1., 0., 0., 0., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1.,
1., 1., 0., 1., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 0., 1.,
0., 1., 1., 0., 1., 0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 1.,
1., 0., 1., 1., 1., 0., 1., 0., 1., 0., 1., 1., 0., 1., 0., 1., 0.,
1., 0., 0., 0., 1., 1., 1., 1., 0., 0., 1., 0., 1., 1., 0., 0., 0.,
1., 1., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 1., 1., 0.,
0., 1., 1., 1., 1., 0., 1., 0., 0., 1., 1., 0., 0., 1., 1., 1., 0.,
1., 1., 0., 0., 0., 1., 1., 1., 0., 1., 0., 0., 0., 1., 1., 0., 0.,
0., 0., 1., 1., 1., 0., 1., 0., 1., 0., 0., 1., 0., 1., 1., 0., 0.])
}
# integration
[Qibo 0.2.16|INFO|2025-02-13 18:59:38]: Loading platform iqm5q
[Qibo 0.2.16|INFO|2025-02-13 18:59:43]: Minimal execution time: 5.13485
results
{
124150003914432: array([[-1.86614558e-04, -2.91157792e-04],
[ 1.06008793e-04, 5.78895945e-05],
[ 4.68978994e-05, -1.60723009e-04],
[ 3.66389839e-06, -4.49438202e-05],
[-2.73815340e-04, 1.41670738e-04],
[-5.30043967e-05, -6.00879336e-05],
[ 3.04103566e-04, -1.18466048e-04],
[ 2.80166097e-04, 1.70493405e-04],
[ 2.83829995e-04, 2.66243283e-05],
...
[-9.03761602e-06, 8.10942843e-05],
[ 1.42159257e-04, -1.36541280e-04],
[ 1.24816805e-04, -8.86663410e-05],
[-3.00928188e-04, 2.63312164e-04],
[-3.58085002e-04, 5.22716170e-05]])
}
The problem was actually 2.: we were waiting forever for synchronization. 😱😫🤬 I'm not yet sure why synchronization is not working. But at least I finally discovered the single issue preventing us from running.
Ok, bc1aa77 may have permanently solved the problem! 🎉 |
At this point, I would even propose to merge this PR into #1088, and continue in there (or open a new one, if needed) |
bc1aa77
to
8fc8b13
Compare
Thanks for all the effort in debugging and solving the problem! My minimal example also works now and returns some acquisition, both with
I agree with merging since the initial goal is achieved. I will check the latest status of calibration of iqm5q with 0.1 and start testing some qibocal routines to also verify that the results make sense. |
@@ -190,7 +194,7 @@ def _execute(
             if len(seq_acqs) == 0:
                 # not an acquisition channel, or unused
                 continue
-            self.cluster.get_acquisition_status(slot, seq, timeout=10)
+            self.cluster.get_acquisition_status(slot, seq, timeout=1)
I am not sure exactly how timeout is used here, but if the call fails when the acquisition does not finish within that time, then one minute may actually be short for many experiments in practice. If that is the case, I am not sure there is a proper way to set it, as experiments can be arbitrarily long (at least within instrument limitations) and the only estimate is the one we calculate and log in the platform.
timeout is just controlling the length of a while loop, polling for completion:
https://gitlab.com/qblox/packages/software/qblox_instruments/-/blob/3a346b5b5f5ba8c6b0b223f0a882fa0002cf2690/qblox_instruments/native/generic_func.py?page=3#L2759-2773
It does not wait until the end of the timeout to return (if the condition is satisfied earlier, it returns immediately), but we do not need to put an arbitrarily long time there either, since in principle we are already waiting for the length of the experiment with the time.sleep() above.
So this is some extra buffer time. When I initially wrote 10, I thought it was a conservative 10 s; 1 min is even more...
In any case, we can fine-tune this value, but in practice it should change little to nothing.
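The polling behaviour described in this comment (return as soon as the condition holds, give up once the timeout expires) can be sketched generically as below. This is not the qblox_instruments implementation; `poll_until` is a hypothetical name, and the sketch counts the timeout in seconds, whereas the comment above suggests the real call counts minutes.

```python
import time
from typing import Callable


def poll_until(
    done: Callable[[], bool], timeout_s: float, interval_s: float = 0.01
) -> bool:
    """Poll `done` until it returns True or `timeout_s` elapses.

    Returns True as soon as the condition is met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if done():
            return True  # no need to wait out the rest of the timeout
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


start = time.monotonic()
finished = poll_until(lambda: True, timeout_s=60.0)
elapsed = time.monotonic() - start
print(finished, elapsed < 1.0)
```

Since the driver already sleeps for the estimated experiment duration before polling, the timeout only bounds the extra slack after that estimate, so a generous value costs nothing when the acquisition completes on time.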
I'm merging, but we can continue the discussions both here or in the PR, as needed :) |
Still draft since I was not able to fully execute yet. I am currently trying the following minimal example:
I will also put a comment with the relevant error under every attempted fix.