-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve 2q block collection via 1q quaternion-based collection #13649
base: main
Are you sure you want to change the base?
Conversation
1q gates are members of the group U(2), which we can represent as a scalar phase term and a member of SU(2). The members of SU(2) can be represented by versors (also called unit quaternions, but I got tired of typing that all the time...). This adds a representation of versors and the group action to the Rust code, and ways to convert from matrix-based forms to the them. This commit introduces `nalgebra` as a dependency, to use its quaternion logic. This is a relatively heavy dependency, especially for something as simple as quaternions, but some of this is in anticipation of moving more matrix code to the static matrices of `nalgebra`, rather than the too-dynamic-for-our-needs ones of `ndarray`; `faer` also offers static matrices, but its APIs continue to heavily fluctuate between versions, and it requires ever higher MSRVs.
Switch the inner algorithm of `ConsolidateBlocks` to use the quaternion form for single-qubit matrix multiplications. This offered a few percentage-points speedup for the block collection for typical `rz-sx-rz-sx-rz`-type runs in quantum-volume-like collections.
Switch the lookup logic for the qargs in the block collector to determine the qargs ordering without additional heap allocations. This offers another modest (~4%) improvement in collection performance for large runs.
Producing the Kronecker product of the two single-qubit matrices from the versor representation is trivially calculable, and can be written into an existing allocation. Similarly, switching the qubit order of a 2q matrix involves only six swaps, and does not need to allocate a new matrix if one is already available. There are lots of places remaining in this code where more matrix allocations could be avoided. If nothing else, it should be possible to allocate only three 2q matrices in total, and keep shuffling the labelling of them when doing `A.B -> C`. `ndarray` does not make this easy, though; `nalgebra` and `faer` both have better interfaces for doing this, but currently all our matrix code in the `Operation` trait is in terms of `ndarray`.
One or more of the following people are relevant to this code:
|
There's also more places we could make use of the versor representation, like in 1q gate optimisation, but that pass currently involves passing matrices through many different parts of its interface with itself, so it'll be more complicated to modify. |
... which were necessary because I'd borked the matrix calculations and forgotten to write any tests of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some quick high level comments. I missed the update and don't want these lost by github weirdness. I'll review in more depth later.
@@ -0,0 +1,18 @@ | |||
// This code is part of Qiskit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call this directory quantum_info
to match what we do in python?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind particularly - I read quantum_info
in my head as qi
most of the time anyway haha. I don't think we necessarily must match Python space, but if you prefer it for consistency I don't have any issues changing it.
// copyright notice, and modified files need to carry a notice indicating | ||
// that they have been altered from the originals. | ||
|
||
//! Quantum-information and linear-algebra related functionality, typically used as drivers for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to move over some linear algebra functionality like: https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/utils.rs and https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/synthesis/linear/utils.rs (although that first one might not be needed anymore).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm happy in a follow-up to move a few other bits over. I think there's other loose files and bits and bobs that could probably move into it too, just to keep things a bit more localised.
const COS_PI_8: f64 = 0.9238795325112867; | ||
const SIN_PI_8: f64 = 0.3826834323650898; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I thought you didn't like PI_8
variable naming for PI / 8 :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was aiming for consistency, really, but looking again I should have called it COS_FRAC_PI_8
, perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's probably the more consistent name with the built-in f64 consts and is harder to mess up by mistake.
Pull Request Test Coverage Report for Build 12718409257Details
💛 - Coveralls |
Summary
This is a small series of patches, which could be split into 2-3 separate PRs if preferred. There are two main goals:
ConsolidateBlocks
This doesn't do everything that could be done for
ConsolidateBlocks
, but I've stopped at the point where the most natural changes to me now are quite a bit larger.Using this script:
I see a modest (~10%) improvement in the pass time, going from ~117ms to ~106ms.
Details and comments
Individual commit notes:
Add versor-based representation of 1q gates
1q gates are members of the group U(2), which we can represent as a scalar phase term and a member of SU(2). The members of SU(2) can be represented by versors (also called unit quaternions, but I got tired of typing that all the time...).
This adds a representation of versors and the group action to the Rust code, and ways to convert from matrix-based forms to the them.
This commit introduces
nalgebra
as a dependency, to use its quaternion logic. This is a relatively heavy dependency, especially for something as simple as quaternions, but some of this is in anticipation of moving more matrix code to the static matrices ofnalgebra
, rather than the too-dynamic-for-our-needs ones ofndarray
;faer
also offers static matrices, but its APIs continue to heavily fluctuate between versions, and it requires ever higher MSRVs.Use quaternions in 1q block collection
Switch the inner algorithm of
ConsolidateBlocks
to use the quaternion form for single-qubit matrix multiplications. This offered a few percentage-points speedup for the block collection for typicalrz-sx-rz-sx-rz
-type runs in quantum-volume-like collections.Avoid unnecessary allocations in qargs lookup
Switch the lookup logic for the qargs in the block collector to determine the qargs ordering without additional heap allocations. This offers another modest (~4%) improvement in collection performance for large runs.
Avoid allocations in simple matrix operations
Producing the Kronecker product of the two single-qubit matrices from the versor representation is trivially calculable, and can be written into an existing allocation. Similarly, switching the qubit order of a 2q matrix involves only six swaps, and does not need to allocate a new matrix if one is already available.
There are lots of places remaining in this code where more matrix allocations could be avoided. If nothing else, it should be possible to allocate only three 2q matrices in total, and keep shuffling the labelling of them when doing
A.B -> C
.ndarray
does not make this easy, though;nalgebra
andfaer
both have better interfaces for doing this, but currently all our matrix code in theOperation
trait is in terms ofndarray
.