Improve 2q block collection via 1q quaternion-based collection #13649

Open
wants to merge 6 commits into
base: main

jakelishman
Member

Summary

This is a small series of patches, which could be split into 2-3 separate PRs if preferred. There are two main goals:

  • introduce a quaternion-based, matrix-free mechanism for working with the $U(2)$ group (1q gates); it uses 5 floats, rather than 8 floats plus nd-matrix overhead.
  • improve the collection speed of ConsolidateBlocks

This doesn't do everything that could be done for ConsolidateBlocks, but I've stopped at the point where the most natural changes to me now are quite a bit larger.

Using this script:

from qiskit import transpile
from qiskit.converters import circuit_to_dag
from qiskit.circuit import library as lib
from qiskit.transpiler.passes import ConsolidateBlocks

pass_ = ConsolidateBlocks(basis_gates=["rz", "sx", "ecr"], force_consolidate=True)

# Some arbitrary circuit with lots of runs that use both 1q runs and 2q runs
# with gates in both directions (transpile will put them all in one direction).
qc_base = transpile(
    lib.quantum_volume(100, 100, seed=0),
    basis_gates=["rz", "sx", "ecr"],
    seed_transpiler=0,
)
qc = qc_base.copy_empty_like()
flip = False
for inst in qc_base.data:
    if len(inst.qubits) == 2:
        if flip:
            inst = inst.replace(qubits=inst.qubits[::-1])
        flip = not flip
    qc._append(inst)

dags = [circuit_to_dag(qc, copy_operations=False) for _ in range(100)]
# `%time` is an IPython magic, so run this script in IPython or Jupyter.
%time for dag in dags: pass_.run(dag)

I see a modest (~10%) improvement in the pass time, going from ~117ms to ~106ms.

Details and comments

Individual commit notes:

Add versor-based representation of 1q gates

1q gates are members of the group U(2), which we can represent as a scalar phase term and a member of SU(2). The members of SU(2) can be represented by versors (also called unit quaternions, but I got tired of typing that all the time...).

This adds a representation of versors and the group action to the Rust code, and ways to convert from matrix-based forms to them.
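To make the representation concrete, here is a minimal pure-Python sketch of the idea, not the Rust code in this PR. It assumes the convention $V = w I - i(xX + yY + zZ)$ for the SU(2) part, and the function names (`matrix_to_versor`, `versor_to_matrix`, `hamilton`, `compose`) are hypothetical. A gate is stored as one phase plus four quaternion components (5 floats), and composition uses the Hamilton product instead of a matrix multiplication.

```python
import cmath


def matrix_to_versor(u):
    """Split a 2x2 unitary into (phase, (w, x, y, z)) with U = e^{i*phase} * V."""
    (a, b), (c, d) = u
    # det(U) = e^{2i*phase}, so dividing out e^{i*phase} leaves V in SU(2).
    phase = cmath.phase(a * d - b * c) / 2
    scale = cmath.exp(-1j * phase)
    v00, v10 = a * scale, c * scale
    # Assumed convention: V = w*I - i*(x*X + y*Y + z*Z).
    return phase, (v00.real, -v10.imag, v10.real, -v00.imag)


def versor_to_matrix(phase, q):
    """Rebuild the 2x2 unitary from the phase and the versor components."""
    w, x, y, z = q
    g = cmath.exp(1j * phase)
    return (
        (g * complex(w, -z), g * complex(-y, -x)),
        (g * complex(y, -x), g * complex(w, z)),
    )


def hamilton(p, q):
    """Hamilton product p*q of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


def compose(first, second):
    """Versor form of applying `first` then `second` (matrix product second @ first)."""
    (p1, q1), (p2, q2) = first, second
    return p1 + p2, hamilton(q2, q1)


def rz(theta):
    return ((cmath.exp(-0.5j * theta), 0j), (0j, cmath.exp(0.5j * theta)))


sx = ((0.5 + 0.5j, 0.5 - 0.5j), (0.5 - 0.5j, 0.5 + 0.5j))

# An rz followed by an sx, composed without building any intermediate matrix.
phase, q = compose(matrix_to_versor(rz(0.3)), matrix_to_versor(sx))
```

The half-angle in the phase leaves a sign ambiguity between the phase and the versor, so two representations should be compared through the matrices they reconstruct rather than component by component.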

This commit introduces nalgebra as a dependency, to use its quaternion logic. This is a relatively heavy dependency, especially for something as simple as quaternions, but it is partly in anticipation of moving more matrix code to the static matrices of nalgebra, rather than the too-dynamic-for-our-needs ones of ndarray. faer also offers static matrices, but its APIs continue to fluctuate heavily between versions, and it requires ever higher MSRVs.

Use quaternions in 1q block collection

Switch the inner algorithm of ConsolidateBlocks to use the quaternion form for single-qubit matrix multiplications. This offered a speedup of a few percentage points in block collection for typical rz-sx-rz-sx-rz-type runs in quantum-volume-like circuits.

Avoid unnecessary allocations in qargs lookup

Switch the lookup logic for the qargs in the block collector to determine the qargs ordering without additional heap allocations. This offers another modest (~4%) improvement in collection performance for large runs.

Avoid allocations in simple matrix operations

Producing the Kronecker product of the two single-qubit matrices from the versor representation is trivially calculable, and can be written into an existing allocation. Similarly, switching the qubit order of a 2q matrix involves only six swaps, and does not need to allocate a new matrix if one is already available.
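As an illustration of both points, here is a hedged Python sketch using plain nested lists rather than the versor type or ndarray. The helper names `kron_into` and `swap_qubit_order_in_place` are hypothetical. Reversing the qubit order swaps basis states 1 and 2, which fixes the four entries whose row and column are both in {0, 3} and pairs up the remaining twelve, hence six swaps.

```python
def kron_into(a, b, out):
    """Write the Kronecker product of 2x2 matrices `a` and `b` into the 4x4 `out`."""
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    out[2 * i + k][2 * j + l] = a[i][j] * b[k][l]


# The six (row, col) pairs exchanged when basis states 1 and 2 trade places.
_SWAPS = (
    ((0, 1), (0, 2)),
    ((1, 0), (2, 0)),
    ((1, 1), (2, 2)),
    ((1, 2), (2, 1)),
    ((1, 3), (2, 3)),
    ((3, 1), (3, 2)),
)


def swap_qubit_order_in_place(m):
    """Reverse the qubit order of the 4x4 matrix `m` without allocating a new one."""
    for (r0, c0), (r1, c1) in _SWAPS:
        m[r0][c0], m[r1][c1] = m[r1][c1], m[r0][c0]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
buf = [[0] * 4 for _ in range(4)]  # allocated once, then reused
kron_into(a, b, buf)
swap_qubit_order_in_place(buf)  # buf now holds kron(b, a)
```

Swapping twice restores the original matrix, which makes for an easy sanity check.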

There are lots of places remaining in this code where more matrix allocations could be avoided. If nothing else, it should be possible to allocate only three 2q matrices in total, and keep shuffling the labelling of them when doing A.B -> C. ndarray does not make this easy, though; nalgebra and faer both have better interfaces for doing this, but currently all our matrix code in the Operation trait is in terms of ndarray.
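The three-matrix idea could look roughly like the following Python sketch. It is an assumption about how such a loop might be arranged, not what the PR implements, and `matmul_into` and `accumulate` are hypothetical names. Three buffers are allocated up front, and after each product only the labels of the accumulator and the output buffer are exchanged.

```python
def matmul_into(a, b, out):
    """Write the product a @ b into `out`; `out` must not alias `a` or `b`."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            out[i][j] = sum(a[i][k] * b[k][j] for k in range(n))


def accumulate(gates, dim=4):
    """Multiply `gates` together in application order using three fixed buffers."""
    acc = [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]
    nxt = [[0] * dim for _ in range(dim)]
    out = [[0] * dim for _ in range(dim)]
    for gate in gates:
        # Stand-in for writing the next gate's matrix into an existing buffer.
        for i in range(dim):
            nxt[i][:] = gate[i]
        matmul_into(nxt, acc, out)
        # Relabel instead of allocating: the result becomes the accumulator.
        acc, out = out, acc
    return acc


result = accumulate([[[1, 2], [3, 4]], [[0, 1], [1, 0]]], dim=2)
```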

@jakelishman added the labels performance, mod: quantum info, Changelog: None, Rust, and mod: transpiler on Jan 10, 2025
@jakelishman added this to the 2.0.0 milestone on Jan 10, 2025
@jakelishman requested a review from a team as a code owner on January 10, 2025 13:47
@qiskit-bot
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@jakelishman
Member Author

There are also more places we could make use of the versor representation, like in 1q gate optimisation, but that pass currently involves passing matrices through many different parts of its interface with itself, so it'll be more complicated to modify.

... which were necessary because I'd borked the matrix calculations and
forgotten to write any tests of them.
Member

@mtreinish mtreinish left a comment

Just some quick high-level comments. I missed the update and don't want these lost by GitHub weirdness. I'll review in more depth later.

@@ -0,0 +1,18 @@
// This code is part of Qiskit.
Member

Should we call this directory quantum_info to match what we do in python?

Member Author

I don't mind particularly - I read quantum_info in my head as qi most of the time anyway haha. I don't think we necessarily must match Python space, but if you prefer it for consistency I don't have any issues changing it.

// copyright notice, and modified files need to carry a notice indicating
// that they have been altered from the originals.

//! Quantum-information and linear-algebra related functionality, typically used as drivers for
Member

We might want to move over some linear algebra functionality like: https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/utils.rs and https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/synthesis/linear/utils.rs (although that first one might not be needed anymore).

Member Author

Yeah, I'm happy in a follow-up to move a few other bits over. I think there's other loose files and bits and bobs that could probably move into it too, just to keep things a bit more localised.

Comment on lines +22 to +23
const COS_PI_8: f64 = 0.9238795325112867;
const SIN_PI_8: f64 = 0.3826834323650898;
Member

Heh, I thought you didn't like PI_8 variable naming for PI / 8 :)

Member Author

I was aiming for consistency, really, but looking again I should have called it COS_FRAC_PI_8, perhaps?

Member

Yeah, that's probably the more consistent name with the built-in f64 consts and is harder to mess up by mistake.

@coveralls

coveralls commented Jan 10, 2025

Pull Request Test Coverage Report for Build 12718409257

Details

  • 290 of 332 (87.35%) changed or added relevant lines in 3 files are covered.
  • 10 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.01%) to 88.906%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| crates/accelerate/src/convert_2q_block_matrix.rs | 78 | 81 | 96.3% |
| crates/accelerate/src/qi/versor_gate.rs | 211 | 250 | 84.4% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| crates/accelerate/src/unitary_synthesis.rs | 1 | 93.18% |
| crates/qasm2/src/lex.rs | 3 | 92.98% |
| crates/qasm2/src/parse.rs | 6 | 97.15% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 12710188694 | -0.01% |
| Covered Lines | 79660 |
| Relevant Lines | 89600 |

💛 - Coveralls
