Improve and tidy implementation of hadamard gradient #6928
base: master
Conversation
Hello. You may have forgotten to update the changelog!
We should be consistently agnostic about list-versus-tuple in our gradient spec. In the case of the hadamard grad, we initially create lists because we are iterating and pushing things into the list. As much as I prefer tuples to lists, we shouldn't have to iteratively re-construct the whole data structure just to put it back into a tuple.
diff_methods = find_and_validate_gradient_methods(tape, "analytic", trainable_params)
trainable_param_indices = choose_trainable_params(tape, argnum)
diff_methods = {
    idx: _try_zero_grad_from_graph_or_get_grad_method(tape, tape.trainable_params[idx], True)
We no longer use Operator.grad_method to say whether or not we can do the hadamard grad with an operation. We just care whether it has a generator.
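For illustration, a check along these lines could look like the following (the helper name is hypothetical; it assumes PennyLane's `Operator.has_generator` property):

```python
import pennylane as qml

def _can_hadamard_diff(op) -> bool:
    # Hypothetical predicate: an operation qualifies for the hadamard gradient
    # when it defines a generator, not when it registers a grad_method.
    return op.has_generator

assert _can_hadamard_diff(qml.RX(0.5, wires=0))       # RX has generator -0.5 * X(0)
assert not _can_hadamard_diff(qml.Hadamard(wires=0))  # no parameter, no generator
```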
argnums = argnum or tape.trainable_params
g_tapes = []
coeffs = []
generators_per_parameter = []
I found this to be a little bit more specific than "gradient data".
for gen in generators:
if isinstance(trainable_op, qml.Rot):
I saw no reason to continue "natively" differentiating Rot. Decomposing it first is much cleaner in terms of code and abstraction, and consumes basically the same resources. It's actually much easier to inspect the gradient tapes too.
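For context, Rot already decomposes into elementary rotations, each with a simple Pauli generator (the values below are arbitrary):

```python
import pennylane as qml

# Rot(phi, theta, omega) decomposes into RZ/RY/RZ, so the decomposed gates can be
# differentiated through the generic generator-based path with no special casing.
print(qml.Rot(0.1, 0.2, 0.3, wires=0).decomposition())
# [RZ(0.1, wires=[0]), RY(0.2, wires=[0]), RZ(0.3, wires=[0])]
```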
coeffs.extend(sub_coeffs)
num_tape = 0
num_tape just ends up being len(generators) after we iterate through all the generators.
measurements.append(qml.probs(op=obs_new))
new_tape = qml.tape.QuantumScript(ops=ops, measurements=measurements, shots=tape.shots)
_rotations, _measurements = qml.tape.tape.rotations_and_diagonal_measurements(new_tape)
We have zero reason to do this. We can handle observables in any Pauli measurement basis just fine now.
multi_measurements = len(tape.measurements) > 1
multi_params = len(tape.trainable_params) > 1
def _get_pauli_generators(trainable_op):
We now rely on the Operator interface to provide a generic algorithm for getting the generator as a linear combination of Paulis, instead of having a bespoke registration for each operator. This allows us to differentiate any time-evolution op.
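A rough sketch of that kind of generic extraction, assuming `qml.generator` and the generator's `pauli_rep` are available (this is an illustration, not the actual `_get_pauli_generators` implementation):

```python
import pennylane as qml

def pauli_terms_of_generator(trainable_op):
    """Illustrative helper: return (coeffs, pauli_word_ops) for an op's generator."""
    # qml.generator returns the generator G of the parametrized operation (up to convention).
    gen = qml.generator(trainable_op, format="observable")
    # pauli_rep gives a PauliSentence mapping PauliWord -> coefficient
    # (it may be None for operators without a known Pauli decomposition).
    sentence = gen.pauli_rep
    coeffs = list(sentence.values())
    paulis = [pw.operation() for pw in sentence.keys()]
    return coeffs, paulis

# e.g. the generator of IsingXX(x, wires=[0, 1]) is a single X(0) @ X(1) term
print(pauli_terms_of_generator(qml.IsingXX(0.4, wires=[0, 1])))
```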
return qml.math.tensordot(res, projector, axes=[[1], [0]])
def processing_fn(results: qml.typing.ResultBatch, tape, coeffs, generators_per_parameter):
Pulled out into its own function to make it clearer which additional variables we rely on, and to reduce the overall complexity of _expval_hadamard_grad.
# Reordering to match the right shape for multiple measurements
grads_reorder = [[0] * len(tape.trainable_params) for _ in range(len(tape.measurements))]
for i in range(len(tape.measurements)):
By preparing the data structure in the correct shape, we don't need a follow-on step to reshape everything.
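A tiny, self-contained illustration of the idea (the names and shapes below are placeholders, not the real implementation):

```python
n_measurements, n_params = 2, 3

def reduce_partial_results(m: int, p: int) -> float:
    """Placeholder for reducing the executed-tape results for measurement m, parameter p."""
    return 0.0

# Allocate the (measurement, parameter) structure in its final shape and fill it in place,
# so no follow-on reordering/reshaping pass is needed.
grads = [[None] * n_params for _ in range(n_measurements)]
for m in range(n_measurements):
    for p in range(n_params):
        grads[m][p] = reduce_partial_results(m, p)
```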
if len(tape.measurements) == 1:
    return grads[0][0] if len(tape.trainable_params) == 1 else grads[0]
return tuple(g[0] for g in grads) if len(tape.trainable_params) == 1 else grads
I hate our return type spec right about now.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6928      +/-   ##
==========================================
- Coverage   99.59%   99.59%   -0.01%
==========================================
  Files         480      480
  Lines       45505    45455      -50
==========================================
- Hits        45320    45270      -50
  Misses        185      185

☔ View full report in Codecov by Sentry.
@@ -263,27 +268,6 @@ def hadamard_grad(
>>> params = jax.numpy.array([0.1, 0.2, 0.3])
>>> jax.jacobian(circuit)(params)
Array([-0.3875172 , -0.18884787, -0.38355704], dtype=float64)
.. note:: |
removed 👀
Co-authored-by: Yushao Chen (Jerry) <[email protected]>
Context:
We intend to make some improvements to the hadamard gradient to support variations of the differentiation method. To prepare for these extensions, I've done some minor reworking.
Description of the Change:

- Rot operations are now decomposed instead of having bespoke logic. It uses basically the same resources in the end, and it makes the code substantially cleaner.
- hadamard_grad now works with anything that has a generator, instead of a manually managed list of simple operators. This sets us up to apply the hadamard gradient to more complicated Hamiltonian time-evolution problems (see the sketch after this list).
- The postprocessing function is extracted to global scope. This makes it clearer which variables are needed from the preprocessing step.
- The postprocessing is slightly reworked to have fewer branches and to no longer need to reorder everything at the end. By populating the grad structure in the correct shape to begin with, we avoid a potentially expensive and badly scaling step.
- The postprocessing now returns lists instead of tuples. Our return-spec is supposed to be agnostic to lists versus tuples. If lists made sense for creating the structure to begin with, we shouldn't have to go back and cast them; that's an extra step we don't need if we just allow the other components to expect a list. Yes, I do generally prefer tuples to lists, but in this case the list makes sense.
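As a rough usage sketch of what the generator-based path enables (an illustration under assumptions: the `hadamard` diff method, `default.qubit`, and a spare device wire for the auxiliary qubit; not code from this PR):

```python
import pennylane as qml
from pennylane import numpy as pnp

# Three device wires so an unused wire is available as the auxiliary qubit.
dev = qml.device("default.qubit", wires=3)

@qml.qnode(dev, diff_method="hadamard")
def circuit(x):
    qml.Rot(x[0], x[1], x[2], wires=0)  # handled by decomposing into RZ/RY/RZ
    qml.IsingXX(x[3], wires=[0, 1])     # handled via its generator
    return qml.expval(qml.PauliZ(0))

x = pnp.array([0.1, 0.2, 0.3, 0.4], requires_grad=True)
print(qml.jacobian(circuit)(x))
```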
Benefits:
More generic code that is more versatile and easier to maintain.
Possible Drawbacks:
Related GitHub Issues: