WIP: slimmed-down Levenberg-Marquardt for nonlinear least squares #500
pyGSTi's nonlinear least squares solver is a custom implementation of Levenberg-Marquardt with a dizzying array of options. The core algorithm is over 800 lines long and reaches 10 levels of indentation at its deepest point. This complexity makes the existing implementation all but impossible to extend.
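For context, the core of a Levenberg-Marquardt iteration is small. Below is a minimal textbook sketch, not pyGSTi's actual code; `f`, `jac`, and the damping-update constants are illustrative placeholders:

```python
import numpy as np

def levenberg_marquardt(f, jac, x0, mu=1e-3, max_iter=100, tol=1e-8):
    """Minimize 0.5*||f(x)||^2 with damped Gauss-Newton steps (textbook sketch)."""
    x = np.asarray(x0, dtype=float)
    r = f(x)
    for _ in range(max_iter):
        J = jac(x)
        g = J.T @ r                      # gradient of 0.5*||r||^2
        if np.linalg.norm(g, np.inf) < tol:
            break
        # Damped normal equations: (J^T J + mu*I) dx = -J^T r
        dx = np.linalg.solve(J.T @ J + mu * np.eye(x.size), -g)
        r_new = f(x + dx)
        if r_new @ r_new < r @ r:        # step reduced the residual: accept, relax damping
            x, r = x + dx, r_new
            mu = max(mu / 3.0, 1e-12)
        else:                            # step failed: reject, increase damping
            mu *= 2.0
    return x
```

Everything beyond this loop (per-iteration printing, clipping, custom damping modes, distributed-memory layouts, etc.) is where the 800 lines come from.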
This PR introduces slimmed-down infrastructure for nonlinear least squares. I started by copying customlm.py into a new file called simplerlm.py, then removed one feature at a time in a series of commits. "Removing" a feature meant making the default behavior for a given option the only behavior. I haven't tested these changes yet, but since each removal only hard-codes an existing default, they should work just fine.
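To illustrate the removal pattern with a hypothetical example (the option name and values below are made up, not pyGSTi's actual API):

```python
import numpy as np

# Before: an option with several behaviors, each adding a branch.
def damping_matrix(JTJ, mode="identity"):
    if mode == "identity":
        return np.eye(JTJ.shape[0])
    elif mode == "diag_JTJ":
        return np.diag(np.diag(JTJ))
    raise ValueError(f"unknown mode: {mode}")

# After: the default behavior is the only behavior, and the option is gone.
def damping_matrix_simple(JTJ):
    return np.eye(JTJ.shape[0])
```

Callers that used the default are unaffected; callers that passed a non-default value now fail loudly instead of silently changing behavior.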
This PR has two goals.
Notes
I tried to use ChatGPT to help split the monolithic custom_leastsq function in customlm.py into simpler functions. This turned out not to work, for reasons I'll discuss off-GitHub if people are interested.
For the curious, I'm doing this because I want to extend our nonlinear least squares solver in two ways. First, I want it to support pytorch Tensors; right now we only support numpy arrays and whatever home-cooked thing we have for distributed-memory computations. Second, I want to experiment with optimization algorithms that rarely (if ever!) require evaluating the full Jacobian of circuit outcome probabilities.