Optimize ops.triangular_solve() for 1x1 matrices #564
Merged
Addresses #559
This adds a scalar special case for the torch implementation of ops.triangular_solve. This op is frequently used in Gaussian funsor variable elimination, and the scalar case is very common.

Profiling results
I've tested on both a profiling jig (big speedup!) and the full pyro-cov model (minor speedup, roughly 5% in the numbers below). The jig is invoked via
The full model is invoked with the jit enabled.
Before: full model speed = 11.14 iters/sec
After: full model speed = 11.75 iters/sec
Tested
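For readers who want the idea behind the scalar special case described in this PR, here is a minimal sketch, not the code from the PR itself. It uses plain Python lists instead of torch tensors so it runs without dependencies. The function name `triangular_solve_sketch`, its argument order, and the `upper` flag are assumptions made for illustration. The point it shows is that for a 1x1 triangular matrix, solving `a @ x = b` reduces to a single division, so the general substitution loop can be skipped.

```python
def triangular_solve_sketch(b, a, upper=False):
    """Solve a @ x = b for x (hypothetical helper, not the funsor API).

    a: n-by-n triangular matrix as a list of rows.
    b: n-by-k right-hand side as a list of rows.
    """
    n = len(a)
    if n == 1:
        # Scalar special case: a 1x1 system is just elementwise division.
        pivot = a[0][0]
        return [[value / pivot for value in b[0]]]

    # General case: forward substitution (lower) or back substitution (upper).
    k = len(b[0])
    x = [[0.0] * k for _ in range(n)]
    rows = range(n - 1, -1, -1) if upper else range(n)
    for i in rows:
        # Indices of unknowns already solved before row i.
        solved = range(i + 1, n) if upper else range(i)
        for j in range(k):
            total = sum(a[i][m] * x[m][j] for m in solved)
            x[i][j] = (b[i][j] - total) / a[i][i]
    return x


scalar = triangular_solve_sketch([[6.0, 3.0]], [[2.0]])
lower = triangular_solve_sketch([[2.0], [9.0]], [[2.0, 0.0], [1.0, 4.0]])
upper = triangular_solve_sketch([[4.0], [8.0]], [[2.0, 1.0], [0.0, 4.0]], upper=True)
```

With tensors, the same shortcut would be a broadcast division of the right-hand side by the 1x1 matrix, which avoids dispatching to a general triangular solver for the common scalar case.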