tl;dr: how to cache part of a compute graph
Here is some example code that computes matmul twice.
Output:
Is there any way this can be automatically cached?
This is only a simple example. In a more complex scenario, caching is not so easy.
The Strassen algorithm needs to preprocess the matrices. In a model computing $A B$, where $A$ is fixed (model weights) and $B$ depends on the input, the preprocessing steps that depend only on $A$ can be cached. The algorithm is recursive, so a single matrix multiplication may expand into a balanced-tree-shaped compute graph.
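To make the Strassen case concrete, here is a minimal pure-Python sketch of the kind of caching I mean. It assumes square matrices whose size is a power of two, stored as nested lists. The names `preprocess_a` and `multiply_cached` are hypothetical and not taken from any library. `preprocess_a` builds the tree of operands that depend only on $A$, and `multiply_cached` reuses that tree for every new $B$.

```python
from typing import List

Matrix = List[List[int]]

def add(X: Matrix, Y: Matrix) -> Matrix:
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X: Matrix, Y: Matrix) -> Matrix:
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M: Matrix):
    """Split a square matrix into its four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def preprocess_a(A: Matrix):
    """Build the 7-ary tree of A-side Strassen operands (depends only on A)."""
    if len(A) == 1:
        return A[0][0]  # leaf: a plain scalar
    a11, a12, a21, a22 = split(A)
    # Left-hand factors of the seven Strassen products M1..M7.
    operands = [add(a11, a22), add(a21, a22), a11, a22,
                add(a11, a12), sub(a21, a11), sub(a12, a22)]
    return [preprocess_a(s) for s in operands]

def multiply_cached(tree, B: Matrix) -> Matrix:
    """Compute A @ B using the tree built by preprocess_a(A)."""
    if len(B) == 1:
        return [[tree * B[0][0]]]
    b11, b12, b21, b22 = split(B)
    # Right-hand factors of M1..M7; these depend on the input B.
    b_ops = [add(b11, b22), b11, sub(b12, b22), sub(b21, b11),
             b22, add(b11, b12), add(b21, b22)]
    m1, m2, m3, m4, m5, m6, m7 = [
        multiply_cached(t, b) for t, b in zip(tree, b_ops)
    ]
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    # Stitch the four quadrants back into one matrix.
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom

# The tree is built once for the fixed weights, then reused per input.
A = [[1, 2], [3, 4]]
tree = preprocess_a(A)
print(multiply_cached(tree, [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Here the caching is done by hand. What I am asking is whether the framework could work out automatically that the `preprocess_a` part of the graph depends only on constants and keep its result.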