Replies: 3 comments 1 reply
-
Why can't you flatten it so that it is not nested in the 2nd case?
-
After more thinking, maybe it can. The first case is easily flattened, since batched LoRA and MoE are equivalent from a grouped GEMM perspective. In the second case, the outer and the inner grouped GEMMs are very different in nature, and it took me a while to realize that, at least conceptually, they can be framed as another instance of nested grouped GEMM. I'll see if I can get it actually working. If the conclusion turns out to be that any nested grouped GEMM problem can be flattened, that would be a very interesting learning for me.
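To make the flattening idea concrete, here is a minimal sketch (plain Python, not tied to any CUTLASS API; the nested-list representation and the (M, N, K) tuples are just assumptions for illustration) of collapsing an arbitrarily nested group structure into one flat grouped-GEMM problem list:

```python
# Illustrative only: a nested group is a list whose elements are either
# leaf GEMM problems, written as (M, N, K) tuples, or further nested groups.

def flatten_groups(group):
    """Recursively collect the leaf (M, N, K) problems of a nested group."""
    flat = []
    for item in group:
        if isinstance(item, tuple):           # leaf GEMM problem
            flat.append(item)
        else:                                 # nested sub-group
            flat.extend(flatten_groups(item))
    return flat

# Example: two outer groups, each containing its own inner grouped GEMM.
nested = [
    [(128, 64, 256), (96, 64, 256)],                # outer group 0
    [(32, 64, 256), (64, 64, 256), (48, 64, 256)],  # outer group 1
]
flat_problems = flatten_groups(nested)  # one flat grouped GEMM with 5 problems
```

Whether this is enough in practice depends on the case, of course, since the pointer and stride bookkeeping of the inner and outer problems also has to line up.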
-
Just remember that the ultimate goal of grouped GEMM is to saturate the GPU. If you put too many GEMMs in a grouped GEMM, it may not help. In this case, you may be able to run the inner grouped GEMM first, then the outer one.
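For what it's worth, a minimal NumPy sketch of that two-pass idea (the shapes and group sizes are made up for illustration; in practice each list of matmuls below would be a single grouped GEMM launch):

```python
# Illustrative only: run the inner grouped GEMM first, then feed its
# outputs into the outer grouped GEMM as a second, separate launch.
import numpy as np

rng = np.random.default_rng(0)

# Inner grouped GEMM: one problem per inner group (hypothetical shapes).
inner_inputs  = [rng.standard_normal((m, 256)) for m in (32, 64, 48)]
inner_weights = [rng.standard_normal((256, 128)) for _ in range(3)]
inner_outputs = [x @ w for x, w in zip(inner_inputs, inner_weights)]

# Outer grouped GEMM: one problem per outer group, consuming the inner results.
outer_weights = [rng.standard_normal((128, 64)) for _ in range(3)]
outer_outputs = [y @ w for y, w in zip(inner_outputs, outer_weights)]
```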
-
Hi, I have recently encountered a need for "nested" grouped GEMM twice, i.e., where each grouped-GEMM subproblem is itself another grouped GEMM instance.
A good example is composing grouped GEMM for batched LoRA on top of MoE grouped GEMM: the outer group corresponds to different LoRAs, and each LoRA weight has an inner group of experts. For this example specifically, I found a way to encode the nested grouped GEMM problem into a normal, but bigger, grouped GEMM problem.
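To make that encoding concrete, here is a minimal sketch (plain Python; the function name, token counts, and shapes are made-up placeholders rather than an existing API) of laying out the nested (LoRA, expert) groups as one flat grouped-GEMM problem list:

```python
# Illustrative only: batched LoRA over MoE grouped GEMM becomes one flat
# grouped GEMM with up to num_loras * num_experts problems.

def build_flat_problem_list(tokens_per_lora_expert, n, k):
    """tokens_per_lora_expert[l][e] is the number of tokens routed to expert e
    under LoRA l. Returns a flat list of (M, N, K) problem sizes, plus the
    (lora, expert) pair each flat problem came from."""
    problems, index = [], []
    for l, per_expert in enumerate(tokens_per_lora_expert):
        for e, m in enumerate(per_expert):
            if m == 0:
                continue  # skip empty (LoRA, expert) groups
            problems.append((m, n, k))
            index.append((l, e))
    return problems, index

# Example: 2 LoRAs, 3 experts each.
problems, index = build_flat_problem_list([[32, 0, 64], [16, 48, 8]], n=64, k=256)
# -> 5 problems in one grouped GEMM; `index` maps each back to its (LoRA, expert).
```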
But today I hit another case that I cannot work around with the existing API, so it made me wonder: has anyone encountered a similar situation? Is it feasible to implement such a "nested" grouped GEMM kernel?
cc @hwu36