Update sdpa function with enable_gqa=True #191

Open
wants to merge 2 commits into
base: main
from

jainapurva

For the Llama model, set enable_gqa=True in the sdpa function call to use PyTorch's built-in grouped-query attention (GQA) support.

@facebook-github-bot added the CLA Signed label Jul 13, 2024
@jainapurva requested a review from drisspg July 13, 2024 03:56
Contributor

yanboliang commented Jul 26, 2024

I think we should wait a bit before merging this, since a lot of users are still on older versions of PyTorch that don't support enable_gqa. That said, I'm interested in how much perf gain we get from enabling the built-in GQA. Do you have numbers on A100?

3 participants