
Recover attention scores #70

Open
carlomarxdk opened this issue Jun 1, 2021 · 3 comments

Comments

@carlomarxdk

Is it possible to recover the attention scores from the Fast Attention module?

@gaganbahga

I don't believe that's possible, because the order of computation is Q'((K'^T) V), so the full N×N attention matrix is never materialized. It would be interesting to hear if someone has a different idea or workaround.
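To make the point above concrete, here is a minimal numpy sketch (not code from this repo) showing that the two association orders give the same result, but the fast-attention order never forms the N×N matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 8, 4, 16           # sequence length, head dim, number of random features
phi_q = rng.random((n, m))   # stand-ins for the feature maps phi(Q), phi(K)
phi_k = rng.random((n, m))
v = rng.random((n, d))

# Quadratic order: explicitly forms the n x n "attention" matrix.
out_quadratic = (phi_q @ phi_k.T) @ v

# Linear order used by fast attention: the n x n matrix never exists,
# only the small (m x d) summary phi(K)^T V.
out_linear = phi_q @ (phi_k.T @ v)

assert np.allclose(out_quadratic, out_linear)
```

Matrix multiplication is associative, so the outputs match exactly; the scores are simply skipped, not hidden somewhere recoverable.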

@WintrumWang

In the Performer paper, the authors use a special V: a diagonal matrix of one-hot indicators (i.e., the identity matrix), so the attention outputs equal the attention scores themselves. I suggest reading the paragraphs around Figure 10 in the paper. However, I'm having trouble implementing this, because it's awkward to pass both the attention scores and the attention outputs to other functions/classes at the same time.

@WintrumWang

@lucidrains Could you please help us with implementing a way to obtain the attention weights?
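A minimal numpy sketch of the identity-V trick described above, assuming a FAVOR+-style positive random feature map (`favor_features` here is an illustrative stand-in, not the library's actual feature map): running the fast-attention path with V set to the identity returns the approximate N×N attention matrix itself.

```python
import numpy as np

def favor_features(x, proj):
    # Positive random features for the softmax kernel (FAVOR+-style):
    # phi(x) = exp(x @ w - ||x||^2 / 2) / sqrt(m), so that
    # E[phi(q) . phi(k)] = exp(q . k).
    m = proj.shape[1]
    return np.exp(x @ proj - np.sum(x**2, axis=-1, keepdims=True) / 2) / np.sqrt(m)

rng = np.random.default_rng(0)
n, d, m = 6, 4, 256                    # seq length, head dim, num features
q = rng.normal(size=(n, d)) / d**0.25  # pre-scaling absorbs the usual 1/sqrt(d)
k = rng.normal(size=(n, d)) / d**0.25
proj = rng.normal(size=(d, m))         # random projection shared by Q and K

q_p, k_p = favor_features(q, proj), favor_features(k, proj)

# The trick: run the usual fast-attention order phi(Q) @ (phi(K)^T V),
# but with V = identity, so the output IS the approximate n x n
# attention matrix (after row normalization).
v = np.eye(n)
num = q_p @ (k_p.T @ v)                # (n, n)
den = q_p @ k_p.sum(axis=0)            # row normalizer
approx_attn = num / den[:, None]       # rows sum to 1 by construction

# Exact softmax attention on the same (pre-scaled) Q, K for comparison.
logits = q @ k.T
exact_attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
exact_attn /= exact_attn.sum(axis=-1, keepdims=True)
```

Note this costs an extra forward pass with an N×N "value" matrix, which gives up the linear complexity; it is a diagnostic tool, as in Figure 10 of the paper, not something to leave in the training path.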
