Causal performer slower than causal regular attention #66

Open
JamesDeAntonis opened this issue Apr 23, 2021 · 3 comments

Comments

@JamesDeAntonis

For some reason, our causal Performer runs slower than regular causal attention. You observe that Performer is faster even in the causal case, right? I'm curious how to troubleshoot this. (We don't use the full PerformerLM, just CrossAttention and SelfAttention; not sure if that's relevant.)

@lucidrains
Owner

@JamesDeAntonis do you mean on training or eval?

@JamesDeAntonis
Author

JamesDeAntonis commented Apr 26, 2021

We observed it in both training and eval. I heard from here that the reason is caching? Are you still planning to implement it?

@lucidrains
Owner

@JamesDeAntonis training is as fast as it can be. basically, if you are training at a context length below 2048, you should expect it to be the same speed or slower than regular attention
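
A rough way to see why short contexts favor regular attention is to count multiply-adds. The sketch below is a back-of-the-envelope cost model, not a measurement of this repository: the function names, the head dimension `d = 64`, and the feature count `m = 256` are all illustrative assumptions. It ignores memory traffic and kernel efficiency, so the crossover seen in practice can sit well above the one it predicts.

```python
def softmax_attention_cost(n: int, d: int) -> int:
    """Multiply-adds for Q @ K^T plus weights @ V, each n * n * d."""
    return 2 * n * n * d


def linear_attention_cost(n: int, d: int, m: int) -> int:
    """Multiply-adds for a kernelized (Performer-style) attention layer.

    Assumed breakdown: project Q and K to m features (2 * n * m * d),
    accumulate K'^T V (n * m * d), read it out with Q' (n * m * d).
    """
    return 4 * n * m * d


def crossover_length(m: int) -> int:
    """Sequence length at which the two cost models above are equal."""
    return 2 * m


if __name__ == "__main__":
    d, m = 64, 256  # illustrative values, not the library defaults
    for n in (256, 512, 1024, 2048, 4096):
        ratio = softmax_attention_cost(n, d) / linear_attention_cost(n, d, m)
        print(f"n={n:5d}  softmax/linear cost ratio = {ratio:.2f}")
```

A ratio below 1 means regular attention does less arithmetic at that length. In this model the linear variant only wins once `n` exceeds `2 * m`, and its advantage grows linearly with `n` after that.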

eval should be really fast though, and that's something i could work on. it should be as fast as an RNN in the end. i'll take a look at it later this week!
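
The "as fast as an RNN" remark refers to the fact that causal linear attention can be decoded with a fixed-size running state instead of re-reading the whole prefix. The following is a minimal, dependency-free sketch of that idea, not this repository's implementation: `RecurrentLinearAttention` and `causal_linear_attention_full` are hypothetical names, and the `elu(x) + 1` feature map is a simple stand-in assumption for Performer's random features.

```python
import math


def feature_map(x):
    """Positive feature map elu(x) + 1 (a stand-in for Performer's random features)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


class RecurrentLinearAttention:
    """Keeps running sums so each new token costs O(m * d), regardless of history length."""

    def __init__(self, m, d):
        self.S = [[0.0] * d for _ in range(m)]  # running sum of outer(phi(k), v)
        self.z = [0.0] * m                      # running sum of phi(k)

    def step(self, q, k, v):
        phi_q, phi_k = feature_map(q), feature_map(k)
        # Fold the new key/value pair into the state.
        for i, ki in enumerate(phi_k):
            self.z[i] += ki
            row = self.S[i]
            for j, vj in enumerate(v):
                row[j] += ki * vj
        # Read out: phi(q)^T S, normalized by phi(q)^T z.
        denom = sum(qi * zi for qi, zi in zip(phi_q, self.z))
        return [
            sum(phi_q[i] * self.S[i][j] for i in range(len(phi_q))) / denom
            for j in range(len(v))
        ]


def causal_linear_attention_full(qs, ks, vs):
    """Reference version: recompute over the whole prefix at every position."""
    outs = []
    for t in range(len(qs)):
        phi_q = feature_map(qs[t])
        weights = [
            sum(a * b for a, b in zip(phi_q, feature_map(ks[s])))
            for s in range(t + 1)
        ]
        total = sum(weights)
        outs.append([
            sum(w * vs[s][j] for s, w in enumerate(weights)) / total
            for j in range(len(vs[0]))
        ])
    return outs
```

Calling `step` once per generated token gives the same outputs as the full recomputation, but the per-token work depends only on the state size, which is what makes cached eval cheap.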
