Transformer number of heads #172

Open
bitfort opened this issue Jan 17, 2019 · 4 comments
Labels: Backlog (an issue to be discussed in a future Working Group, but not the immediate next one)

Comments

@bitfort

bitfort commented Jan 17, 2019

Currently the number of attention heads is 16. The proposal is to move to 8 heads, with the understanding that this achieves the same quality with better performance.

@bitfort added the "Next Meeting" label (item to be discussed in the next Working Group) Jan 17, 2019
@bitfort added the "AI" label (there is an action item here) Jan 31, 2019
@bitfort
Author

bitfort commented Jan 31, 2019

SWG Notes:

This reduces computation and Google has data to show this does not reduce quality. Also, this is commonly used in production.

AI(Google): spreadsheet showing the difference between runs.
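
To make the "reduces computation" note concrete, here is a rough back-of-the-envelope sketch, not taken from the Google analysis. It assumes the usual multi-head layout where the model width is split evenly across heads, so each head has width `d_model / num_heads`. The function name `attention_costs`, the width of 1024, and the sequence length of 256 are illustrative assumptions, not values stated in this issue. Under these assumptions the matmul work is unchanged when going from 16 to 8 heads, while the number of attention-score elements that pass through softmax (and must be stored) is halved.

```python
def attention_costs(seq_len, d_model, num_heads):
    """Rough per-layer, per-sequence counts for multi-head self-attention.

    Assumes d_model is split evenly across heads (an assumption of this
    sketch, not something stated in the issue).
    """
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    # Q @ K^T and scores @ V each cost seq_len^2 * d_head multiply-adds per head.
    score_matmul_macs = 2 * num_heads * seq_len * seq_len * d_head

    # One seq_len x seq_len score matrix per head goes through softmax.
    softmax_elements = num_heads * seq_len * seq_len

    return {
        "d_head": d_head,
        "score_matmul_macs": score_matmul_macs,
        "softmax_elements": softmax_elements,
    }


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    for heads in (16, 8):
        print(heads, attention_costs(seq_len=256, d_model=1024, num_heads=heads))
```

This only counts the attention-score part of a layer. The Q/K/V/output projections depend on `d_model` alone in this layout, so they are the same for both settings. Any measured speedup would come from the smaller softmax and score tensors and from larger per-head matmuls, and would need to be confirmed by actual runs.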

@danielcdh

We drafted a document analyzing the computation requirements and the convergence experiments for this change: https://docs.google.com/a/google.com/document/d/e/2PACX-1vR3qcsQSL6r4xvHQP9-R40Rq33qSF5yqm47esWRTbRPzremPYs6-ZNqpSypiyYXyRdE-D7VLayEUY_c/pub

@David-Levinthal

David-Levinthal commented Feb 21, 2019 via email

@petermattson
Contributor

SWG: Seems tentatively OK but want to check with customers and will finalize next week.

@bitfort added the "Backlog" label (an issue to be discussed in a future Working Group, but not the immediate next one) and removed the "AI" and "Next Meeting" labels Feb 13, 2020