Why do we use q, k, k_max = (64, 64, 32) / (64, 128, 128) / (64, 128, 2e16) for sm80? #847
ZhangDY-6483
started this conversation in
Ideas
Replies: 1 comment 3 replies
Hi,
In generate_kernel.py, we generate the kernels for a specific arch with a few fixed shapes, such as q, k, k_max = (64, 64, 32) / (64, 128, 128) / (64, 128, 2e16) for sm80. Why were these three shapes selected rather than others?
![image](https://private-user-images.githubusercontent.com/64682152/265308115-007ac4fb-3a83-42dd-8a55-65f81b563671.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0NTQ2MTgsIm5iZiI6MTczOTQ1NDMxOCwicGF0aCI6Ii82NDY4MjE1Mi8yNjUzMDgxMTUtMDA3YWM0ZmItM2E4My00MmRkLThhNTUtNjVmODFiNTYzNjcxLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEzVDEzNDUxOFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWRiMGI1NWVhNmNkODE5MzgyNWEzOWVhZjhjOWQ3YjJiYTI0OTAyNWQ2ZTVhYjQ5MjhlZjczMjAwZjRkNWVmZTcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.NC7H3-q_AYvT3ujZrzBWxQVaHrQPGobj_Knx-VAhMWc)
(Did we enumerate all the possible shapes, benchmark them, and select the best combination?)
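
To make the question concrete, here is a minimal sketch of how I assume these shapes are used. It is not the code in generate_kernel.py: `TileShape`, `SM80_SHAPES`, and `pick_shape` are hypothetical names, and reading k_max as an upper bound on the head dimension that a kernel variant supports is my assumption.

```python
from typing import NamedTuple, Optional, Sequence


class TileShape(NamedTuple):
    q: int        # assumed: query rows handled per thread block
    k: int        # assumed: keys handled per inner-loop iteration
    k_max: float  # assumed: largest head dimension this variant supports


# The three sm80 shapes quoted in the question.
# 2e16 is kept exactly as written there.
SM80_SHAPES = [
    TileShape(64, 64, 32),
    TileShape(64, 128, 128),
    TileShape(64, 128, 2e16),
]


def pick_shape(
    head_dim: int, shapes: Sequence[TileShape] = SM80_SHAPES
) -> Optional[TileShape]:
    """Return the shape with the smallest k_max that still covers head_dim."""
    for shape in sorted(shapes, key=lambda s: s.k_max):
        if head_dim <= shape.k_max:
            return shape
    return None  # no generated kernel covers this head dimension
```

Under that reading, each head dimension falls into one of three buckets, and the question is how the bucket boundaries and the (q, k) tile sizes for each bucket were chosen.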