Hello, I would like to ask a question about parameter settings. I want to prune the Llama-2 model without changing the hidden_size, which stays fixed at 4096. However, I do want to reduce the num_heads of attention, i.e., prune each of the q/k/v/o projections from 4096 × 4096 to 4096 × 2048. Can I use the code to do this without changing anything? Also, I noticed that zs_block may contain 'qk_head_dim_z'. What does it do?
qk_head_dim_z is not supported in the current code yet; it was intended to prune head dimensions rather than full heads. The current code supports pruning only the heads, without pruning the hidden dimension, so your use case works: you just need to remove hidden from the prune_params. Let me know if you encounter any issues!
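For reference, here is a minimal sketch (not the repo's actual pruning code; the head count and mask below are illustrative) of what pruning full heads while keeping hidden_size fixed means for the projection shapes: the output rows of q/k/v and the matching input columns of o that belong to pruned heads are dropped, so each 4096 × 4096 projection becomes 4096 × 2048 when half of Llama-2 7B's 32 heads are removed. In practice the mask would come from the learned z masks in zs_block (the key name head_z here is an assumption).

```python
import torch

hidden_size = 4096  # kept fixed, as in the question
num_heads = 32      # Llama-2 7B attention heads
head_dim = hidden_size // num_heads  # 128

# Hypothetical head mask: keep the first 16 of 32 heads.
head_mask = torch.zeros(num_heads, dtype=torch.bool)
head_mask[:16] = True

# Expand the per-head mask to a per-row mask over the 4096 projection rows.
row_mask = head_mask.repeat_interleave(head_dim)  # shape (4096,)

# Dense Llama-2-sized projections, nn.Linear convention: weight is (out, in).
w_q = torch.randn(hidden_size, hidden_size)
w_o = torch.randn(hidden_size, hidden_size)

# Pruning heads drops output rows of q/k/v ...
w_q_pruned = w_q[row_mask, :]   # (2048, 4096)
# ... and the matching input columns of o, so hidden_size stays 4096.
w_o_pruned = w_o[:, row_mask]   # (4096, 2048)

print(w_q_pruned.shape, w_o_pruned.shape)
# torch.Size([2048, 4096]) torch.Size([4096, 2048])
```

Note the orientation: with torch.nn.Linear the weight is stored as (out_features, in_features), so a "4096 × 2048" q projection in math notation corresponds to a (2048, 4096) weight tensor.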