Fix the issue of modeling_llama.LlamaRotaryEmbedding.forward being changed to an invalid method. #155
The following lines are called multiple times when exporting llama 3_2_3b:
https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/_shared/llama3/model.py#L300-L302
As a result, both modeling_llama.LlamaRotaryEmbedding.forward and modeling_llama.LlamaRotaryEmbedding._original_forward end up set to bypass_RotaryEmbedding. bypass_RotaryEmbedding simply returns position_ids, which is not a valid value for the original forward (both its shape and its contents are incorrect). Please refer to:
https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/_shared/llama3/model.py#L125-L137
The following line then raises an exception while preparing the inference jobs:
emb_size = embeddings[0].size(-1) // 2
This patch therefore adds a condition to ensure that modeling_llama.LlamaRotaryEmbedding._original_forward is set to modeling_llama.LlamaRotaryEmbedding.forward only once.
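
The failure mode and the guard can be illustrated with a minimal, self-contained sketch. It does not use the real transformers classes: make_rotary_class, patch_unguarded, and patch_guarded are hypothetical names introduced only for this illustration, and the stand-in forward returns placeholder strings instead of real tensors. The actual change in model.py may be structured differently.

```python
def make_rotary_class():
    """Build a fresh stand-in for modeling_llama.LlamaRotaryEmbedding (hypothetical)."""

    class RotaryEmbedding:
        def forward(self, x, position_ids):
            # Placeholder for the real (cos, sin) tensors.
            return ("cos", "sin")

    return RotaryEmbedding


def bypass_rotary_embedding(self, x, position_ids, *args, **kwargs):
    # Mirrors the bypass described above: position_ids is returned unchanged.
    return position_ids


def patch_unguarded(cls):
    # On a second call, cls.forward is already the bypass, so the saved
    # "original" is overwritten with the bypass as well.
    cls._original_forward = cls.forward
    cls.forward = bypass_rotary_embedding


def patch_guarded(cls):
    # Save the original forward only the first time the patch runs.
    if not hasattr(cls, "_original_forward"):
        cls._original_forward = cls.forward
    cls.forward = bypass_rotary_embedding


if __name__ == "__main__":
    broken = make_rotary_class()
    patch_unguarded(broken)
    patch_unguarded(broken)
    # The original forward is lost after the second call.
    print(broken._original_forward is bypass_rotary_embedding)

    fixed = make_rotary_class()
    patch_guarded(fixed)
    patch_guarded(fixed)
    # The original forward is still reachable.
    print(fixed()._original_forward(None, [0, 1]))
```

With the guard in place, calling the patch function any number of times leaves _original_forward pointing at the real implementation, which is the property the inference step depends on.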
Local tests pass:
![image](https://private-user-images.githubusercontent.com/19808744/406092258-c9325798-8a0f-4928-ad67-ca3ff7cf520f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkxNDYzMjQsIm5iZiI6MTczOTE0NjAyNCwicGF0aCI6Ii8xOTgwODc0NC80MDYwOTIyNTgtYzkzMjU3OTgtOGEwZi00OTI4LWFkNjctY2EzZmY3Y2Y1MjBmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTAlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEwVDAwMDcwNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTI3MTQxOWJkZTY1N2Y2NjE2YTRkNWZiOGE3YjYxOThjNWUxNjVlNzMzY2M1ODM3NmVlNDg5Y2M2ZGYzZjNhNjImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.HwvgbjjeVljFvmFVpcWUwniobpfemm4GVBlc9tt1iiM)
The inference job step now works correctly.