Not working with Gemma2 9B IT #12

Open
pratik443 opened this issue Jan 16, 2025 · 1 comment

Comments

pratik443 commented Jan 16, 2025

Hello, thank you for the amazing work and code.
I'm trying to adapt the code to the Gemma 2 9B IT model. After changing the prompts to the required chat template and running the code, it gives the following error:

attn_weights = attn_weights + causal_mask
RuntimeError: The size of tensor a (7538) must match the size of tensor b (46) at non-singleton dimension 3
It seems like past_key_values and the current inputs are creating this problem. With usePrompt set to True the generation works, but just using the cache with questions from the HotpotQA dataset (usePrompt set to False) doesn't work.

Does this imply that I need to make changes in the generate function, or is there some other issue?

SpeedReach (Collaborator) commented

Hi @pratik443,

Could you provide the full script so we can reproduce the error?
My first guess is that the KV cache in Gemma may have a different dimension representation, but this would need experimentation to confirm.
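Not from the thread, but a minimal sketch of how one might check that hypothesis: run Gemma 2 with use_cache=True and print the per-layer key-cache shapes, then compare them with the shapes the project's generate code assumes. The model id and the shape-printing logic here are illustrative assumptions, not the project's actual setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"  # assumed model id for this check
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs, use_cache=True)

cache = out.past_key_values
# Newer transformers versions return a Cache object exposing a key_cache list;
# older versions return a tuple of (key, value) pairs per layer.
keys = cache.key_cache if hasattr(cache, "key_cache") else [k for k, _ in cache]
for i, k in enumerate(keys):
    # The usual layout is (batch, num_kv_heads, seq_len, head_dim); a seq_len
    # here that differs from what the generate code expects would explain the
    # causal_mask size mismatch reported above.
    print(f"layer {i}: key cache shape {tuple(k.shape)}")
```

If I recall correctly, Gemma 2 alternates sliding-window and global attention layers, so some layers' caches may not grow the way a standard decoder's cache does; that would be one way the dimensions could end up disagreeing with the mask.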
