
Converted PrefixLM HF snapshot must enable cache for generation in config #780

Open
timsteuer opened this issue Dec 6, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@timsteuer

Environment

  • llmfoundry:latest

To reproduce

Steps to reproduce the behavior:

  1. Train a prefix-lm
  2. Convert it to Huggingface via llm-foundry/scripts/inference/convert_composer_to_hf.py
  3. Try to generate texts with the HF snapshot

-> When generating, the model throws an exception saying that use_cache must be enabled in the HF config.
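
For reference, a minimal repro sketch (the snapshot path and prompt are placeholders, not from the issue):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "path/to/hf_snapshot"  # placeholder: output of convert_composer_to_hf.py
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)

inputs = tokenizer("Hello", return_tensors="pt")
# Fails here: the model complains that use_cache must be enabled in the config.
outputs = model.generate(**inputs, max_new_tokens=20)
```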

Expected behavior

The model generates output text.
Manually editing the HF config and enabling the cache did the trick for me.
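
Concretely, a sketch of that manual edit, assuming a standard HF snapshot layout (the path is a placeholder):

```python
import json

config_path = "path/to/hf_snapshot/config.json"  # placeholder
with open(config_path) as f:
    config = json.load(f)

config["use_cache"] = True  # the flag the exception asks for

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```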

@timsteuer added the bug label Dec 6, 2023
@dakinggg
Collaborator

dakinggg commented Dec 6, 2023

use_cache is something that you can specify at model load time as a kwarg, or at generation time as a kwarg. We can probably make some adjustments to make this more automatic. Thanks!
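
A sketch of both options, using the standard transformers API (path and prompt are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "path/to/hf_snapshot"  # placeholder

# Option 1: pass use_cache at load time; it is written onto model.config.
model = AutoModelForCausalLM.from_pretrained(
    path, trust_remote_code=True, use_cache=True
)

# Option 2: pass use_cache per call as a generate() kwarg.
tokenizer = AutoTokenizer.from_pretrained(path)
inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, use_cache=True)
```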

@timsteuer
Author

Thanks for clarifying.

The key argument for having it set to True in the config by default is that generating with use_cache=False results in an exception anyway.

I guess the easiest way would be to adjust the conversion script accordingly.
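
For instance, a hypothetical tweak at the end of the conversion script, flipping the flag on the exported config before it is saved (variable names are illustrative, not the script's actual internals):

```python
from transformers import AutoConfig

output_dir = "path/to/hf_snapshot"  # placeholder: where the script wrote the snapshot
hf_config = AutoConfig.from_pretrained(output_dir, trust_remote_code=True)
hf_config.use_cache = True  # enable KV caching so generate() works out of the box
hf_config.save_pretrained(output_dir)
```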
