
[Prompt Template] Silent bug - Performance Killer #27

Open
timothylimyl opened this issue Jan 5, 2024 · 0 comments

@timothylimyl

Hi,

I found that the prompts generated from the datasets (e.g. MMLU) are not wrapped in the model's prompt template, which degrades the performance you get out of the model. If you look at the mmlu.py script, you will see that the prompt the model runs on never goes through the template the model expects.

def evaluate(args, subject, model: EvalModel, dev_df, test_df):
    ...
        prompt = prompt_input_template(prompt, model)  # personally added
        pred = model.run(prompt)
    ...

-> modeling.py

    def run(self, prompt: str, **kwargs) -> str:
        self.load()
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
        if "RWForCausalLM" in str(type(self.model)):
            inputs.pop("token_type_ids")  # Not used by Falcon model

        outputs = self.model.generate(
            **inputs,
            max_new_tokens=self.max_output_length,
            pad_token_id=self.tokenizer.eos_token_id,  # Avoid pad token warning
            **kwargs,
        )
        batch_size, length = inputs.input_ids.shape
        return self.tokenizer.decode(outputs[0, length:], skip_special_tokens=True)

I personally added a prompt_input_template() function to solve this issue. You will still get correct outputs from time to time, which is why this is a silent bug (a common ML issue).
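
A minimal sketch of what such a helper could look like (not my exact code; the template strings are illustrative and I am assuming the model object exposes its checkpoint name via something like a model_path attribute):

    # Illustrative per-model prompt templates; the strings must match each
    # model's documented chat format.
    PROMPT_TEMPLATES = {
        "llama-2": "[INST] {prompt} [/INST]",
        "vicuna": "USER: {prompt} ASSISTANT:",
    }

    def prompt_input_template(prompt: str, model) -> str:
        # Assumes `model.model_path` identifies the checkpoint. Base pretrained
        # models and API models fall through and keep the raw prompt.
        model_name = getattr(model, "model_path", "").lower()
        for key, template in PROMPT_TEMPLATES.items():
            if key in model_name:
                return template.format(prompt=prompt)
        return prompt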

It will be hard for you to solve this generically for every model, because Hugging Face only recently recognised this problem and added chat templates to the tokenizer (ref: https://huggingface.co/docs/transformers/chat_templating). Once the open-source community adopts this (which I think will eventually be the case), you can use apply_chat_template to solve this issue. For now, you can add individual per-model mappings (like the sketch above) to templatize the prompts accordingly. For base pretrained models or API calls, this is not an issue.
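
Once a tokenizer ships a chat template, the helper could collapse to something like this (a sketch, assuming a recent transformers version and that model.tokenizer has already been loaded):

    def prompt_input_template(prompt: str, model) -> str:
        # Use the tokenizer's built-in chat template when it defines one;
        # otherwise leave the prompt untouched (base pretrained models).
        tokenizer = model.tokenizer
        if getattr(tokenizer, "chat_template", None):
            messages = [{"role": "user", "content": prompt}]
            return tokenizer.apply_chat_template(
                messages, tokenize=False, add_generation_prompt=True
            )
        return prompt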
