
It does not work with Falcon-40B correctly #100

Open
AGrosserHH opened this issue Jun 23, 2023 · 0 comments
When using Falcon-40B with 'bloom-accelerate-inference.py', I first get the error:
"ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)"

After some changes I got it running, so that the call in the generate() function is now

    input_tokens = tokenizer.batch_encode_plus(batch_text_or_text_pairs=inputs,
                                               return_tensors="pt",
                                               padding=False,
                                               return_token_type_ids=False)

where previously it was

    input_tokens = tokenizer.batch_encode_plus(inputs, return_tensors="pt", padding=True)
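
I assume the equivalent workaround of dropping the key after encoding would silence the error just as well; an untested sketch:

    # Untested alternative: keep the original call, then delete the kwarg
    # that Falcon's forward() does not accept.
    input_tokens = tokenizer.batch_encode_plus(inputs, return_tensors="pt", padding=True)
    input_tokens.pop("token_type_ids", None)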

But now it always generates the same text:

"in=DeepSpeed is a machine learning framework
out=DeepSpeed is a machine learning framework"

Any idea why it is doing that?
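
For what it's worth, here is the minimal sanity check I would expect to produce new text; the max_new_tokens value is just a placeholder, not what the script actually passes:

    # Hypothetical single-prompt check: with greedy decoding and an explicit
    # token budget, the output should extend the input rather than repeat it.
    enc = tokenizer("DeepSpeed is a machine learning framework",
                    return_tensors="pt",
                    return_token_type_ids=False).to("cuda:0")
    out = model.generate(**enc, max_new_tokens=50, do_sample=False)
    print(tokenizer.decode(out[0], skip_special_tokens=True))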

Here is my changed generate function:
def generate():
    input_tokens = tokenizer.batch_encode_plus(batch_text_or_text_pairs=inputs,
                                               return_tensors="pt",
                                               padding=False,
                                               return_token_type_ids=False)
    # move every tensor in the encoded batch onto the first GPU
    for t in input_tokens:
        if torch.is_tensor(input_tokens[t]):
            input_tokens[t] = input_tokens[t].to("cuda:0")

    outputs = model.generate(**input_tokens, **generate_kwargs)

    # count how many tokens generate() added on top of each prompt
    input_tokens_lengths = [x.shape[0] for x in input_tokens.input_ids]
    output_tokens_lengths = [x.shape[0] for x in outputs]

    total_new_tokens = [o - i for i, o in zip(input_tokens_lengths, output_tokens_lengths)]
    outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(outputs)

    return zip(inputs, outputs, total_new_tokens)
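
And this is roughly how I drive it; the generate_kwargs values here are a sketch of my setup, not the script's exact defaults:

    # Hypothetical driver, assuming generate_kwargs along these lines.
    inputs = ["DeepSpeed is a machine learning framework"]
    generate_kwargs = dict(max_new_tokens=100, do_sample=False)
    for i, o, n in generate():
        print(f"in={i}\nout={o}\nnew tokens={n}")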