
It does not work with Falcon-40B correctly #100

Open
AGrosserHH opened this issue Jun 23, 2023 · 0 comments
When using Falcon-40B with 'bloom-accelerate-inference.py', I first get the error:
"ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)"

After some changes I got it running, so that the call in the generate() function is now

    input_tokens = tokenizer.batch_encode_plus(batch_text_or_text_pairs=inputs,
                                               return_tensors="pt",
                                               padding=False,
                                               return_token_type_ids=False)

where previously it was

    input_tokens = tokenizer.batch_encode_plus(inputs, return_tensors="pt", padding=True)
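
I assume the equivalent workaround of dropping the key after encoding would silence the error just as well; an untested sketch:

    # Untested alternative: keep the original call, then delete the kwarg
    # that Falcon's forward() does not accept.
    input_tokens = tokenizer.batch_encode_plus(inputs, return_tensors="pt", padding=True)
    input_tokens.pop("token_type_ids", None)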

But now it always generates the same text:

"in=DeepSpeed is a machine learning framework
out=DeepSpeed is a machine learning framework"

Any idea why it is doing that?
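
For what it's worth, here is the minimal sanity check I would expect to produce new text; the max_new_tokens value is just a placeholder, not what the script actually passes:

    # Hypothetical single-prompt check: with greedy decoding and an explicit
    # token budget, the output should extend the input rather than repeat it.
    enc = tokenizer("DeepSpeed is a machine learning framework",
                    return_tensors="pt",
                    return_token_type_ids=False).to("cuda:0")
    out = model.generate(**enc, max_new_tokens=50, do_sample=False)
    print(tokenizer.decode(out[0], skip_special_tokens=True))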

Here is my changed generate function:
def generate():
    input_tokens = tokenizer.batch_encode_plus(batch_text_or_text_pairs=inputs,
                                               return_tensors="pt",
                                               padding=False,
                                               return_token_type_ids=False)
    # move every tensor in the encoded batch onto the first GPU
    for t in input_tokens:
        if torch.is_tensor(input_tokens[t]):
            input_tokens[t] = input_tokens[t].to("cuda:0")

    outputs = model.generate(**input_tokens, **generate_kwargs)

    # count how many tokens generate() added on top of each prompt
    input_tokens_lengths = [x.shape[0] for x in input_tokens.input_ids]
    output_tokens_lengths = [x.shape[0] for x in outputs]

    total_new_tokens = [o - i for i, o in zip(input_tokens_lengths, output_tokens_lengths)]
    outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(outputs)

    return zip(inputs, outputs, total_new_tokens)
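
And this is roughly how I drive it; the generate_kwargs values here are a sketch of my setup, not the script's exact defaults:

    # Hypothetical driver, assuming generate_kwargs along these lines.
    inputs = ["DeepSpeed is a machine learning framework"]
    generate_kwargs = dict(max_new_tokens=100, do_sample=False)
    for i, o, n in generate():
        print(f"in={i}\nout={o}\nnew tokens={n}")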