All the notebooks are giving a lot of errors. I am trying this one: https://github.com/huggingface/trl/blob/main/examples/research_projects/toxicity/scripts/gpt-j-6b-toxicity.py
The PPOTrainer class also has a number of other required arguments now: for example, 'processing_class' instead of 'tokenizer', plus 'policy', 'ref_policy', 'reward_model', 'value_model', and 'train_dataset', and there is no 'optimizer' argument anymore. I resolved all of these by passing the correct arguments (a rough sketch of my trainer construction is included after the traceback below), but now I am stuck at a new error:
Traceback (most recent call last):
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 782, in convert_to_tensors
tensor = as_tensor(value)
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 738, in as_tensor
return torch.tensor(value)
ValueError: too many dimensions 'str'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/u/student/2020/ai20resch11003/RLHF_new/hf_example.py", line 235, in <module>
for epoch, batch in tqdm(enumerate(ppo_trainer.dataloader)):
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/accelerate/data_loader.py", line 552, in __iter__
current_batch = next(dataloader_iter)
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 673, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
return self.collate_fn(data)
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/data/data_collator.py", line 271, in __call__
batch = pad_without_fast_tokenizer_warning(
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/data/data_collator.py", line 66, in pad_without_fast_tokenizer_warning
padded = tokenizer.pad(*pad_args, **pad_kwargs)
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3548, in pad
return BatchEncoding(batch_outputs, tensor_type=return_tensors)
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 240, in __init__
self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
File "/u/student/2020/ai20resch11003/miniconda3/envs/rlhf_new/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 798, in convert_to_tensors
raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`filename` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
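For reference, here is roughly how I construct the trainer now. The argument names are the ones I found in my installed trl version and they differ between releases, and the model and dataset names are placeholders, so please treat this as an illustrative sketch rather than a working recipe:

```python
# Illustrative sketch only: the argument names ('processing_class', 'policy',
# 'ref_policy', 'reward_model', 'value_model', 'train_dataset') are the ones I
# found in my installed trl version and may differ in other releases; the model
# and dataset below are placeholders.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)
from trl import PPOConfig, PPOTrainer

policy_name = "EleutherAI/gpt-j-6b"      # base model used by the toxicity script
reward_name = "my-org/my-reward-model"   # placeholder reward/value checkpoint

tokenizer = AutoTokenizer.from_pretrained(policy_name)
tokenizer.pad_token = tokenizer.eos_token

policy = AutoModelForCausalLM.from_pretrained(policy_name)
ref_policy = AutoModelForCausalLM.from_pretrained(policy_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(reward_name, num_labels=1)
value_model = AutoModelForSequenceClassification.from_pretrained(reward_name, num_labels=1)

# Tiny placeholder dataset; the real script builds prompts from allenai/real-toxicity-prompts.
train_dataset = Dataset.from_dict({"input_ids": [tokenizer.encode("I wonder whether")]})

ppo_config = PPOConfig(output_dir="ppo_output")  # hyperparameters elided

ppo_trainer = PPOTrainer(
    ppo_config,                  # passed positionally; no 'optimizer' argument anymore
    processing_class=tokenizer,  # replaces the old 'tokenizer' argument
    policy=policy,
    ref_policy=ref_policy,
    reward_model=reward_model,
    value_model=value_model,
    train_dataset=train_dataset,
)
```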
If these notebooks are outdated, could you please mention the correct 'trl' and 'transformers' package versions that are supposed to be installed before using them? That would help a lot.
Please help :(
We know that a lot of notebooks/docs are outdated. Sorry for the inconvenience.
It was a deliberate choice that has allowed us to move faster on the library's evolution. For more information, see #2174 (comment). But you can be sure that it will soon be completely up to date.
Most docs and notebooks should work with trl==0.11.
I agree with you that the notebooks should mention it. Feel free to open a PR in that sense if you want to contribute.
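If you want to double-check what is installed in your environment, something like this works (the trl==0.11 pin is the one mentioned above; printing the transformers version alongside it is just for completeness, not a specific pin):

```python
# Quick sanity check of installed versions before running the notebooks.
# trl==0.11 is the pin suggested above; the transformers version is printed
# for reference only and is not pinned here.
import transformers
import trl

print("trl:", trl.__version__)                    # expect 0.11.x
print("transformers:", transformers.__version__)
```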
Thank you so much for your prompt reply. Changing the trl package version resolved the errors. I had been trying several RLHF code examples from Hugging Face and also from YouTube for a week, and all of them had multiple issues. I was stuck for so many days. Thanks again.