When I run the code via:

python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml

I get the following error:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using pad_token, but it is not set yet.
setting gradient_accumulation_steps=8 based on effective_batch_size=64 and instantaneous_bsz=8 (world_size=1, n_gpu=1)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
max_steps is given, it will override any value given in num_train_epochs
Setting train_dataloader.batch_size=8
Using amp half precision backend
/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
***** Running training *****
  Num examples = 3222656
  Num Epochs = 9223372036854775807
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 64
  Gradient Accumulation steps = 8
  Total optimization steps = 50354
Traceback (most recent call last):
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 153, in <module>
    train(args.checkpoint_path, config=config)
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 129, in train
    trainer.train(resume_from_checkpoint=checkpoint_path)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
    result = getattr(callback, event)(
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/apo/callbacks.py", line 135, in on_train_begin
    tokens_already_seen = kwargs.get('train_dataloader').dataset.datapipe.skip_tokens
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/torch/utils/data/datapipes/datapipe.py", line 129, in __getattr__
    raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{attribute_name}")
AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe
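For context: the crash happens because `apo/callbacks.py` reaches into `train_dataloader.dataset.datapipe`, but recent PyTorch releases wrap an `IterDataPipe` dataset in a `_IterDataPipeSerializationWrapper` before handing it to the `DataLoader`, and that wrapper only exposes the underlying pipe through its private `_datapipe` attribute. A minimal workaround sketch (untested against this repo, and relying on that private attribute, so treat it as an assumption rather than a supported fix):

```python
# Sketch of a patch for apo/callbacks.py, on_train_begin (around line 135).
# Assumption: on newer torch, train_dataloader.dataset is a
# _IterDataPipeSerializationWrapper whose wrapped pipe lives in the private
# attribute `_datapipe`; on older torch, the dataset is the pipe itself.
dataset = kwargs.get('train_dataloader').dataset
dataset = getattr(dataset, '_datapipe', dataset)  # unwrap if wrapped
tokens_already_seen = dataset.datapipe.skip_tokens
```

Alternatively, installing the torch version the repository was developed against (if one is pinned in its requirements) should avoid the wrapper entirely.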