
The code doesn't run due to AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe #10

Open
debajyotimaz opened this issue Jan 11, 2024 · 0 comments


When I run the code via:
python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml

I get the following error:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using pad_token, but it is not set yet.
setting gradient_accumulation_steps=8 based on effective_batch_size=64 and instantaneous_bsz=8 (world_size=1, n_gpu=1)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
max_steps is given, it will override any value given in num_train_epochs
Setting train_dataloader.batch_size=8
Using amp half precision backend
/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
***** Running training *****
  Num examples = 3222656
  Num Epochs = 9223372036854775807
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 64
  Gradient Accumulation steps = 8
  Total optimization steps = 50354
Traceback (most recent call last):
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 153, in <module>
    train(args.checkpoint_path, config=config)
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 129, in train
    trainer.train(resume_from_checkpoint=checkpoint_path)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
    result = getattr(callback, event)(
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/apo/callbacks.py", line 135, in on_train_begin
    tokens_already_seen = kwargs.get('train_dataloader').dataset.datapipe.skip_tokens
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/torch/utils/data/datapipes/datapipe.py", line 129, in __getattr__
    raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{attribute_name}")
AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe
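
For context: recent torch releases wrap an IterDataPipe handed to a DataLoader in an _IterDataPipeSerializationWrapper, which keeps the original pipe on the private _datapipe attribute, so the dataset.datapipe lookup in apo/callbacks.py no longer resolves. Below is a minimal sketch of a workaround, assuming that wrapping is the cause; note that _datapipe is a torch-internal attribute and may change between versions:

# apo/callbacks.py, in on_train_begin (around line 135) -- untested sketch
dataset = kwargs.get('train_dataloader').dataset
# Unwrap torch's serialization wrapper if present; the original pipe is kept
# on the private `_datapipe` attribute (torch-internal, may change).
if hasattr(dataset, '_datapipe'):
    dataset = dataset._datapipe
tokens_already_seen = dataset.datapipe.skip_tokens

Alternatively, pinning torch to an older version in which the DataLoader does not wrap datapipes may also avoid the error.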
