
The code doesn't run due to AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe #10

Open
debajyotimaz opened this issue Jan 11, 2024 · 0 comments


When I run the code via:
python train.py --task configs/toxicity/pretrain.yml --method configs/toxicity/mle.yml

I get the following error:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using pad_token, but it is not set yet.
setting gradient_accumulation_steps=8 based on effective_batch_size=64 and instantaneous_bsz=8 (world_size=1, n_gpu=1)
setting max_steps=50354 based on num_tokens=3.30e+09 and tokens_already_seen=0.00e+00
max_steps is given, it will override any value given in num_train_epochs
Setting train_dataloader.batch_size=8
Using amp half precision backend
/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
***** Running training *****
  Num examples = 3222656
  Num Epochs = 9223372036854775807
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 64
  Gradient Accumulation steps = 8
  Total optimization steps = 50354
Traceback (most recent call last):
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 153, in <module>
    train(args.checkpoint_path, config=config)
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/train.py", line 129, in train
    trainer.train(resume_from_checkpoint=checkpoint_path)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer.py", line 1343, in train
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 347, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/transformers/trainer_callback.py", line 388, in call_event
    result = getattr(callback, event)(
  File "/data1/debajyoti/test/pre_human_feedback/pretraining-with-human-feedback/apo/callbacks.py", line 135, in on_train_begin
    tokens_already_seen = kwargs.get('train_dataloader').dataset.datapipe.skip_tokens
  File "/data1/debajyoti/test/pre_human_feedback/env/lib/python3.9/site-packages/torch/utils/data/datapipes/datapipe.py", line 129, in __getattr__
    raise AttributeError(f"'{self.__class__.__name__}' object has no attribute '{attribute_name}")
AttributeError: '_IterDataPipeSerializationWrapper' object has no attribute 'datapipe
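
For context: recent torch releases wrap an IterDataPipe handed to a DataLoader in an _IterDataPipeSerializationWrapper, which keeps the original pipe on the private _datapipe attribute, so the dataset.datapipe lookup in apo/callbacks.py no longer resolves. Below is a minimal sketch of a workaround, assuming that wrapping is the cause; note that _datapipe is a torch-internal attribute and may change between versions:

# apo/callbacks.py, in on_train_begin (around line 135) -- untested sketch
dataset = kwargs.get('train_dataloader').dataset
# Unwrap torch's serialization wrapper if present; the original pipe is kept
# on the private `_datapipe` attribute (torch-internal, may change).
if hasattr(dataset, '_datapipe'):
    dataset = dataset._datapipe
tokens_already_seen = dataset.datapipe.skip_tokens

Alternatively, pinning torch to an older version in which the DataLoader does not wrap datapipes may also avoid the error.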
