set_transform seems to process all the samples on the fly, not in batch_size chunks #6050
Replies: 1 comment 1 reply
- I logged the process: at the line `train_result = trainer.train(resume_from_checkpoint=checkpoint)`, `prepare_dataset_transform` is called, but the input batch size is 1, so it does not appear to be batched, and it seems to process all the samples.
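This matches how a format transform is applied: it runs lazily, at access time, on exactly the rows being fetched. When a PyTorch `DataLoader` pulls examples one index at a time, the transform therefore sees a "batch" of size 1 per call. A minimal pure-Python sketch (a stand-in for the behavior, not the actual `datasets` library) illustrates this:

```python
# Sketch of a lazily-applied format transform (mimicking set_transform).
# Nothing is processed when the transform is registered; it runs only
# when a row is accessed, and it receives just the accessed rows.

class LazyDataset:
    def __init__(self, rows):
        self.rows = rows          # list of dicts, one per example
        self.transform = None

    def set_transform(self, fn):
        # The transform is only stored here; no data is touched yet.
        self.transform = fn

    def __getitem__(self, idx):
        # Columns are materialized as a dict of lists (batch of size 1)
        # and the transform runs at access time.
        batch = {k: [self.rows[idx][k]] for k in self.rows[idx]}
        return self.transform(batch) if self.transform else batch

calls = []

def transform(batch):
    calls.append(len(batch["x"]))  # record the batch size seen per call
    return {"x2": [v * 2 for v in batch["x"]]}

ds = LazyDataset([{"x": 1}, {"x": 2}, {"x": 3}])
ds.set_transform(transform)
print(ds[0])   # {'x2': [2]} -- the transform ran only now, on one row
print(calls)   # [1] -- one call, batch size 1
```

So the transform is not processing the whole dataset up front; it is invoked once per fetched example, which is why the observed batch size is 1.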
- Two questions about set_transform, asking for help!
Basic info:
- datasets 2.13.1
- python 3.10
- torch 2.0.1
```python
def prepare_dataset_transform(batch):
    # process audio: batch[audio_column_name] is a list of audio dicts
    sample = batch[audio_column_name]
    array_input = [audio["array"] for audio in sample]
    inputs = feature_extractor(
        array_input,
        sampling_rate=sample[0]["sampling_rate"],
        return_attention_mask=forward_attention_mask,
    )
    return inputs
```
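For context on the data shapes this transform handles: `audio_column_name`, `feature_extractor`, and `forward_attention_mask` come from the surrounding training script and are not defined in the snippet, so the sketch below uses stand-ins for them. It shows the dict-of-lists batch format the transform receives, where each audio entry is a dict with `"array"` and `"sampling_rate"` keys:

```python
# Self-contained sketch with stand-ins for names defined elsewhere in
# the original script (audio_column_name, feature_extractor,
# forward_attention_mask are assumptions here).
audio_column_name = "audio"
forward_attention_mask = True

def feature_extractor(arrays, sampling_rate, return_attention_mask):
    # Stand-in: a real extractor would pad/normalize the arrays;
    # this one just echoes its inputs.
    return {"input_values": arrays, "sampling_rate": sampling_rate}

def prepare_dataset_transform(batch):
    sample = batch[audio_column_name]            # list of audio dicts
    array_input = [audio["array"] for audio in sample]
    return feature_extractor(
        array_input,
        sampling_rate=sample[0]["sampling_rate"],
        return_attention_mask=forward_attention_mask,
    )

# A batch is a dict of lists; with on-access transforms the lists often
# hold a single example.
batch = {"audio": [{"array": [0.1, 0.2], "sampling_rate": 16000},
                   {"array": [0.3], "sampling_rate": 16000}]}
out = prepare_dataset_transform(batch)
print(out["sampling_rate"])   # 16000
```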