Problem running task 2 when wav2vec is specified as the ASR during training #165

Open
chenkai89 opened this issue Aug 30, 2024 · 0 comments

(vach) H:\AI\Vach\talkers\er_nerf>python data_utils/process.py data/dl/dl.mp4 --asr wav2vec --task 2
[INFO] ===== extract audio labels for data/dl\aud.wav =====
[WARN] audio has 2 channels, only use the first.
[INFO] loaded audio stream data/dl\aud.wav: (4481376,)
[INFO] loading ASR model cpierse/wav2vec2-large-xlsr-53-esperanto...
G:\anaconda3\envs\vach\lib\site-packages\transformers\configuration_utils.py:364: UserWarning: Passing gradient_checkpointing to a config initialization is deprecated and will be removed in v5 Transformers. Using model.gradient_checkpointing_enable() instead, or if you are using the Trainer API, pass gradient_checkpointing=True in your TrainingArguments.
warnings.warn(
G:\anaconda3\envs\vach\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
warnings.warn(
Traceback (most recent call last):
  File "H:\AI\Vach\talkers\er_nerf\nerf_triplane\asr.py", line 419, in <module>
    asr.run()
  File "H:\AI\Vach\talkers\er_nerf\nerf_triplane\asr.py", line 361, in run
    self.run_step()
  File "H:\AI\Vach\talkers\er_nerf\nerf_triplane\asr.py", line 222, in run_step
    self.feat_queue[start:end] = feats
RuntimeError: The expanded size of the tensor (50) must match the existing size (54) at non-singleton dimension 0. Target sizes: [50, 44]. Tensor sizes: [54, 44]
[INFO] ===== extracted audio labels =====
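
For reference, the RuntimeError says the slice `feat_queue[start:end]` has 50 rows while `feats` has 54 rows (both with 44 columns). Below is a minimal plain-Python sketch of that row-count mismatch. The helper `fit_rows` is hypothetical and is not part of er_nerf, PyTorch, or transformers. It only shows the shape arithmetic. Whether truncating or padding is appropriate depends on why 54 frames were produced, which the log does not show.

```python
from typing import List

Row = List[float]


def fit_rows(feats: List[Row], expected_rows: int) -> List[Row]:
    """Hypothetical helper: return exactly `expected_rows` rows.

    Extra rows are dropped from the end; missing rows are filled by
    repeating the last available row.
    """
    if expected_rows < 0:
        raise ValueError("expected_rows must be non-negative")
    if len(feats) >= expected_rows:
        return feats[:expected_rows]
    if not feats:
        raise ValueError("cannot pad an empty feature list")
    missing = expected_rows - len(feats)
    return feats + [list(feats[-1]) for _ in range(missing)]


# Sizes taken from the error message: 54 rows produced, 50 expected, 44 columns.
feats = [[0.0] * 44 for _ in range(54)]
window = fit_rows(feats, 50)
print(len(window), len(window[0]))  # prints "50 44"
```

With rows matching the target slice, an assignment like the one on line 222 of `asr.py` would no longer hit a size mismatch, though this does not address the underlying cause.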
