Finetuning bugs #100

Merged
saattrupdan merged 55 commits into main from fix/whisper-finetuning on Oct 23, 2024

Conversation

@saattrupdan (Collaborator) commented Oct 23, 2024

This fixes several issues with the finetuning:

  1. Interleaving already-processed datasets sometimes caused issues, so the dataset processing is now applied after the interleaving instead (see the first sketch after this list).
  2. When finetuning Whisper models, WhisperConfig.max_length is now used as the upper bound for the tokenizer's model_max_length. Previously the upper bound was hard-coded to 512, but some Whisper models require a shorter context length. This context length applies to the transcriptions (i.e., the labels) only (second sketch below).
  3. In multi-GPU setups we previously forced padding to 'max_length'. I don't recall why that was necessary, and it now (for some reason) leads to issues, so I've commented out the block forcing 'max_length' padding, along with a TODO comment noting that this might change in the future (third sketch below).
  4. PyTorch dataloaders try to be clever and split up the batches across the devices in a multi-GPU setup. Since we already handle that splitting ourselves, dispatch_batches is now set to False (fourth sketch below).

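A minimal sketch of fix (1), using toy in-memory datasets and a hypothetical preprocess function purely to illustrate the new ordering:

```python
from datasets import Dataset, interleave_datasets

# Toy stand-ins for the real raw datasets.
raw_a = Dataset.from_dict({"text": ["Hello", "World"]})
raw_b = Dataset.from_dict({"text": ["Foo", "Bar"]})


def preprocess(example: dict) -> dict:
    """Hypothetical per-example processing function."""
    example["text"] = example["text"].lower()
    return example


# Old (problematic) order: preprocess each dataset separately, then interleave.
# New order: interleave the raw datasets first, then apply the processing once.
combined = interleave_datasets([raw_a, raw_b], probabilities=[0.5, 0.5], seed=4242)
combined = combined.map(preprocess)
```
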
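A minimal sketch of fix (2), assuming a Hugging Face Whisper checkpoint (the model ID here is just a placeholder):

```python
from transformers import WhisperConfig, WhisperTokenizer

model_id = "openai/whisper-small"  # placeholder model ID
config = WhisperConfig.from_pretrained(model_id)
tokenizer = WhisperTokenizer.from_pretrained(model_id)

# Bound the tokenizer's model_max_length by the model's own max_length
# (typically 448 for Whisper) instead of a hard-coded 512. This bound only
# affects the transcriptions (i.e., the labels).
tokenizer.model_max_length = min(tokenizer.model_max_length, config.max_length)
```
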
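A rough sketch of fix (3), shown here as a stripped-down label-padding collator (the real data collator presumably also pads the audio features; this is not its actual code):

```python
from transformers import WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-small")  # placeholder


def collate_labels(features: list[dict]) -> dict:
    """Pad the tokenised transcriptions in a batch."""
    label_features = [{"input_ids": feature["labels"]} for feature in features]
    # Previously we forced padding="max_length" in multi-GPU setups:
    # batch = processor.tokenizer.pad(
    #     label_features, padding="max_length", return_tensors="pt"
    # )
    # TODO: This might change in the future.
    return processor.tokenizer.pad(label_features, padding=True, return_tensors="pt")
```
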
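A minimal sketch of fix (4), assuming a recent transformers version where dispatch_batches is passed via accelerator_config (the output directory is a placeholder):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="outputs",  # placeholder
    # Don't let Accelerate split and dispatch batches across devices; the
    # per-device splitting is already handled by our own data handling.
    accelerator_config={"dispatch_batches": False},
)
```
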
@saattrupdan self-assigned this Oct 23, 2024
@saattrupdan changed the title from Fix/whisper finetuning to Finetuning bugs on Oct 23, 2024
@saattrupdan merged commit 559354b into main on Oct 23, 2024
4 checks passed
@saattrupdan deleted the fix/whisper-finetuning branch on October 23, 2024 at 15:30