Questions on training details and enhancement process #19
Comments
Good question! I would also like to know this.
Hi @TianyuCao, thank you for your interest! The training was interrupted using the early-stopping criterion in the code; we ran the experiments with the current version of the code. I recall early stopping was usually triggered after 200-300 epochs. The batch size was 8 per GPU on 4 GPUs, i.e. an effective batch size of 32. Regarding your reported results: could you please share the figures you obtained, the parameters used for inference, and which results in the paper you compare against (i.e. which line in which table)? Thank you.
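For context, a minimal sketch of how such an early-stopping / multi-GPU setup can look, assuming a PyTorch Lightning training loop; the monitored metric name, patience value, and trainer flags are illustrative assumptions, not taken verbatim from this repository:

```python
# Sketch only: per-GPU batch size 8 on 4 GPUs -> effective batch size 32 per step.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping

early_stopping = EarlyStopping(
    monitor="valid_loss",  # assumed name of the validation metric
    mode="min",
    patience=50,           # stop after 50 epochs without improvement
)

trainer = pl.Trainer(
    max_epochs=1000,       # upper bound; early stopping usually triggers much earlier
    accelerator="gpu",
    devices=4,             # 4 GPUs, each with a local batch of 8
    strategy="ddp",
    callbacks=[early_stopping],
)
# trainer.fit(model, datamodule=dm)  # model and datamodule defined elsewhere
```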
I am unable to have a general research discussion at the moment, but I invite you to ask any question related to this repository (i.e. code issues, unexpected behaviour) by raising an issue.
Hi,
Thanks for your great work. I have several questions and hope you can clarify them.
For the StoRM model in the paper, what batch size was used (I saw 8 by default in the code)? And for how many epochs was it trained? I also saw the early-stopping setting in the code, but I wonder whether the model was trained until the maximum of 1000 epochs or stopped by early stopping after a patience of 50.
Besides, I saw that "For training, sequences of 256 STFT frames (≈2 s) are randomly extracted from the full-length utterances". In that case, at enhancement time, does the model segment the whole input into several ~2 s chunks, enhance each chunk, and concatenate them into the output? Or is the whole utterance enhanced at once?
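To illustrate the training-time random extraction described in the quoted sentence, here is a minimal sketch; the tensor layout (frequency bins × frames) and the zero-padding of short utterances are assumptions, not taken from the repository:

```python
import torch

def random_crop_frames(spec: torch.Tensor, num_frames: int = 256) -> torch.Tensor:
    """Randomly extract `num_frames` consecutive STFT frames from `spec`
    of shape (freq_bins, total_frames); zero-pad if the utterance is shorter."""
    total = spec.shape[-1]
    if total <= num_frames:
        # assumed behaviour: pad short utterances up to the segment length
        return torch.nn.functional.pad(spec, (0, num_frames - total))
    start = torch.randint(0, total - num_frames + 1, (1,)).item()
    return spec[..., start:start + num_frames]
```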
Also, I just generated the data based on your code and used the WSJ0+CHiME3 checkpoint to denoise it. However, the pretrained checkpoint gives worse results than those reported in the paper. I wonder whether the default parameters in the code are exactly those used to obtain the paper's results, both for data generation (create_data.py) and for model training.
Sorry for so many questions. Thanks for your clarifications in advance.