I'm relatively new to t2t and was exploring it for ASR when I came across your work.
Amazing work, @mohitsshah, with a clear explanation over at at16k. The results are pretty impressive.
I'm planning to extend the model for a domain-specific use case, which includes extending the vocabulary.
I'd appreciate your help with the following.
As far as I could drill down, the problem is registered through the class At16kSubword, which extends the asr class, much like a class such as Librispeech inherits from SpeechRecognitionProblem.
The At16kSubword class sets the multiprocess_generate property to True, which means the data is generated across multiple processes. What configuration did you use for this, and given the hours of data, how long did generation take?
Also, the core generate_data and generator functions aren't defined. What data did you build with? Did you use LibriSpeech and add your own data on top?
I'll need those two function definitions to stay in sync with the additional data I'll be fine-tuning on. Could you provide them?
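To make the question concrete, here is a rough sketch of what I assume the problem class looks like. The class name, default values, and field names are my guesses (loosely following the LibriSpeech problem's conventions), not the actual at16k source; a stub base class stands in for tensor2tensor's SpeechRecognitionProblem so the sketch is self-contained:

```python
class SpeechRecognitionProblem:
    """Stand-in for tensor2tensor.data_generators.speech_recognition
    .SpeechRecognitionProblem, so this sketch runs standalone."""


class At16kSubwordSketch(SpeechRecognitionProblem):
    """My guess at the shape of At16kSubword -- not the actual source."""

    @property
    def multiprocess_generate(self):
        # True tells t2t-datagen to fan data generation out over several
        # worker processes (I believe via --num_concurrent_processes).
        return True

    @property
    def approx_vocab_size(self):
        # Target size for the subword vocab built during data generation.
        return 1000

    def generator(self, data_dir, tmp_dir, datasets, how_many=0):
        # The missing piece I'm asking about: yield one example dict per
        # (audio, transcript) pair in `datasets`. Field names here follow
        # the LibriSpeech problem's convention; the real implementation
        # would read and encode the waveform.
        for audio_path, transcript in datasets:
            yield {"waveforms": audio_path, "targets": transcript}
```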
Is approx_vocab_size defined as only 1000?
If the goal is to extend the vocab, and we reuse the existing vocab that feature_encoders() loads, will the new subwords be added to it, or will a new vocab containing the additional subwords be created? As far as I know, the vocab is generated during the data-gen phase.
How much should approx_vocab_size be increased?
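Part of why I'm asking: my understanding is that a subword vocab is built greedily from corpus frequencies, so rebuilding it over the old corpus plus new domain text generally reassigns token IDs rather than appending new subwords at the end. A toy illustration (this is not the T2T implementation; `toy_vocab` ranks character bigrams as a crude stand-in for the real SubwordTextEncoder construction):

```python
from collections import Counter

def toy_vocab(corpus, size):
    """Rank character bigrams by frequency -- a crude stand-in for how a
    subword vocab is derived from corpus statistics."""
    counts = Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))
    return [tok for tok, _ in counts.most_common(size)]

old = toy_vocab("the cat sat on the mat", 5)
new = toy_vocab("the cat sat on the mat radiology report findings", 5)
# The two rankings differ, so an ID that meant one subword before
# fine-tuning can mean a different one afterwards unless the original
# vocab file is reused.
```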
Could you provide the data-generation command you used, including the additional FLAGS?
Could you also provide the training command with its additional FLAGS?
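For concreteness, this is the kind of command pair I mean. The problem name (assuming At16kSubword registers under t2t's default snake_case name), hparams set, and flag values are my guesses based on standard tensor2tensor usage, not your actual setup:

```shell
# Data generation -- problems with multiprocess_generate=True also honor
# --num_concurrent_processes (hypothetical values throughout):
t2t-datagen \
  --problem=at16k_subword \
  --data_dir="$HOME/t2t_data" \
  --tmp_dir="$HOME/t2t_tmp"

# Training / fine-tuning -- pointing --output_dir at the released
# checkpoint directory would continue from the provided weights:
t2t-trainer \
  --problem=at16k_subword \
  --data_dir="$HOME/t2t_data" \
  --model=transformer \
  --hparams_set=transformer_librispeech \
  --output_dir="$HOME/t2t_train/at16k" \
  --train_steps=100000
```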
I have a similar question for @mohitsshah:
I am trying to fine-tune your model on a new dataset, for example LibriSpeech. However, when I generate the LibriSpeech data and continue training with your provided weights, the results are completely wrong and don't make sense. I am using the following script to continue training the model: