Generating vmap for en->many model #5
Comments
Hi, it will work similarly, but the list of candidates for a given source sentence will include tokens/words from multiple languages.
Hi, enabling vmap for one-to-many directions in CTranslate2 leads to a BLEU score drop of 2-3 points per language. When I looked inside the generated vmap, the source tokens next to the supervision language tag capture more meaning in the corresponding language, thanks to the tag. Source tokens far from the language tag, however, either capture meaning from only a few languages or seem to have insufficient coverage because there are so many languages. Will increasing the keep meaning (-km) parameter help, or is there a better way to do this? Can you please suggest one?
Indeed, the current approach may not work well for one-to-many data. I can't think of a parameter that can fully resolve your issue. It looks like a solution would be to have one vmap per target language. The inference code could then select the appropriate vmap based on the language token.
Thanks, @guillaumekln. Do we have such support in CTranslate2?
No, this logic is not implemented. It is only an idea.
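For illustration, the "one vmap per target language" idea could be sketched roughly as below. This is not CTranslate2 code and uses none of its API. The `<2fr>`/`<2de>` tag format, the in-memory dictionary layout, and the `select_vmap` and `candidate_vocabulary` helpers are all assumptions made up for this example, and it only handles single-token (unigram) entries.

```python
from typing import Dict, List, Set

# Hypothetical in-memory vmap: source token -> candidate target tokens.
# A real vmap may also key on multi-token n-grams; this sketch does not.
Vmap = Dict[str, Set[str]]


def select_vmap(source_tokens: List[str], vmaps_by_lang: Dict[str, Vmap]) -> Vmap:
    """Return the vmap keyed by the first language tag found in the source."""
    for token in source_tokens:
        if token in vmaps_by_lang:
            return vmaps_by_lang[token]
    raise KeyError("no known language tag found in the source tokens")


def candidate_vocabulary(
    source_tokens: List[str], vmap: Vmap, always_keep: Set[str]
) -> Set[str]:
    """Union of the candidates for each source token, plus tokens always kept."""
    candidates = set(always_keep)
    for token in source_tokens:
        candidates |= vmap.get(token, set())
    return candidates


# Toy data: one vmap per target language tag (assumed tag format).
vmaps_by_lang = {
    "<2fr>": {"hello": {"bonjour", "salut"}, "world": {"monde"}},
    "<2de>": {"hello": {"hallo"}, "world": {"Welt"}},
}

source = ["<2de>", "hello", "world"]
vmap = select_vmap(source, vmaps_by_lang)
vocab = candidate_vocabulary(source, vmap, always_keep={"</s>"})
print(sorted(vocab))
```

With this layout the candidate list for a German request contains only German tokens, so candidates from other languages no longer compete for coverage.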
Hi,
vmap is useful for reducing inference time significantly. I was able to generate a vmap for a many-to-one model and it works fine. How does vmap work for one-to-many models?