Generating vmap for en->many model #5

santha96 · 2022-04-27T10:27:38Z

Hi,
A vmap is useful for reducing inference time significantly. I was able to generate a vmap for a many-to-one model and it works fine. How does a vmap work for one-to-many models?

guillaumekln · 2022-04-28T09:38:12Z

Hi,

It will work similarly, but the list of candidates for a given source sentence will include tokens/words from multiple languages.
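To illustrate the point about mixed candidates, here is a minimal sketch in Python. It assumes a vmap can be viewed as a mapping from a source token to a list of target candidates. The toy entries, the `<2fr>` tag format, and the `candidate_set` helper are all hypothetical and are not taken from build-vmap.py or CTranslate2.

```python
from typing import Dict, List, Set

# Hypothetical toy vmap built from one-to-many data: each English source
# token maps to candidates from every target language seen in training.
TOY_VMAP: Dict[str, List[str]] = {
    "house": ["maison", "Haus", "casa"],
    "red": ["rouge", "rot", "rojo"],
}


def candidate_set(source_tokens: List[str], vmap: Dict[str, List[str]]) -> Set[str]:
    """Return the union of target candidates for every source token in the vmap."""
    candidates: Set[str] = set()
    for token in source_tokens:
        # Tokens without an entry (here, the language tag) contribute nothing.
        candidates.update(vmap.get(token, []))
    return candidates


# The sentence asks for French, but the candidates still cover three languages.
mixed = candidate_set(["<2fr>", "red", "house"], TOY_VMAP)
```

In this toy case the restricted vocabulary for a French request still contains German and Spanish tokens, so the per-token candidate budget is shared across all target languages.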

santha96 · 2022-04-28T11:43:05Z

Hi,
I generated the vmap using the command below.
python build-vmap.py -pt phrase-table -ms 3 -mf 2 -km 20 -tv target_vocabulary -zg zg_list > vmap

Enabling the vmap for one-to-many directions in CTranslate2 leads to a BLEU score drop of 2-3 points per language. When I looked inside the generated vmap, the source tokens next to the supervision language tag capture more meanings in the corresponding language, thanks to the presence of the tag. However, source tokens that are far from the language tag either capture meanings from only a few languages or seem to have insufficient coverage because there are so many languages. Would increasing the keep meaning (-km) parameter help, or is there a better way to do this? Could you please suggest one?

guillaumekln · 2022-05-09T13:25:12Z

Indeed, the current approach may not work well for one-to-many data. I can't think of a parameter that can fully resolve your issue. It looks like a solution would be to have one vmap per target language? The inference code could then select the appropriate vmap based on the language token.
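The idea of one vmap per target language could be sketched roughly as follows. This is only an illustration of the selection logic, not an existing CTranslate2 feature. It assumes each vmap line has the form "source n-gram, TAB, space-separated target tokens" and that the language token looks like `<2fr>`; both are assumptions, and every name below (`parse_vmap`, `find_language`, `select_vmap`, `candidate_set`) is hypothetical.

```python
import re
from typing import Dict, Iterable, List, Optional, Set

Vmap = Dict[str, List[str]]

# Assumed language tag format, e.g. "<2fr>"; adapt to the tags your model uses.
LANG_TAG = re.compile(r"^<2([a-z]{2,3})>$")


def parse_vmap(lines: Iterable[str]) -> Vmap:
    """Parse lines assumed to look like 'source n-gram<TAB>tgt1 tgt2 ...'."""
    vmap: Vmap = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        source, _, targets = line.partition("\t")
        vmap[source] = targets.split()
    return vmap


def find_language(source_tokens: List[str]) -> Optional[str]:
    """Return the language code of the first language tag, if any."""
    for token in source_tokens:
        match = LANG_TAG.match(token)
        if match:
            return match.group(1)
    return None


def select_vmap(source_tokens: List[str], vmaps_by_lang: Dict[str, Vmap]) -> Optional[Vmap]:
    """Pick the vmap for the requested target language, or None if unknown."""
    lang = find_language(source_tokens)
    if lang is None:
        return None
    return vmaps_by_lang.get(lang)


def candidate_set(source_tokens: List[str], vmap: Vmap) -> Set[str]:
    """Union of target candidates for single source tokens (unigrams only)."""
    candidates: Set[str] = set()
    for token in source_tokens:
        candidates.update(vmap.get(token, []))
    return candidates


# Toy per-language vmaps; in practice each would be built from the phrase
# table of a single language pair.
vmaps_by_lang = {
    "fr": parse_vmap(["house\tmaison", "red\trouge"]),
    "de": parse_vmap(["house\tHaus", "red\trot"]),
}

tokens = ["<2fr>", "red", "house"]
french_vmap = select_vmap(tokens, vmaps_by_lang)
french_candidates = candidate_set(tokens, french_vmap) if french_vmap else set()
```

With a separate vmap per language, the candidates for a French request contain only French tokens. When no tag is found or the language has no vmap, `select_vmap` returns `None`, and a caller would presumably fall back to decoding over the full vocabulary.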

santha96 · 2022-05-09T13:45:40Z

Thanks, @guillaumekln. Do we have such support in CTranslate2?

guillaumekln · 2022-05-09T13:46:56Z

No, this logic is not implemented. It is only an idea.
