Different translation results after converting to CTranslate2 #1781

Open
hieunguyenquoc opened this issue Sep 17, 2024 · 2 comments

hieunguyenquoc commented Sep 17, 2024

Hi. I have fine-tuned the Helsinki-NLP/opus-mt-zh-vi model for translating Chinese to Vietnamese. When I convert the model to CTranslate2, the performance decreases (from 32 SacreBLEU with Transformers inference to just 28 SacreBLEU with CTranslate2 inference). Can anyone explain why? Thank you. Here is my code:

Conversion command: ct2-transformers-converter --model /home/hieunq/Documents/VTP/chinsese_translation/train_chinese_vietnamsese_translation/finetune_helsinky_zh_vi/model/checkpoint-36625 --output_dir zh-vi-ct2 --force --copy_files generation_config.json tokenizer_config.json vocab.json source.spm target.spm

Inference code:

import ctranslate2
import transformers
import time
import torch
import evaluate

metric = evaluate.load("sacrebleu")

device = "cuda" if torch.cuda.is_available() else "cpu"
translator = ctranslate2.Translator("/home/hieunq/Documents/VTP/chinsese_translation/train_chinese_vietnamsese_translation/finetune_helsinky_zh_vi/zh-vi-ct2", device=device, compute_type="auto")
tokenizer = transformers.AutoTokenizer.from_pretrained("/home/hieunq/Documents/VTP/chinsese_translation/train_chinese_vietnamsese_translation/finetune_helsinky_zh_vi/zh-vi-ct2")

# Source (Chinese) test sentences and Vietnamese references
f_zh = open("/home/hieunq/Documents/VTP/chinsese_translation/data_version_1_and_2/zh/test_zh/test_zh_version_2_data_Thời_trang_nữ.txt", "r", encoding="utf-8")
f_vi = open("/home/hieunq/Documents/VTP/chinsese_translation/data_version_1_and_2/vi/test_vi/test_vi_version_2_data_Thời_trang_nữ.txt", "r", encoding="utf-8")
texts = f_zh.readlines()

translated_texts = []
start = time.time()

# CTranslate2 expects token strings, so convert the tokenizer ids back to tokens
batch_source_tokens = [tokenizer.convert_ids_to_tokens(tokenizer.encode(sentence)) for sentence in texts]

batch_size = 10
results = translator.translate_batch(batch_source_tokens, max_batch_size=batch_size, beam_size=4)

for i, result in enumerate(results):
    target = result.hypotheses[0]  # take the best hypothesis
    translated_sentence = tokenizer.decode(tokenizer.convert_tokens_to_ids(target))
    translated_texts.append(translated_sentence)

references = f_vi.readlines()

predictions_texts = [pred.strip() for pred in translated_texts]
references_text = [ref.strip() for ref in references]

result = metric.compute(predictions=predictions_texts, references=references_text)
print(result["score"])
print("Time :", time.time() - start)

@minhthuc2502
Collaborator

Different frameworks may have slightly varied implementations of backend operations, so small differences in scores are expected. You might also want to test with CTranslate2 3.x to see if it brings any improvements.
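
A quick way to confirm which CTranslate2 version is in use and to see where the two backends start to diverge is to compare a single sentence side by side. This is only a sketch, reusing the tokenizer, translator and texts from the script above:

# Sketch: print the installed CTranslate2 version and one CT2 hypothesis
# so it can be compared against the Transformers output for the same sentence.
import ctranslate2

print("CTranslate2 version:", ctranslate2.__version__)

sentence = texts[0].strip()
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(sentence))
ct2_hyp = translator.translate_batch([tokens], beam_size=4)[0].hypotheses[0]
print("CT2:", tokenizer.decode(tokenizer.convert_tokens_to_ids(ct2_hyp)))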

@hieunguyenquoc
Author

@minhthuc2502 Hi. Thanks for your response. I have tried your suggestion, but it still gives the same result. Is there any way I can preserve the quality of the CTranslate2 model?
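
Two settings often influence this kind of gap; the following is only a sketch, not a confirmed fix: force full precision at load time instead of compute_type="auto", and match the decoding options from generation_config.json when calling translate_batch (the length_penalty value below is a placeholder):

# Sketch: rule out reduced-precision kernels and mismatched decoding settings.
import ctranslate2

translator = ctranslate2.Translator(
    "/home/hieunq/Documents/VTP/chinsese_translation/train_chinese_vietnamsese_translation/finetune_helsinky_zh_vi/zh-vi-ct2",
    device=device,
    compute_type="float32",  # "auto" may select a faster, lower-precision type on GPU
)

results = translator.translate_batch(
    batch_source_tokens,
    max_batch_size=batch_size,
    beam_size=4,
    length_penalty=1.0,  # placeholder: use the value from generation_config.json if it differs
)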
