[QUESTION] About COMET Train and Test Data #241

Open
moore3930 opened this issue Jan 6, 2025 · 0 comments
Labels
question Further information is requested

moore3930 commented Jan 6, 2025

Hi, I have some questions about the dataset provided here: https://github.com/Unbabel/COMET/tree/master/data

  1. If I understand correctly, the training data for each WMT year is accumulated from all previous WMT years. Does this mean that, in the data here (https://github.com/Unbabel/COMET/tree/master/data), the 2021 DA data is a subset of the 2022 DA data?

  2. The training data in this config (https://github.com/Unbabel/COMET/blob/master/configs/models/referenceless_model.yaml) is set to data/1720-da.csv. Does this mean the file is simply a merge of all the DA data here (https://github.com/Unbabel/COMET/tree/master/data) from 2017 to 2020? Are there any duplication issues?

  3. Where can I find the test data if I want to formally test my metric on the WMT21 DA Task?
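
For the duplication concern in question 2, here is a minimal sketch of how one might check for repeated segments when concatenating per-year DA files. It is an illustration under assumptions, not the repository's own procedure: the column names (`lp`, `src`, `mt`, `ref`, `score`), the helper `count_duplicates`, and the toy rows are all hypothetical. Real files would be opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory samples.

```python
import csv
import io
from collections import Counter

# Assumed column names; adjust to match the actual CSV headers.
KEY_COLUMNS = ("lp", "src", "mt", "ref")


def count_duplicates(csv_files, key_columns=KEY_COLUMNS):
    """Return (total_rows, duplicate_rows) across several open CSV files.

    A row counts as a duplicate when an earlier row, in any file,
    has the same values in all of `key_columns`.
    """
    seen = Counter()
    total = 0
    for handle in csv_files:
        for row in csv.DictReader(handle):
            key = tuple(row[col].strip() for col in key_columns)
            seen[key] += 1
            total += 1
    duplicates = sum(n - 1 for n in seen.values() if n > 1)
    return total, duplicates


# Toy in-memory stand-ins for two per-year files (hypothetical content).
sample_a = io.StringIO(
    "lp,src,mt,ref,score\n"
    "en-de,Hello,Hallo,Hallo,0.5\n"
)
sample_b = io.StringIO(
    "lp,src,mt,ref,score\n"
    "en-de,Hello,Hallo,Hallo,0.4\n"
    "en-de,Goodbye,Auf Wiedersehen,Auf Wiedersehen,0.9\n"
)

total, duplicates = count_duplicates([sample_a, sample_b])
print(f"{total} rows, {duplicates} duplicated")
```

Keying on the text columns rather than on the score is a deliberate choice in this sketch: the same segment can carry different scores in different files, and it would still be the same training example.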
