Some questions about the language model #11

Open
hmwang97414 opened this issue Mar 5, 2022 · 1 comment

Comments

@hmwang97414

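Hello, I read the earlier issue, where you said that one training set is the 1k original data and the other is a mix of 1k*n + augment.
I have a question: generate.py specifies num_sentence for each generation run, so how should the ratio of augmented data to original data be determined? For example, if the original training set is 1k, how much augmented data should be mixed in for the ratio to be reasonable?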

@Bosheng2020

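Hello, and thank you for the question. Our paper discusses the oversample ratio, which you can refer to. You can also run experiments to find what works for your own setting.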
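
For anyone who wants to try a few settings, here is a minimal sketch of the 1k*n + augment mix discussed in this thread. It is not code from this repository: `mix_training_set`, `oversample_ratio`, and `num_augmented` are hypothetical names, and the sizes below are placeholders rather than recommended values.

```python
import random


def mix_training_set(original, augmented, oversample_ratio, num_augmented, seed=0):
    """Repeat the original data `oversample_ratio` times and add sampled augmented data.

    Hypothetical helper for illustration only; not part of the repository.
    """
    rng = random.Random(seed)
    # Never ask for more augmented sentences than were generated.
    num_augmented = min(num_augmented, len(augmented))
    sampled = rng.sample(augmented, num_augmented)
    # Oversample the gold data n times, then append the augmented sample.
    mixed = original * oversample_ratio + sampled
    rng.shuffle(mixed)
    return mixed


# Placeholder data standing in for the 1k gold set and the generated sentences.
original = [f"gold sentence {i}" for i in range(1000)]
augmented = [f"generated sentence {i}" for i in range(5000)]

# Sweep a few oversample ratios and report the share of augmented data.
for n in (1, 2, 4):
    mixed = mix_training_set(original, augmented, oversample_ratio=n, num_augmented=2000)
    share = 2000 / len(mixed)
    print(f"n={n}: {len(mixed)} sentences, augmented share {share:.2f}")
```

Each resulting training set can then be used to train the tagger, and the setting that scores best on a held-out development set is a reasonable choice for that dataset.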
