Source of the skip-gram data #3

Open
yan624 opened this issue May 28, 2019 · 0 comments

yan624 commented May 28, 2019

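Hello, I'd like to ask where the data in the skip gram project comes from. I looked through it and it contains not a single punctuation mark. It also has no paragraph breaks, so it looks like someone has already preprocessed it. Normally, data for training word vectors would be organized paragraph by paragraph, rather than handed over as a single file of 100 million words like this.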
