Preprocessing stage fails to generate dataset.pt correctly #104

Open
Ailurus9527 opened this issue Jan 2, 2025 · 0 comments
Comments

@Ailurus9527

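Hello! I followed the "Reproduce ET-BERT" section of the README and ran the following command:
python3 preprocess.py --corpus_path corpora/encrypted_traffic_burst.txt \
    --vocab_path models/encryptd_vocab.txt \
    --dataset_path dataset.pt --processes_num 8 --target bert
The file dataset-tmp-0.pt stays at 0 bytes, while dataset-tmp-1 through dataset-tmp-7 are generated normally, and the terminal keeps showing the following output:
[screenshot: example]
Watching the process in the background, I can see memory (120 GB) usage climb and then drop, finally settling at a few GB. After running for 24 hours, dataset.pt has still not been produced and dataset-tmp-0.pt is still empty.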
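For anyone trying to confirm the same symptom, the check I did by hand can be sketched as a small script. This is only a rough sketch: it assumes the temporary files are named dataset-tmp-<i>.pt and sit in the current working directory, as they did in my run with --processes_num 8. The helper name find_empty_shards is my own and is not part of the ET-BERT code.

```python
import os


def find_empty_shards(directory=".", processes_num=8, prefix="dataset-tmp-"):
    """Return (missing, empty) lists of shard indices.

    Hypothetical helper: the file naming pattern is an assumption based on
    the files observed in this run, not on the preprocessing source code.
    """
    missing, empty = [], []
    for i in range(processes_num):
        path = os.path.join(directory, f"{prefix}{i}.pt")
        if not os.path.exists(path):
            # The worker never created its temporary file.
            missing.append(i)
        elif os.path.getsize(path) == 0:
            # The file exists but nothing has been written to it yet.
            empty.append(i)
    return missing, empty


if __name__ == "__main__":
    missing, empty = find_empty_shards()
    print("missing shards:", missing)
    print("empty shards:", empty)
```

Running it repeatedly while preprocess.py is active shows whether shard 0 ever starts growing or stays at 0 bytes for the whole run.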
How can this problem be resolved? Thank you for your reply.
