How can I obtain the training data? #26

Open
APiaoG opened this issue Feb 27, 2024 · 1 comment

Comments

APiaoG commented Feb 27, 2024

Hello! Thank you very much for your outstanding open-source work! Could the following training data be open-sourced?
data_dir:
- dataset/seed_v2_0828/caption/unsplash_cc3m
- dataset/seed_v2_0828/caption/coco
data_dir: /dataset/seed_v2_0828/caption/laion-coco
data_dir: dataset/seed_v2_0828/image_interleaved/mmc4
data_dir: dataset/seed_v2_0828/image_interleaved/obelisc
data_dir: dataset/seed_v2_0828/caption/WebVid-10m
data_dir: dataset/wikipedia_20220301.en
Alternatively, could you provide the datasets as they were before preprocessing with src/tools/extract_image_ids_to_torchdata_parallel.py? Thank you very much!

@geyuying
Collaborator

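We do not own the copyright to these datasets, so we cannot provide the downloaded copies. You can download these public datasets from their respective official websites.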
