[Question] Running the quick-start example python scripts/train.py -c examples/bert_crf/configs/resume.yaml fails with "An error occurred while generating the dataset" #47

Open
Gsq6161 opened this issue Sep 27, 2024 · 0 comments
Labels
question Further information is requested

Gsq6161 commented Sep 27, 2024

What is your question?

I'm a beginner who is just starting to learn. I deployed AdaSeq locally and followed the steps in the repository, but the run fails at this code in datasets/builder.py:

    except Exception as e:
        # Ignore the writer's error for no examples written to the file if this error was caused by the error in _generate_examples before the first example was yielded
        if isinstance(e, SchemaInferenceError) and e.__context__ is not None:
            e = e.__context__
        raise DatasetGenerationError("An error occurred while generating the dataset") from e

How can I fix this?
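For context on the snippet above: `raise ... from e` chains exceptions, so the DatasetGenerationError is only a wrapper and the underlying failure is stored in its `__cause__`. Below is a minimal sketch of that mechanism. The two exception classes are local stand-ins rather than the real `datasets` classes, and `prepare_split` and `root_cause` are hypothetical helper names used only for illustration.

```python
import os
import tempfile


class SchemaInferenceError(ValueError):
    """Local stand-in for the datasets exception of the same name."""


class DatasetGenerationError(Exception):
    """Local stand-in for the datasets exception of the same name."""


def prepare_split():
    """Mimic the wrapping pattern: fail to open a file, then re-raise."""
    # The parent directory "missing" is never created, so open() fails.
    target = os.path.join(tempfile.mkdtemp(), "missing", "out.arrow")
    try:
        with open(target, "wb"):
            pass
    except Exception as e:
        # Same shape as the snippet quoted in the question.
        if isinstance(e, SchemaInferenceError) and e.__context__ is not None:
            e = e.__context__
        raise DatasetGenerationError(
            "An error occurred while generating the dataset"
        ) from e


def root_cause(exc):
    """Follow __cause__ links until the innermost exception is reached."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


try:
    prepare_split()
except DatasetGenerationError as err:
    cause = root_cause(err)
    print(type(cause).__name__, cause)
```

Read this way, the line to investigate is the first exception in the traceback, not the final DatasetGenerationError.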

What have you tried?

Downgrading the torch and datasets versions did not help.

Code (if necessary)

(adaseq) PS C:\Users\Acer\Desktop\AdaSeq-master> python scripts/train.py -c examples/bert_crf/configs/resume.yaml
2024-09-27 21:32:46,554 - modelscope - WARNING - The reference has been Deprecated in modelscope v1.4.0+, please use from modelscope.msdatasets.dataset_cls.custom_datasets import TorchCustomDataset
2024-09-27 21:32:47,201 - INFO - adaseq.data.dataset_manager - Will use a custom loading script: E:\Anaconda\envs\adaseq\lib\site-packages\adaseq\data\dataset_builders\named_entity_recognition_dataset_builder.py
Downloading data: 135kB [00:00, 2.86MB/s]
Downloading data: 1.09MB [00:00, 10.4MB/s]
Downloading data: 120kB [00:00, 2.56MB/s]
Generating test split: 0 examples [00:00, ? examples/s]
Traceback (most recent call last):
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\builder.py", line 1739, in _prepare_split_single
    writer = writer_class(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\arrow_writer.py", line 338, in __init__
    self.stream = self._fs.open(path, "wb")
  File "E:\Anaconda\envs\adaseq\lib\site-packages\fsspec\spec.py", line 1303, in open
    f = self._open(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\fsspec\implementations\local.py", line 191, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "E:\Anaconda\envs\adaseq\lib\site-packages\fsspec\implementations\local.py", line 355, in __init__
    self._open()
  File "E:\Anaconda\envs\adaseq\lib\site-packages\fsspec\implementations\local.py", line 360, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/Acer/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-84b1c02799fb57ba/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51.incomplete/named_entity_recognition_dataset_builder-test-00000-00000-of-NNNNN.arrow'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Acer\Desktop\AdaSeq-master\scripts\train.py", line 39, in <module>
    train_model_from_args(args)
  File "E:\Anaconda\envs\adaseq\lib\site-packages\adaseq\commands\train.py", line 84, in train_model_from_args
    train_model(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\adaseq\commands\train.py", line 156, in train_model
    trainer = build_trainer_from_partial_objects(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\adaseq\commands\train.py", line 185, in build_trainer_from_partial_objects
    dm = DatasetManager.from_config(task=config.task, **config.dataset)
  File "E:\Anaconda\envs\adaseq\lib\site-packages\adaseq\data\dataset_manager.py", line 182, in from_config
    hfdataset = hf_load_dataset(path, name=name, **kwargs)
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\load.py", line 2628, in load_dataset
    builder_instance.download_and_prepare(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\builder.py", line 1029, in download_and_prepare
    self._download_and_prepare(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\builder.py", line 1791, in _download_and_prepare
    super()._download_and_prepare(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\builder.py", line 1124, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\builder.py", line 1629, in _prepare_split
    for job_id, done, content in self._prepare_split_single(
  File "E:\Anaconda\envs\adaseq\lib\site-packages\datasets\builder.py", line 1786, in _prepare_split_single
    raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
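
The path in the FileNotFoundError is long, so Windows' traditional 260-character MAX_PATH limit may be worth ruling out. That this limit is the actual cause here is an assumption and has not been confirmed. A small sketch to measure the path from the traceback follows; `exceeds_max_path` is a hypothetical helper name, and the path is copied from the error message with the terminal line wrap removed.

```python
# Traditional Windows limit; it includes the terminating NUL character.
WINDOWS_MAX_PATH = 260

# Path copied from the FileNotFoundError above (line wrap removed).
FAILING_PATH = (
    "C:/Users/Acer/.cache/huggingface/datasets/"
    "named_entity_recognition_dataset_builder/default-84b1c02799fb57ba/0.0.0/"
    "db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51.incomplete/"
    "named_entity_recognition_dataset_builder-test-00000-00000-of-NNNNN.arrow"
)


def exceeds_max_path(path: str, limit: int = WINDOWS_MAX_PATH) -> bool:
    """Return True if the path plus its terminating NUL does not fit in `limit`."""
    return len(path) + 1 > limit


print("path length:", len(FAILING_PATH))
print("exceeds MAX_PATH:", exceeds_max_path(FAILING_PATH))
```

If the length does turn out to be the problem, two things that might be worth trying are enabling long-path support in Windows, or pointing the `HF_DATASETS_CACHE` environment variable at a shorter directory before running the training script. Neither has been tested against this setup.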

What's your environment?

  • AdaSeq Version (e.g., 1.0 or master): 0.6.6
  • ModelScope Version (e.g., 1.0 or master): 1.18.1
  • PyTorch Version (e.g., 1.12.1): tried both 1.12.1 and 1.9.0
  • OS (e.g., Ubuntu 20.04): Windows 10
  • Python version: 3.9
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

Code of Conduct

  • I agree to follow this project's Code of Conduct
Gsq6161 added the question label Sep 27, 2024
Gsq6161 changed the title from "[Question] Running python scripts/train.py -c examples/bert_crf/configs/resume.yaml fails with An error occurred while generating the dataset" to "[Question] Running the quick-start example python scripts/train.py -c examples/bert_crf/configs/resume.yaml fails with An error occurred while generating the dataset" Sep 27, 2024