ValueError: mmap length is greater than file size (trying to train nnUNet using 3200 brain MRIs) #1102
ArmanAvesta started this conversation in General
Replies: 1 comment
-
Hi, it seems like something went wrong while unpacking the data (did you start multiple folds at once without waiting for the unpacking to finish? Or was a previous training aborted during the unpacking stage?). You can fix this by deleting all the .npy files (not the .npz!) in your preprocessing output directory, then try again.
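For reference, the suggested fix can be scripted. Below is a minimal sketch (not part of the original answer) that assumes the preprocessed data lives in the folder shown in the training log; it flags any .npy file whose header does not match its on-disk size (the same condition that triggers the mmap error) and then deletes every .npy so nnU-Net re-unpacks them from the .npz archives on the next run.

from pathlib import Path
import numpy as np

# Path taken from the log below; adjust to your own nnUNet_preprocessed setup.
preprocessed_dir = Path(
    "/home/arman_avesta/nnUNet/data/nnUNet_preprocessed/"
    "Task500_rhipp/nnUNetData_plans_v2.1"
)

for npy_file in sorted(preprocessed_dir.rglob("*.npy")):
    try:
        # Memory-mapping verifies that the file size matches the shape/dtype
        # declared in the .npy header; a truncated file raises the same
        # "mmap length is greater than file size" ValueError seen in training.
        np.lib.format.open_memmap(str(npy_file), mode="r")
    except (ValueError, OSError) as err:
        print(f"corrupt: {npy_file} ({err})")
    # Deleting every .npy (never the .npz!) is safe: nnU-Net simply unpacks
    # them again from the .npz files at the start of the next training run.
    npy_file.unlink()

print("done - restart nnUNet_train and let the unpacking finish")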
-
Hi!
I want to train nnU-Net to segment the right hippocampus on T1-weighted images, using 3200 brain MRIs. I have finished "nnUNet_plan_and_preprocess -t 500 --verify_dataset_integrity" and am now trying to train the 3D nnU-Net with "nnUNet_train 3d_fullres nnUNetTrainerV2 Task500_rhipp 0 --npz". When I run it, I get this error:
Please cite the following paper when using nnUNet:
Isensee, F., Jaeger, P.F., Kohl, S.A.A. et al. "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation." Nat Methods (2020). https://doi.org/10.1038/s41592-020-01008-z
If you have questions or suggestions, feel free to open an issue at https://github.com/MIC-DKFZ/nnUNet
###############################################
I am running the following nnUNet: 3d_fullres
My trainer class is: <class 'nnunet.training.network_training.nnUNetTrainerV2.nnUNetTrainerV2'>
For that I will be using the following configuration:
num_classes: 1
modalities: {0: 'T1'}
use_mask_for_norm OrderedDict([(0, False)])
keep_only_largest_region None
min_region_size_per_class None
min_size_per_class None
normalization_schemes OrderedDict([(0, 'nonCT')])
stages...
stage: 0
{'batch_size': 2, 'num_pool_per_axis': [5, 5, 5], 'patch_size': array([128, 128, 128]), 'median_patient_size_in_voxels': array([139, 164, 135]), 'current_spacing': array([1., 1., 1.]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}
I am using stage 0 from these plans
I am using sample dice + CE loss
I am using data from this folder: /home/arman_avesta/nnUNet/data/nnUNet_preprocessed/Task500_rhipp/nnUNetData_plans_v2.1
###############################################
loading dataset
2022-07-06 15:59:26.557215: Using splits from existing split file: /home/arman_avesta/nnUNet/data/nnUNet_preprocessed/Task500_rhipp/splits_final.pkl
2022-07-06 15:59:26.557600: The split file contains 5 splits.
2022-07-06 15:59:26.557669: Desired fold for training: 0
2022-07-06 15:59:26.557711: This split has 2559 training and 640 validation cases.
unpacking dataset
done
2022-07-06 15:59:29.775778: lr: 0.01
Exception in background worker 5:
mmap length is greater than file size
Traceback (most recent call last):
File "/home/arman_avesta/.local/lib/python3.9/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 46, in producer
item = next(data_loader)
File "/home/arman_avesta/.local/lib/python3.9/site-packages/batchgenerators/dataloading/data_loader.py", line 126, in next
return self.generate_train_batch()
File "/home/arman_avesta/nnUNet/nnunet/training/dataloading/dataset_loading.py", line 245, in generate_train_batch
case_all_data = np.load(self._data[i]['data_file'][:-4] + ".npy", self.memmap_mode)
File "/home/arman_avesta/.local/lib/python3.9/site-packages/numpy/lib/npyio.py", line 428, in load
return format.open_memmap(file, mode=mmap_mode)
File "/home/arman_avesta/.local/lib/python3.9/site-packages/numpy/lib/format.py", line 886, in open_memmap
marray = numpy.memmap(filename, dtype=dtype, shape=shape, order=order,
File "/home/arman_avesta/.local/lib/python3.9/site-packages/numpy/core/memmap.py", line 267, in new
mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
ValueError: mmap length is greater than file size
using pin_memory on device 0
Traceback (most recent call last):
File "/home/arman_avesta/.local/bin/nnUNet_train", line 33, in
sys.exit(load_entry_point('nnunet', 'console_scripts', 'nnUNet_train')())
File "/home/arman_avesta/nnUNet/nnunet/run/run_training.py", line 179, in main
trainer.run_training()
File "/home/arman_avesta/nnUNet/nnunet/training/network_training/nnUNetTrainerV2.py", line 440, in run_training
ret = super().run_training()
File "/home/arman_avesta/nnUNet/nnunet/training/network_training/nnUNetTrainer.py", line 317, in run_training
super(nnUNetTrainer, self).run_training()
File "/home/arman_avesta/nnUNet/nnunet/training/network_training/network_trainer.py", line 418, in run_training
_ = self.tr_gen.next()
File "/home/arman_avesta/.local/lib/python3.9/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 182, in next
return self.__next__()
File "/home/arman_avesta/.local/lib/python3.9/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 206, in __next__
item = self.__get_next_item()
File "/home/arman_avesta/.local/lib/python3.9/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 190, in __get_next_item
raise RuntimeError("MultiThreadedAugmenter.abort_event was set, something went wrong. Maybe one of "
RuntimeError: MultiThreadedAugmenter.abort_event was set, something went wrong. Maybe one of your workers crashed. This is not the actual error message! Look further up your stdout to see what caused the error. Please also check whether your RAM was full
I'm running the training on a g4dn.xlarge AWS instance. Any help would be greatly appreciated!
Arman