CUDA error: device-side assert triggered after switching to a different dataset #53

Open
icarryyou opened this issue Nov 5, 2023 · 1 comment

@icarryyou

Hello author, when I train on my own dataset I keep getting the following error:
{0: 'O', 1: 'B-PRE', 2: 'I-PRE', 3: 'B-PAT', 4: 'I-PAT', 5: 'B-DES', 6: 'I-DES', 7: 'B-MED', 8: 'I-MED', 9: 'B-EFF', 10: 'I-EFF', 11: 'B-CAU', 12: 'I-CAU', 13: 'B-SYM', 14: 'I-SYM'}
Namespace(adam_epsilon=1e-08, bert_dir='./model_hub/chinese-bert-wwm-ext/', crf_lr=0.03, data_dir='./data/cner', data_name='cner', dropout=0.3, dropout_prob=0.3, eval_batch_size=32, gpu_ids='0', log_dir='./logs/', lr=3e-05, lstm_hidden=128, max_grad_norm=1, max_seq_len=300, model_name='bert_idcnn_crf', num_layers=1, num_tags=15, other_lr=0.0003, output_dir='./checkpoints/', seed=123, swa_start=3, train_batch_size=32, train_epochs=3, use_crf='True', use_idcnn='True', use_kd='False', use_lstm='False', use_tensorboard='True', warmup_proportion=0.1, weight_decay=0.01)
Use single gpu in: ['0']
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [1,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [3,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [4,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [9,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [10,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [18,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [25,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [26,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: block: [0,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
  File "E:\py_project\QA_sys\TCM_NER\pytorch_bert_bilstm_crf_ner-main\main.py", line 279, in <module>
    bertForNer.train()
  File "E:\py_project\QA_sys\TCM_NER\pytorch_bert_bilstm_crf_ner-main\main.py", line 59, in train
    loss, logits = self.model(batch_data['token_ids'], batch_data['attention_masks'], batch_data['token_type_ids'], batch_data['labels'])
  File "E:\Application\Anaconda\envs\bert\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\py_project\QA_sys\TCM_NER\pytorch_bert_bilstm_crf_ner-main\bert_ner_model.py", line 284, in forward
    loss = -self.crf(seq_out, labels, mask=attention_masks, reduction='mean')
  File "E:\Application\Anaconda\envs\bert\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\Application\Anaconda\envs\bert\lib\site-packages\torchcrf\__init__.py", line 102, in forward
    numerator = self._compute_score(emissions, tags, mask)
  File "E:\Application\Anaconda\envs\bert\lib\site-packages\torchcrf\__init__.py", line 196, in _compute_score
    score += emissions[i, torch.arange(batch_size), tags[i]] * mask[i]
RuntimeError: CUDA error: device-side assert triggered

I have already updated num_tags. My dataset uses BIO tagging, and I modified the process script under raw_data accordingly. How can I fix this?

@taishan1994
Owner

The labels do not match.
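
One plausible reading of the traceback: the failing line, `emissions[i, torch.arange(batch_size), tags[i]]`, indexes the emission tensor by tag id, so a tag id that falls outside the range implied by `num_tags=15` would trigger exactly this "index out of bounds" assertion. The sketch below is a minimal pre-training sanity check, not code from the repository. The helper names `find_unknown_tags` and `find_out_of_range_ids` and the sample sentences are hypothetical; only the label map is copied from the log above. Note that it flags every id outside 0..num_tags-1, which is slightly stricter than the CUDA assertion itself.

```python
from typing import Dict, List, Sequence, Tuple

# Label map copied from the log printed in the issue.
ID2LABEL: Dict[int, str] = {
    0: 'O', 1: 'B-PRE', 2: 'I-PRE', 3: 'B-PAT', 4: 'I-PAT',
    5: 'B-DES', 6: 'I-DES', 7: 'B-MED', 8: 'I-MED', 9: 'B-EFF',
    10: 'I-EFF', 11: 'B-CAU', 12: 'I-CAU', 13: 'B-SYM', 14: 'I-SYM',
}
LABEL2ID: Dict[str, int] = {label: idx for idx, label in ID2LABEL.items()}
NUM_TAGS = len(ID2LABEL)  # 15, matching num_tags in the Namespace above


def find_unknown_tags(
    tag_seqs: Sequence[Sequence[str]], label2id: Dict[str, int]
) -> List[Tuple[int, int, str]]:
    """Return (sentence_idx, token_idx, tag) for each tag missing from label2id."""
    return [
        (s, t, tag)
        for s, seq in enumerate(tag_seqs)
        for t, tag in enumerate(seq)
        if tag not in label2id
    ]


def find_out_of_range_ids(
    id_seqs: Sequence[Sequence[int]], num_tags: int
) -> List[Tuple[int, int, int]]:
    """Return (sentence_idx, token_idx, tag_id) for each id outside 0..num_tags-1."""
    return [
        (s, t, tag_id)
        for s, seq in enumerate(id_seqs)
        for t, tag_id in enumerate(seq)
        if not 0 <= tag_id < num_tags
    ]


# Hypothetical sample data: 'B-DIS' is not in the label map, and id 15 is too large.
sample_tags = [['B-SYM', 'I-SYM', 'O'], ['B-DIS', 'O']]
sample_ids = [[13, 14, 0], [15, 0]]

unknown = find_unknown_tags(sample_tags, LABEL2ID)
bad_ids = find_out_of_range_ids(sample_ids, NUM_TAGS)
print(unknown)
print(bad_ids)
```

Running a check like this over the converted training data before calling the model should point at the offending sentence and token. Because CUDA kernels run asynchronously, the Python line reported in the traceback is not always the one that failed; repeating a single batch on the CPU usually yields a more direct error message.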
