cublas runtime error #64

phymhan · 2020-05-18T04:24:34Z

First off, huge thanks to Andy for the PyTorch implementation! However, I ran into a cuBLAS error after a few iterations.

With PyTorch 1.5 and CUDA 10.2 on two RTX 8000 GPUs:
File "BigGAN-PyTorch/layers.py", line 40, in power_iteration
u = torch.matmul(v, W.t())
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

With PyTorch 1.0.1 and CUDA 10.0:
File "BigGAN-PyTorch/layers.py", line 48, in power_iteration
svs += [torch.squeeze(torch.matmul(torch.matmul(v, W.t()), u.t()))]
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1549636813070/work/aten/src/THC/THCBlas.cu:258
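
For context, both tracebacks point into the power iteration used for spectral normalization. Below is a minimal CPU-only sketch of that computation, assuming a single singular vector and no Gram-Schmidt step. The function name `top_singular_value` is hypothetical and this is not the repository's code. It only illustrates what the failing `matmul` calls compute, and it can serve as a sanity check that the math itself is fine outside CUDA.

```python
import torch
import torch.nn.functional as F


def top_singular_value(W, num_iters=50, eps=1e-12):
    """Estimate the largest singular value of a 2-D tensor W by power iteration.

    Hypothetical helper for illustration only; it follows the row-vector
    convention seen in the tracebacks (`torch.matmul(v, W.t())`).
    """
    # u is a row vector of shape (1, out_features).
    u = torch.randn(1, W.size(0), dtype=W.dtype, device=W.device)
    with torch.no_grad():
        for _ in range(num_iters):
            # Right singular vector estimate, shape (1, in_features).
            v = F.normalize(torch.matmul(u, W), eps=eps)
            # Left singular vector estimate, shape (1, out_features).
            u = F.normalize(torch.matmul(v, W.t()), eps=eps)
    # sigma = v W^T u^T, the same expression as in the second traceback.
    return torch.squeeze(torch.matmul(torch.matmul(v, W.t()), u.t()))


# Usage on CPU with a random weight matrix.
torch.manual_seed(0)
W = torch.randn(8, 5, dtype=torch.float64)
sigma = top_singular_value(W)
print(float(sigma))
```

One thing that may be worth keeping in mind: CUDA kernels are launched asynchronously, so the Python line shown in a traceback is not necessarily the operation that actually failed. Running with the environment variable `CUDA_LAUNCH_BLOCKING=1` makes errors surface at the call that triggered them.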

PS: The error disappears with some modifications, for example using a vanilla BCE loss instead of the hinge loss, or removing the linear layer.
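
To make the loss swap mentioned in the PS concrete, here is a framework-free sketch of the two discriminator losses in their standard textbook form. The function names are hypothetical, the inputs are plain lists of discriminator logits, and this is an assumption about what "vanilla BCE" and "hinge" mean here rather than a copy of the repository's loss code.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mean(xs):
    return sum(xs) / len(xs)


def d_loss_hinge(d_real, d_fake):
    # Hinge loss: penalize real logits below +1 and fake logits above -1.
    loss_real = mean([max(0.0, 1.0 - r) for r in d_real])
    loss_fake = mean([max(0.0, 1.0 + f) for f in d_fake])
    return loss_real + loss_fake


def d_loss_bce(d_real, d_fake):
    # Binary cross-entropy on logits: target 1 for real, target 0 for fake.
    loss_real = mean([softplus(-r) for r in d_real])
    loss_fake = mean([softplus(f) for f in d_fake])
    return loss_real + loss_fake


# Usage with made-up logits.
real_logits = [2.0, 0.0]
fake_logits = [-2.0, 0.0]
print(d_loss_hinge(real_logits, fake_logits))
print(d_loss_bce(real_logits, fake_logits))
```

The two losses differ in how they treat confident predictions: the hinge terms are exactly zero once a logit is past the margin, while the BCE terms stay positive and smooth for every logit, so the two produce different gradients into the discriminator.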

Any idea why this happens? Thanks a lot!
