cublas runtime error #64

Open
phymhan opened this issue May 18, 2020 · 0 comments


phymhan commented May 18, 2020

First off, huge thanks to Andy for the PyTorch implementation! However, I hit a cuBLAS error after a few iterations.

With PyTorch 1.5 and CUDA 10.2 on two RTX 8000 GPUs:
File "BigGAN-PyTorch/layers.py", line 40, in power_iteration
u = torch.matmul(v, W.t())
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

With PyTorch 1.0.1 and CUDA 10.0:
File "BigGAN-PyTorch/layers.py", line 48, in power_iteration
svs += [torch.squeeze(torch.matmul(torch.matmul(v, W.t()), u.t()))]
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1549636813070/work/aten/src/THC/THCBlas.cu:258
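
For context, both failing lines sit in the power-iteration step that estimates the largest singular value of a weight matrix W. The sketch below is a plain-Python illustration of that technique, written so it runs without a GPU. It is not the code from layers.py: the helper names, the iteration count, and the 2x2 example matrix are assumptions made for illustration, and only the row-vector products v W^T and (v W^T) u^T mirror the lines in the tracebacks.

```python
import math


def row_times_matrix(x, M):
    """Multiply a row vector x (length r) by a matrix M (r x c)."""
    return [sum(x[i] * M[i][j] for i in range(len(M))) for j in range(len(M[0]))]


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def normalize(x, eps=1e-12):
    """Scale x to unit Euclidean norm, guarding against division by zero."""
    norm = math.sqrt(sum(t * t for t in x))
    return [t / max(norm, eps) for t in x]


def power_iteration_sketch(W, u, num_iters=50):
    """Estimate the largest singular value of W (out x in) by power iteration.

    Hypothetical helper for illustration only; u is a row vector of length
    `out`, matching the row-vector convention seen in the tracebacks.
    """
    Wt = transpose(W)
    for _ in range(num_iters):
        v = normalize(row_times_matrix(u, W))   # v = u W
        u = normalize(row_times_matrix(v, Wt))  # u = v W^T
    # sigma = (v W^T) u^T, the quantity appended to `svs` in the traceback.
    sigma = sum(a * b for a, b in zip(row_times_matrix(v, Wt), u))
    return sigma, u, v


# Example: a diagonal matrix whose singular values are 3 and 1.
W = [[3.0, 0.0], [0.0, 1.0]]
sigma, u, v = power_iteration_sketch(W, u=[1.0, 1.0])
```

Running the same arithmetic on the CPU with the real weights is one way to check whether the matrix itself contains non-finite values before the GPU call.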

PS: The error disappears with some modifications, for example using vanilla BCE loss instead of hinge loss, or removing the linear layer.

Any idea why this happens? Thanks a lot!
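
One possible way to narrow this down: CUDA kernels are launched asynchronously, so the Python line named in a traceback is not necessarily where the fault originated. That may explain why the two setups report different lines of power_iteration. The minimal sketch below forces synchronous launches through the CUDA_LAUNCH_BLOCKING environment variable. It assumes the variable is set before CUDA is initialised, so it goes at the very top of the entry script before torch is imported. Where exactly it belongs in this repository is an assumption.

```python
import os

# Force synchronous CUDA kernel launches so that an error surfaces at the
# call that actually caused it. This must happen before CUDA is initialised,
# so place it before `import torch` in the entry script.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# The training code would be imported and run after this point.
launch_blocking = os.environ.get("CUDA_LAUNCH_BLOCKING")
```

Synchronous launches slow training down noticeably, so this is a debugging setting rather than something to leave enabled.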
