update dgl to cuda 12.4 pytorch 2.4.x got error "FileNotFoundError: Cannot find DGL C++ sparse library at /opt/conda/envs/torch124/lib/python3.11/site-packages/dgl/dgl_sparse/libdgl_sparse_pytorch_2.4.0.post301.so" #7791

Open
NicksonCheng opened this issue Sep 9, 2024 · 6 comments

Comments

@NicksonCheng

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Environment

  • DGL Version (e.g., 1.0):
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3):
  • OS (e.g., Linux):
  • How you installed DGL (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Additional context

After updating DGL for CUDA 12.4 and PyTorch 2.4.x, I got the error "FileNotFoundError: Cannot find DGL C++ sparse library at /opt/conda/envs/torch124/lib/python3.11/site-packages/dgl/dgl_sparse/libdgl_sparse_pytorch_2.4.0.post301.so"

@jansole

jansole commented Sep 17, 2024

Hey! I have the same problem. I'm working on a Windows machine, in Colab, with Python 3.10.

I had no real issues installing it:

Installing collected packages: torchdata, dgl
Successfully installed dgl-2.1.0+cu121 torchdata-0.8.0

But when I execute the code I get this:

FileNotFoundError Traceback (most recent call last)
in <cell line: 1>()
----> 1 import dgl
2 import networkx as nx
3 import matplotlib.pyplot as plt
4
5 dgl.backend = 'pytorch'

6 frames
/usr/local/lib/python3.10/dist-packages/dgl/graphbolt/__init__.py in load_graphbolt()
43 path = os.path.join(dirname, "graphbolt", basename)
44 if not os.path.exists(path):
---> 45 raise FileNotFoundError(
46 f"Cannot find DGL C++ graphbolt library at {path}"
47 )

FileNotFoundError: Cannot find DGL C++ graphbolt library at /usr/local/lib/python3.10/dist-packages/dgl/graphbolt/libgraphbolt_pytorch_2.4.0.so
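
The existence check that fails in the traceback can be repeated without importing dgl. What follows is a minimal sketch, not DGL's own code: `expected_library_path` is a hypothetical helper, and the file name pattern (`lib<name>_pytorch_<version>.so` inside a `<name>` subdirectory) is inferred only from the two error messages in this thread. Dropping a `+...` tag from the version is likewise an assumption.

```python
import os


def expected_library_path(package_dir, name, torch_version):
    """Build the path the loader appears to look for.

    Assumption: the pattern is inferred from the error messages in this
    thread, not taken from DGL's source.
    """
    basename = f"lib{name}_pytorch_{torch_version}.so"
    return os.path.join(package_dir, name, basename)


if __name__ == "__main__":
    import importlib.util

    # find_spec locates a top-level package without running its __init__.py,
    # so this works even when `import dgl` itself raises.
    dgl_spec = importlib.util.find_spec("dgl")
    torch_spec = importlib.util.find_spec("torch")
    if dgl_spec is None or torch_spec is None or not dgl_spec.submodule_search_locations:
        print("dgl and torch must both be installed to run this check")
    else:
        import torch

        package_dir = list(dgl_spec.submodule_search_locations)[0]
        # Assumption: a "+cu121"-style local tag is not part of the file name.
        version = str(torch.__version__).split("+", 1)[0]
        for name in ("graphbolt", "dgl_sparse"):
            path = expected_library_path(package_dir, name, version)
            print(path, "exists" if os.path.exists(path) else "MISSING")
```

Run inside the affected environment, this prints one line per library and shows which file the import would fail on.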

@kjczarne

I think something is wrong with the pip wheels provided for this project. I am getting the same error: libgraphbolt_pytorch_2.4.1.so is not found, and when I list that directory the library is indeed missing. The documentation is misleading because it suggests that you can install DGL with pip without any issues, and that is not the case.

I only got the Conda installation to work. Not being able to install this via vanilla pip is quite frustrating when you're working in a Docker container.

@kjczarne

In the Jenkinsfile the libraries that seem to be copied at build are defined as:

dgl_linux_libs = 'build/libdgl.so, build/runUnitTests, python/dgl/_ffi/_cy3/core.cpython-*-x86_64-linux-gnu.so, build/tensoradapter/pytorch/*.so, build/dgl_sparse/*.so, build/graphbolt/*.so'

I suspect, then, that something is broken with the graphbolt libs: perhaps only those for older versions of PyTorch are built when the workflow is triggered. Here are the ones I found in the installed dgl package:

libgraphbolt_pytorch_2.0.0.so
libgraphbolt_pytorch_2.0.1.so
libgraphbolt_pytorch_2.1.0.so
libgraphbolt_pytorch_2.1.1.so
libgraphbolt_pytorch_2.1.2.so
libgraphbolt_pytorch_2.2.0.so
libgraphbolt_pytorch_2.2.1.so
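
A listing like this can be produced and compared against a given PyTorch version with a few lines of Python. This is only a sketch: `bundled_versions` and `is_supported` are hypothetical helpers, and the file name pattern is assumed from the names listed above.

```python
import os
import re

# Pattern assumed from the file names listed above.
_PATTERN = re.compile(r"^libgraphbolt_pytorch_(?P<version>.+)\.so$")


def bundled_versions(lib_dir):
    """Return the PyTorch versions with a graphbolt library in lib_dir.

    The result is sorted as plain strings, which is enough for display.
    """
    versions = []
    for entry in os.listdir(lib_dir):
        match = _PATTERN.match(entry)
        if match:
            versions.append(match.group("version"))
    return sorted(versions)


def is_supported(lib_dir, torch_version):
    """True if a library built for exactly torch_version is present."""
    return torch_version in bundled_versions(lib_dir)
```

For the directory listed above, `is_supported(lib_dir, "2.4.1")` would be `False`, which matches the missing-file error.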

@Rhett-Ying
Collaborator

The issue comes from the .post301 suffix, which is not expected. Please try re-installing the latest torch version; torch.__version__ is supposed to be 2.4.1 without any suffix.
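
Whether an installed build carries such a suffix can be checked before importing dgl. A minimal sketch follows, with `unexpected_suffix` as a hypothetical helper. It assumes that a `+cu121`-style local tag is harmless and that anything else after `X.Y.Z` is what the loader trips over. Neither assumption is confirmed in this thread.

```python
import re


def unexpected_suffix(torch_version):
    """Return whatever follows X.Y.Z, ignoring any "+local" tag.

    An empty string means the version is a plain release such as "2.4.1".
    """
    public = torch_version.split("+", 1)[0]  # drop e.g. "+cu121" (assumption)
    match = re.match(r"^\d+\.\d+\.\d+", public)
    if match is None:
        # Not in X.Y.Z form at all: report the whole string.
        return public
    return public[match.end():]
```

Call it as `unexpected_suffix(str(torch.__version__))`. For the version in the original report, `2.4.0.post301`, it returns `.post301`.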

@gaowayne

How did you solve this? I ran into the same issue.


This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you
