Multi-GPU training for DGL+ALIGNN #7372

Closed
knc6 opened this issue Apr 29, 2024 · 2 comments


knc6 commented Apr 29, 2024

❓ Questions and Help

Hi,

Here is an issue regarding the multi-GPU use case of DGL for atomistic property prediction in ALIGNN: usnistgov/alignn#90

We implemented the DDP feature, but RAM usage is still a challenge. Any thoughts/suggestions on how to tackle this issue efficiently?
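
For reference, a minimal sketch of the kind of DDP setup I mean (placeholder model/dataset names, not our exact ALIGNN code). The catch is that with one process per GPU, an in-memory dataset is replicated in every rank, so host RAM grows with the number of GPUs:

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP
from dgl.dataloading import GraphDataLoader

def train(rank, world_size, dataset, build_model):
    # One process per GPU; NCCL backend for GPU collectives
    # (MASTER_ADDR/MASTER_PORT are expected in the environment, e.g. via torchrun).
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = DDP(build_model().to(rank), device_ids=[rank])
    # use_ddp=True shards the dataset so each rank sees a disjoint subset per epoch,
    # but the dataset object itself (graphs + features) still sits in this process's RAM.
    loader = GraphDataLoader(dataset, batch_size=64, shuffle=True, use_ddp=True)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

    for epoch in range(10):
        loader.set_epoch(epoch)  # reshuffle the per-rank shards each epoch
        for batched_graph, labels in loader:
            batched_graph, labels = batched_graph.to(rank), labels.to(rank)
            loss = F.mse_loss(model(batched_graph), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

    dist.destroy_process_group()
```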

rudongyu commented May 9, 2024

Hi @knc6, the original dataset classes in older DGL keep all features and graph structures in memory, and I guess your dataset is pretty large, which makes the RAM cost huge. You may try GraphBolt in the newest version of DGL: https://docs.dgl.ai/stochastic_training/ondisk-dataset.html. It supports on-disk storage of features.
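
Roughly, once the edges and features are written to disk and described in a `metadata.yaml`, loading looks like this (a sketch based on the tutorial; the directory layout, the feature name `feat`, and the `read` call are illustrative, so please follow the linked doc for the exact schema):

```python
import torch
import dgl.graphbolt as gb

# base_dir is assumed to contain metadata.yaml plus the edge CSVs and
# numpy feature files it points to, with in_memory: false for large features.
dataset = gb.OnDiskDataset("path/to/base_dir").load()

graph = dataset.graph        # sampling graph (topology)
features = dataset.feature   # feature store; reads are served from disk

# Fetch only the node features needed for the current mini-batch,
# instead of keeping the full feature matrix in RAM.
node_ids = torch.tensor([0, 1, 2])
feat = features.read("node", None, "feat", node_ids)
```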

knc6 commented May 9, 2024

Thanks for the recommendation. I found that LMDB worked well for handling large datasets.
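
A rough sketch of that pattern (illustrative names, not the actual ALIGNN code): pickle each graph/label pair once into an LMDB environment, then have the `Dataset` open it read-only and deserialize entries lazily in `__getitem__`, so only the current mini-batch is held in RAM.

```python
import pickle
import lmdb
from torch.utils.data import Dataset

def write_lmdb(samples, path, map_size=int(1e11)):
    """Serialize (graph, label) pairs once; afterwards only keys live in RAM."""
    env = lmdb.open(path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i, sample in enumerate(samples):
            txn.put(str(i).encode(), pickle.dumps(sample))
    env.close()

class LMDBGraphDataset(Dataset):
    def __init__(self, path):
        # readonly + lock=False so multiple DataLoader workers can share the env.
        self.env = lmdb.open(path, readonly=True, lock=False, readahead=False)
        with self.env.begin() as txn:
            self.length = txn.stat()["entries"]

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Each sample is deserialized on demand, so host RAM stays roughly flat.
        with self.env.begin() as txn:
            graph, label = pickle.loads(txn.get(str(idx).encode()))
        return graph, label
```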

knc6 closed this as completed May 9, 2024