SCARA-PPR

This is the original code for "SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization" (VLDB 2022) and "Scalable Decoupling Graph Neural Networks with Feature-Oriented Optimization" (VLDBJ 2023).

Citation

If you find this work useful, please cite our papers:

VLDBJ:

Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, and Pengcheng Yin.
Scalable Decoupling Graph Neural Networks with Feature-Oriented Optimization.
The VLDB Journal, 33, 2023. doi:10.1007/s00778-023-00829-6.

@article{liao2023scalable,
  title={Scalable Decoupling Graph Neural Networks with Feature-Oriented Optimization},
  author={Liao, Ningyi and Mo, Dingheng and Luo, Siqiang and Li, Xiang and Yin, Pengcheng},
  journal={The {VLDB} Journal},
  volume={33},
  year={2023},
  publisher={Springer},
  url={https://link.springer.com/article/10.1007/s00778-023-00829-6},
  doi={10.1007/s00778-023-00829-6}
}

VLDB:

Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, and Pengcheng Yin.
SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization.
PVLDB, 15(11): 3240-3248, 2022. doi:10.14778/3551793.3551866.

@article{liao2022scara,
  title={{SCARA}: Scalable Graph Neural Networks with Feature-Oriented Optimization},
  author={Liao, Ningyi and Mo, Dingheng and Luo, Siqiang and Li, Xiang and Yin, Pengcheng},
  journal={Proceedings of the VLDB Endowment},
  volume={15},
  number={11},
  pages={3240-3248},
  year={2022},
  publisher={VLDB Endowment},
  url={https://doi.org/10.14778/3551793.3551866},
  doi={10.14778/3551793.3551866}
}

Usage

We provide a complete example and its log in the demo notebook. The sample PubMed dataset is available in the data folder.

Data Preparation

Download the data (see Dataset Links below) in GBP format to data/[dataset_name]. As in the PubMed example, each dataset consists of three files:

adj.txt: adjacency table
- First line: "# [number of nodes]"
feats.npy: features in .npy array
labels.npz: node label information
- 'label': labels (number or one-hot)
- 'idx_train/idx_val/idx_test': indices of training/validation/test nodes (inductive task)
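
Before running the later steps, it can help to confirm that a downloaded dataset matches the layout above. The following is a minimal sketch, not part of this repository: check_dataset is a hypothetical helper, it reads only the documented header line of adj.txt, and it assumes that one axis of the feature matrix indexes the nodes.

```python
import os
import numpy as np


def check_dataset(path):
    """Sanity-check adj.txt, feats.npy and labels.npz in `path` (hypothetical helper)."""
    # adj.txt: only the documented header line "# [number of nodes]" is read;
    # nothing is assumed about the layout of the remaining lines.
    with open(os.path.join(path, "adj.txt")) as f:
        header = f.readline().split()
    if len(header) != 2 or header[0] != "#":
        raise ValueError("adj.txt should start with '# [number of nodes]'")
    num_nodes = int(header[1])

    # feats.npy: assumption, one axis of the feature matrix indexes the nodes.
    feats = np.load(os.path.join(path, "feats.npy"))
    if num_nodes not in feats.shape:
        raise ValueError(f"no axis of feats.npy has length {num_nodes}")

    # labels.npz: expects the keys listed above.
    labels = np.load(os.path.join(path, "labels.npz"))
    if "label" not in labels.files:
        raise ValueError("labels.npz is missing the 'label' array")
    summary = {"num_nodes": num_nodes, "feats_shape": feats.shape}
    for key in ("idx_train", "idx_val", "idx_test"):
        idx = np.asarray(labels[key]).ravel()
        if idx.size and (idx.min() < 0 or idx.max() >= num_nodes):
            raise ValueError(f"{key} has an index outside [0, {num_nodes})")
        summary[key] = idx.size  # number of nodes in this split
    return summary
```

Calling check_dataset("data/[dataset_name]") returns the node count, the feature shape, and the size of each split, or raises ValueError on the first mismatch it finds.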

Run python data_processor.py to generate the additional processed files:

degrees.npz: node degrees, stored in the .npz archive under the key 'arr_0'
feats_norm.npy: normalized features in .npy array
- Large matrices can be split
query.txt: indices of queried nodes
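
The key 'arr_0' is the name NumPy assigns when an array is passed to np.savez positionally. The sketch below illustrates this with toy values and an in-memory buffer standing in for degrees.npz; it is not taken from data_processor.py, and the degree values are made up for the example.

```python
import io
import numpy as np

degrees = np.array([2, 1, 1, 0])  # toy degrees for a 4-node graph (made-up values)

buf = io.BytesIO()                # in-memory stand-in for degrees.npz
np.savez(buf, degrees)            # positional array is stored under 'arr_0'
buf.seek(0)

with np.load(buf) as archive:
    keys = archive.files          # names of the arrays in the archive
    loaded = archive["arr_0"]     # the degree vector
```

Reading the real file works the same way: np.load("data/[dataset_name]/degrees.npz")["arr_0"].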

Precompute

Environment: CMake 3.16, C++14. Dependencies: Eigen3
Build: cmake -B build, then make
Run script: ./run_pubmed.sh

Train and Test

Install dependencies: conda create --name [envname] --file requirements.txt
Run experiment: python run.py -f [seed] -c [config_file] -v [device]

Baseline Models

GraphSAINT: GraphSAINT
APPNP: APPNP
PPRGo: PPRGo
GBP: GBP
AGP: AGP
GAS: GAS

Dataset Links

Citeseer & Pubmed: GBP
PPI: GraphSAGE
Yelp: GraphSAINT
Reddit: PPRGo
Products & Papers100M: OGB
Amazon: Cluster-GCN
MAG: PANE
