
Our new arXiv release includes a comparison with a concurrent work (Jiong et al., 2023, https://arxiv.org/pdf/2305.09887.pdf), which independently presents similar ideas, alongside other state-of-the-art distributed GNN training works.
More specifically, we summarize our key differences from Jiong et al. as follows:
- Our work optimizes model performance, with or without distributed data parallelism, by interpolating the weights of soup GNN candidates. In contrast, Jiong et al. (2023) improve performance for data-parallel GNN training through model averaging and randomized graph partitioning.
- Our candidate models are interpolated only after training, to preserve the diversity required for a soup, whereas the weights in Jiong et al. are periodically averaged during training at a fixed time interval (see the sketch below).
- Our soup ingredients are trained by sampling different clusters of the full graph in each epoch, while the individual trainers in Jiong et al. use localized subgraphs assigned by randomized node/super-node partitions.
For a more detailed discussion, please refer to Section 4.1 of our new arXiv version (https://arxiv.org/abs/2306.10466).
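
To make the first two differences concrete, here is a minimal PyTorch-style sketch contrasting post-training soup interpolation with periodic in-training averaging. The function names, the uniform-averaging choice, and the `train_one_epoch` callback are illustrative assumptions for exposition, not the exact recipes used in either paper.

```python
import copy
import torch

def average_state_dicts(models):
    """Uniformly average the parameters of identically-shaped models (illustrative)."""
    avg_state = copy.deepcopy(models[0].state_dict())
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in models])
        avg_state[key] = stacked.mean(dim=0).to(avg_state[key].dtype)
    return avg_state

def soup_after_training(candidate_models):
    """Soup-style scheme (simplified): candidates are trained independently to
    completion, and their weights are interpolated once after training."""
    souped = copy.deepcopy(candidate_models[0])
    souped.load_state_dict(average_state_dicts(candidate_models))
    return souped

def periodic_averaging(trainers, train_one_epoch, num_epochs, sync_every):
    """Periodic-averaging scheme (simplified): local trainers are synchronized to
    a common average every `sync_every` epochs during training."""
    for epoch in range(num_epochs):
        for model in trainers:
            train_one_epoch(model)  # each trainer updates on its local subgraph
        if (epoch + 1) % sync_every == 0:
            synced = average_state_dicts(trainers)
            for model in trainers:
                model.load_state_dict(synced)  # broadcast averaged weights back
    return trainers
```

The key distinction the sketch highlights is *when* weights are combined: a single interpolation after fully independent training (preserving candidate diversity) versus repeated averaging during training (keeping local trainers close to a shared solution).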


