Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metsi graph partition #1

Open
fzb0316 opened this issue May 14, 2023 · 2 comments
Open

metsi graph partition #1

fzb0316 opened this issue May 14, 2023 · 2 comments

Comments

@fzb0316
Copy link

fzb0316 commented May 14, 2023

I am a beginner of GNN, and I take the liberty to ask a question, please bear with me.
My question is based on this sentence from the paper 4.1: The graph partitions are generated such that the training, validation, and test vertices are balanced. As a heuristic to balance work-per-partition, an additional criterion is used to balance the amount of edges per partition.
After reading your paper "COMMUNICATION-EFFICIENT GRAPH NEURAL NETWORKS WITH PROBABILISTIC NEIGHBORHOOD EXPANSION ANALYSIS AND CACHING", I paid more attention to the operation of graph partitioning. May I ask what the additional criterion you use to balance the number of edges in each partition after you use the metis to consider the balance of each partition train/val/test point?
I originally thought that the metis graph partition algorithm considered the vertices and their neighbors more, and did not take into account the balance of the edges (or the degree of the vertices). If this is wrong, I hope you can correct me, thank you very much.

@timkaler
Copy link
Member

Thanks for the question. I'll note, first, that this paper is not principally concerned with the manner in which the graph is partitioned. As such, our approach to partitioning is very similar to that used by other distributed GNN systems (e.g. dgl). That said, I'll see if I can elaborate a bit on what we are doing for graph partitioning.

METIS tries to partition the graph in a way that minimizes edges crossing partitions while assigning an equal "weight" of vertices to each partition. In the simplest case, each vertex is assigned unit weight and METIS will try to generate graph partitions that have small edge cuts and have an equal number of vertices per partition.

Assigning each vertex unit weight, however, doesn't always make sense. For graph datasets, we ideally want to balance the training set vertices, the validation set vertices, and the test set vertices so that distributed computations on any of these vertex subsets is balanced. For the papers100M dataset, many vertices are not part of any train/valid/test subset --- making it all the more important to not simply balance vertices among partitions in a naive manner.

One can balance train/valid/test sets by using 3-dimensional vertex weights in METIS.

As you noted, our paper describes 4-dimensional vertex weights. The 4th vertex weight is meant to heuristically balance the amount of "work" per vertex during GNN training/inference. Here we assign non-unit weights to each vertex based on it's degree in the graph.

As a summary, we instruct METIS to generate a k-way partition of the graph that minimizes the edge-cut subject to the contraint that vertex weights are balanced within a tolerance. For the vertex weights, we use 4 weights to balance train/valid/test sets and the sum of vertex degrees within each partition.

You may be interested in looking at the artifact for the paper, which has documentation for how to reproduce experiments (and how to run our graph partitioning workflow). https://github.com/MITIBMxGraph/SALIENT_plusplus_artifact

I hope this is helpful, and apologies for any grammatical/spelling errors. I am presently on vacation without a laptop and am sending this from my phone.

@fzb0316
Copy link
Author

fzb0316 commented May 16, 2023

Thank you very much for your answer. I have learned that you use the degree of vertices as the weight input of metis, so that metis considers the distribution of edges while balancing vertices, which is very helpful to our work. I will continue to pay attention to your code and documentation.
Thank you for answering my questions after being busy, and wish you a happy vacation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants