[GraphBolt][Dataset] Contribute IGBH dataset to hetero examples. #7708

Open
wants to merge 56 commits into
base: master

BowenYao18
Collaborator

@BowenYao18 BowenYao18 commented Aug 15, 2024

Description

I added the IGB dataset to the RGCN folder. Currently the three smaller versions (tiny [~2GB], small [~20GB], medium [~100GB]) are supported. The two larger versions (large [~800GB], full [~2.2TB]) are still being tested.

Usage:

  1. Run "download.py --size {choose the size}". The dataset will be downloaded and processed under the default "dataset/" folder.
  2. Run "hetero_rgcn.py" with the dataset you just processed.
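
The two usage steps can be sketched in Python as follows. This is a minimal sketch, not part of the PR: the size figures are the approximate ones quoted in the description, the helper names `has_room` and `build_commands` are hypothetical, and the `--dataset` flag and the `igb-het-<size>` naming are assumptions based on the dataset names discussed later in the review. Only `--size` is taken from the description itself.

```python
import shutil
import sys

# Approximate on-disk sizes quoted in the PR description, in gigabytes.
APPROX_SIZE_GB = {
    "tiny": 2,
    "small": 20,
    "medium": 100,
    "large": 800,
    "full": 2200,
}


def has_room(size, free_bytes):
    """Return True if free_bytes covers the approximate dataset size."""
    return free_bytes >= APPROX_SIZE_GB[size] * 10**9


def build_commands(size):
    """Return the argv lists for the download step and the training step."""
    download = [sys.executable, "download.py", "--size", size]
    # Assumption: the training script accepts --dataset and the processed
    # dataset is named igb-het-<size>; neither is confirmed by the PR text.
    train = [sys.executable, "hetero_rgcn.py", "--dataset", f"igb-het-{size}"]
    return download, train


# Report whether the current directory's filesystem can hold the tiny version.
free = shutil.disk_usage(".").free
print("room for tiny:", has_room("tiny", free))

# Print the two commands rather than running them.
for cmd in build_commands("tiny"):
    print(" ".join(cmd))
```

The space check is only a lower bound, since processing may need more room than the quoted sizes. The argv lists can be passed to `subprocess.run` once the flags have been checked against the actual scripts.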

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • [] The PR is complete and small. Read the Google eng practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 021341236243f6f7133a1adf3014b3b0b0499258

Build ID: 1

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: afd0152bb1c5e9ea8b34c1e30e89b416f54821fa

Build ID: 2

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

examples/graphbolt/rgcn/download.py (4 review threads, outdated and resolved)
@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 5fc930b

Build ID: 3

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 06f3f29

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 19f18a5

Build ID: 5

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 97c1735

Build ID: 6

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 93cb70f

Build ID: 7

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: b170a90

Build ID: 8

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 15, 2024

Commit ID: 55079b8

Build ID: 9

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Aug 16, 2024

Commit ID: 23f65342bf2552118909addd1ed94e0e16fee8e7

Build ID: 10

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Sep 9, 2024

Commit ID: d17ccc264f13fac519ac41a598fc5e22ff4c8f31

Build ID: 37

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Sep 9, 2024

Commit ID: 8211a6e3a591222f356736f84bab4061c85d968d

Build ID: 38

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Sep 9, 2024

Commit ID: 94aadc5bca519209d1de06d940f8ca7b4a324357

Build ID: 39

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Sep 9, 2024

Commit ID: de7dc90

Build ID: 40

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 7, 2024

Commit ID: 2537811

Build ID: 41

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 9, 2024

Commit ID: 692912e

Build ID: 42

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 9, 2024

Commit ID: 5dfd3fd

Build ID: 43

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

Collaborator

We don't need this file. The same functionality can be implemented in a few lines.

"igb-het-medium",
"igb-het-large",
"igb-het",
"igb-het-MLPerf",
Collaborator

Can we make it all lower case?

The igb-hom-[tiny|small|medium] dataset is a heterogeneous citation network,
which is designed for developers to train and evaluate GNN models with
- high fidelity. See more details in `igb-het-[tiny|small|medium]
+ high fidelity. See more details in `igb-het-[tiny|small|medium|large]
Collaborator

@mfbalin mfbalin Oct 17, 2024

Here as well: igb-het-[tiny|small|medium|large|mlperf] and igb-het.
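
To make the bracket notation and the lower-case request concrete, here is a small sketch that expands the pattern into explicit names and normalizes user input. The helper names `igb_het_names` and `normalize` are hypothetical, and the list reflects only the names mentioned in this review, not necessarily what the final code registers.

```python
# Size suffixes taken from the reviewer's pattern
# igb-het-[tiny|small|medium|large|mlperf], plus the unsuffixed igb-het.
SIZES = ["tiny", "small", "medium", "large", "mlperf"]


def igb_het_names():
    """Expand the bracket pattern into the explicit lower-case names."""
    return [f"igb-het-{size}" for size in SIZES] + ["igb-het"]


def normalize(name):
    """Lower-case a dataset name and check it against the expanded list."""
    key = name.strip().lower()
    if key not in igb_het_names():
        raise ValueError(f"Unknown dataset name: {name!r}")
    return key


print(igb_het_names())
print(normalize("igb-het-MLPerf"))  # igb-het-mlperf
```

Storing only lower-case names and normalizing at the boundary means a mixed-case spelling such as "igb-het-MLPerf" never has to appear in the list itself.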

@dgl-bot
Collaborator

dgl-bot commented Oct 18, 2024

Commit ID: 5c689e303debd66465258f96e39df1e4c9a8e014

Build ID: 44

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 18, 2024

Commit ID: 1645605a665e2914c1a66b5fd7378c521665d182

Build ID: 45

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Oct 18, 2024

Commit ID: 0030a532f223cc33914caaa916550963d636d121

Build ID: 46

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@mfbalin
Collaborator

mfbalin commented Oct 19, 2024

LGTM except for the overly complicated evaluator.py file.

Collaborator

Let's use pure pytorch operations to compute the evaluation score.
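
As an illustration of what a pure PyTorch evaluation could look like, here is a minimal sketch. It assumes the score in question is plain classification accuracy, which the thread does not state, and the function name `accuracy` is hypothetical. It is not the code from the PR's evaluator.py.

```python
import torch


def accuracy(logits, labels):
    """Fraction of rows whose arg-max class matches the label.

    Assumption: the evaluation score is plain classification accuracy.
    """
    preds = logits.argmax(dim=1)
    return (preds == labels).float().mean().item()


# Four nodes, two classes; the third prediction is wrong.
logits = torch.tensor([[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.1, 0.4]])
labels = torch.tensor([0, 1, 1, 1])
print(accuracy(logits, labels))  # 0.75
```

If accuracy is indeed the metric, the computation fits in two tensor operations, which is in line with the earlier remark that the functionality can be implemented in a few lines.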

3 participants