Predictive GFM 2025 #318

allaffa · 2025-01-12T00:38:25Z

Main changes:

Consistency check across all large-scale datasets for the format of the torch_geometric.data.Data objects

    data_object = Data(
        dataset_name="dataset_name",
        natoms=natoms,
        pos=pos,
        cell=None,  # even if not needed, cell needs to be defined because ADIOS requires consistency across datasets
        pbc=None,  # even if not needed, pbc needs to be defined because ADIOS requires consistency across datasets
        edge_shifts=None,  # even if not needed, edge_shifts needs to be defined because ADIOS requires consistency across datasets
        atomic_numbers=atomic_numbers,  # atomic_numbers reshaped to an Nx1 tensor
        chemical_composition=chemical_composition,
        smiles_string=smiles_string,
        x=x,
        energy=energy,
        energy_per_atom=energy_per_atom,
        force=forces,
    )
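A minimal sketch of how this kind of consistency could be checked before writing the data, assuming each sample has been reduced to a plain mapping from attribute name to value. The names `REQUIRED_KEYS`, `missing_keys`, and `check_schema` are hypothetical and are not part of this PR; the key list is copied from the constructor call above.

```python
# Attribute names copied from the Data(...) call above (assumed to be the full schema).
REQUIRED_KEYS = (
    "dataset_name", "natoms", "pos", "cell", "pbc", "edge_shifts",
    "atomic_numbers", "chemical_composition", "smiles_string",
    "x", "energy", "energy_per_atom", "force",
)


def missing_keys(sample, required=REQUIRED_KEYS):
    """Return the required attribute names that `sample` does not define.

    A key whose value is None still counts as defined, mirroring the
    `cell=None` / `pbc=None` convention used above.
    """
    return [key for key in required if key not in sample]


def check_schema(samples, required=REQUIRED_KEYS):
    """Map sample index -> list of missing keys, for incomplete samples only."""
    report = {}
    for index, sample in enumerate(samples):
        missing = missing_keys(sample, required)
        if missing:
            report[index] = missing
    return report


# Usage: the first sample defines every key, the second one omits `cell`.
complete = {key: None for key in REQUIRED_KEYS}
incomplete = {key: None for key in REQUIRED_KEYS if key != "cell"}
report = check_schema([complete, incomplete])
print(report)
```

An empty report means every sample exposes the same set of attributes, which is the property the ADIOS comments above ask for.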

Apply graphgps_transform to compute structural and positional Laplacian encodings.
Allow compute_grad_energy to be set through a parsed input argument. The default value is False.
The default value of energy_per_atom is set to False, because we do not need to normalize for machine learning force fields.
Added chemical composition as a one-dimensional vector with 118 entries. Each entry counts the number of atoms of that chemical species in the atomistic structure.
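As an illustration of the chemical composition attribute, here is a minimal sketch in plain Python. It assumes that entry `Z - 1` of the vector holds the count for atomic number `Z`; the PR text does not state the indexing convention, and the function name `chemical_composition` is hypothetical. The actual attribute is presumably stored as a tensor rather than a list.

```python
from collections import Counter

NUM_ELEMENTS = 118  # length of the composition vector stated above


def chemical_composition(atomic_numbers, num_elements=NUM_ELEMENTS):
    """Return a list of length `num_elements` with non-normalized atom counts.

    Assumption: index Z - 1 stores the number of atoms with atomic number Z.
    """
    counts = Counter(int(z) for z in atomic_numbers)
    for z in counts:
        if not 1 <= z <= num_elements:
            raise ValueError(f"atomic number {z} is outside 1..{num_elements}")
    return [counts.get(z, 0) for z in range(1, num_elements + 1)]


# Usage: a water molecule (one oxygen, two hydrogens).
water = chemical_composition([8, 1, 1])
print(len(water), water[0], water[7], sum(water))
```

The counts are left non-normalized, so the sum of the vector equals the number of atoms in the structure.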

allaffa · 2025-01-16T17:30:57Z

@RylieWeaver @ArCho48 @zachfox
I added smiles_string as an attribute to each Data object. This attribute is set to None for inorganic compounds, for which the SMILES representation does not make sense, and also for those organic molecules for which xyz2mol struggles to reconstruct the nature of the chemical bonds between atoms. Resolving those cases would require running quantum mechanical calculations, which is obviously impractical in this context.
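A minimal sketch of the fallback described above, assuming the geometry-to-SMILES conversion is wrapped in a callable. The names `smiles_or_none` and `fake_converter` are hypothetical, and the stand-in converter does not call xyz2mol or reproduce its API.

```python
def smiles_or_none(atomic_numbers, positions, converter):
    """Return a SMILES string, or None when no valid string can be produced.

    `converter` is any callable taking (atomic_numbers, positions). Passing
    None models inorganic datasets, where SMILES is not attempted at all.
    """
    if converter is None:
        return None
    try:
        smiles = converter(atomic_numbers, positions)
    except Exception:
        # Bond reconstruction failed, so fall back to None.
        return None
    return smiles or None


def fake_converter(atomic_numbers, positions):
    """Stand-in converter for this example only: it recognizes water and fails otherwise."""
    if sorted(atomic_numbers) == [1, 1, 8]:
        return "O"
    raise ValueError("bond orders could not be assigned")


# Usage: one successful conversion, one failure, one inorganic sample.
water_xyz = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
water = smiles_or_none([8, 1, 1], water_xyz, fake_converter)
failed = smiles_or_none([6, 6], [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)], fake_converter)
inorganic = smiles_or_none([11, 17], [(0.0, 0.0, 0.0), (2.4, 0.0, 0.0)], None)
```

Keeping the attribute present with a None value, rather than omitting it, matches the consistency requirement on the Data objects.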

allaffa · 2025-01-16T17:41:16Z

@pzhanggit
This PR corresponds to the branch where we will try to perform our imbalanced, multi-source work.
Please take a look at the structure of the Data objects.

allaffa · 2025-01-16T17:42:01Z

@zachfox
When we move ahead with the conditional DM, please take a look at the Data structures in this PR.

pzhanggit · 2025-01-23T14:41:43Z

@pzhanggit This PR corresponds to the branch where we will try to perform our imbalanced, multi-source work. Please take a look at the structure of the Data objects.

Thank you, Max @allaffa. Introducing a dataset_name ID looks good for our multi-source work. Let me know when you complete the dataset generation, and I'll start the multi-source model training.

About the changes in hydragnn/utils/descriptors_and_embeddings/smiles_utils.py, I suggest we move them to a standalone file.

pzhanggit

See my replies to Max's comment

allaffa · 2025-01-23T18:10:16Z

@pzhanggit
Thanks, I moved the functionality into a separate xyz2mol.py script.

allaffa and others added 17 commits January 10, 2025 21:59

data attributes updated for consistency across datasets

ba4cedc

non-normalized chemical composition added as data attribute

a7ab975

download_dataset.sh added for transition1x example

7292244

download dataset flag updated

c220207

scripts updated

7c7a5c3

development of transition1x scripts continues

9018d8a

transition1x scripts completed

a1b6361

black formatting fixed

224295c

Merge pull request #21 from allaffa/transition1x

9b4dae2

Transition1x

printouts removed

47cf3a8

parallelization of data reading introduced

650dc35

black formatting fixed

c193db0

Merge pull request #22 from allaffa/transition1x

08a8d34

Transition1x

detach().clone() used to define normalized energy per atom and black formatting

4c128e6

add compute_grad_energy=False as explicit argument

8f76423

add data name as attribute to each data object

4407d66

compute_grad_energy is parsed as input argument with default value set to False

f33cb8c

allaffa added the enhancement New feature or request label Jan 12, 2025

allaffa requested review from jychoi-hpc, ArCho48 and RylieWeaver January 12, 2025 00:38

allaffa self-assigned this Jan 12, 2025

RylieWeaver reviewed Jan 12, 2025

examples/ani1_x/train.py (resolved)

RylieWeaver reviewed Jan 12, 2025

examples/qm7x/train.py (resolved)

edge_index, edge_attr, and edge_shifts explicitly introduced in the definition of the Data object

78ea0cb

allaffa changed the title from "Predictive gfm 2025" to "Predictive GFM 2025" Jan 12, 2025

allaffa requested a review from RylieWeaver January 12, 2025 22:08

RylieWeaver reviewed Jan 13, 2025

examples/ani1_x/train.py (outdated, resolved)

RylieWeaver reviewed Jan 13, 2025

examples/qm7x/train.py (outdated, resolved)

changed data.force into data.forces for ani1x and qm7x examples

3c3c434

allaffa requested a review from RylieWeaver January 13, 2025 14:05

RylieWeaver approved these changes Jan 13, 2025

smiles_string added as data attribute

4687734

allaffa requested a review from RylieWeaver January 16, 2025 17:28

allaffa requested a review from pzhanggit January 16, 2025 17:40

allaffa requested a review from zachfox January 16, 2025 17:41

RylieWeaver reviewed Jan 16, 2025

examples/omat24/train.py (outdated, resolved)

allaffa added 2 commits January 16, 2025 16:09

remove redundant logic on energy normalization from omat24 example

8a235c0

force threshold value increased to 1000 for ani-1x

28560e6

pzhanggit reviewed Jan 23, 2025

allaffa added 2 commits January 23, 2025 10:58

Reverted smiles_utils.py to version from commit 3c3c434

743f020

xyz2mol functionalities put in a separate file

b07be51

allaffa requested a review from pzhanggit January 23, 2025 18:10

download dataset script added for qm7x

ae1e214

pzhanggit approved these changes Jan 23, 2025

allaffa added 8 commits January 25, 2025 12:23

renamed data.force as data.forces in ani1x

34397d4

natoms converted into a tensor

8e81d75

verbosity level introduced for ani1x

baf6866

Z corrected into atomic_numbers for qm7x example

c185151

bug fixed for data attributes in transition1x

c1617db

try-except in transition1x rescoped

e7ed2ce

transform coordinates fixed in transition1x

493ba9d

iterate_tqdm used in utils.create_graph_data for transition1x example

e7cb7e3
