Rename "structures" -> "systems" (#110)
PicoCentauri authored Feb 26, 2024
1 parent e294c38 commit 7265c52
Showing 55 changed files with 427 additions and 447 deletions.
8 changes: 4 additions & 4 deletions docs/src/dev-docs/utils/data/readers/index.rst
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
Structure and Target data Readers
System and Target data Readers
=================================

The main entry point for reading structure and target information are the two reader
The main entry points for reading system and target information are the two reader
functions

.. autofunction:: metatensor.models.utils.data.read_structures
.. autofunction:: metatensor.models.utils.data.read_systems
.. autofunction:: metatensor.models.utils.data.read_targets

Target type specific readers
@@ -28,5 +28,5 @@ these refer to their documentation
.. toctree::
:maxdepth: 1

structure
systems
targets
13 changes: 0 additions & 13 deletions docs/src/dev-docs/utils/data/readers/structure.rst

This file was deleted.

13 changes: 13 additions & 0 deletions docs/src/dev-docs/utils/data/readers/systems.rst
@@ -0,0 +1,13 @@
System Readers
#################

Parsers for obtaining information from systems. All readers return a :py:class:`list`
of :py:class:`metatensor.torch.atomistic.System`. The mapping of which reader is used
for which file type is stored in

.. autodata:: metatensor.models.utils.data.readers.systems.SYSTEM_READERS

Implemented Readers
-------------------

.. autofunction:: metatensor.models.utils.data.readers.systems.read_systems_ase
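The ``SYSTEM_READERS`` mapping documented above pairs file types with reader functions. A minimal, self-contained sketch of that suffix-based dispatch pattern; the registry contents, placeholder reader, and error message here are illustrative, not the package's actual implementation:

```python
from pathlib import Path

# Illustrative stand-in for a real reader such as read_systems_ase;
# it returns a one-element list as a placeholder "System".
def read_systems_xyz(filename):
    return [f"system parsed from {filename}"]

# Hypothetical registry mapping file suffixes to reader callables.
SYSTEM_READERS = {".xyz": read_systems_xyz}

def read_systems(filename, fileformat=None):
    """Dispatch to the reader registered for the file's format."""
    # Guess the format from the file suffix when not given explicitly.
    suffix = fileformat if fileformat is not None else Path(filename).suffix
    try:
        reader = SYSTEM_READERS[suffix]
    except KeyError:
        raise ValueError(f"no system reader registered for {suffix!r}")
    return reader(filename)

print(read_systems("dataset.xyz"))  # dispatches via the ".xyz" entry
```

The real readers return a list of ``metatensor.torch.atomistic.System`` objects rather than strings; only the lookup-and-dispatch shape is shown here.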
20 changes: 10 additions & 10 deletions docs/src/getting-started/custom_dataset_conf.rst
@@ -12,7 +12,7 @@ parsing data for training. Mandatory sections in the `options.yaml` file include
- ``test_set``
- ``validation_set``

Each section can follow a similar structure, with shorthand methods available to
Each section can follow a similar structure, with shorthand methods available to
simplify dataset definitions.

Minimal Configuration Example
@@ -36,7 +36,7 @@ format, which is also valid for initial input:
.. code-block:: yaml
training_set:
structures:
systems:
read_from: dataset.xyz
file_format: .xyz
length_unit: null
@@ -61,13 +61,13 @@ format, which is also valid for initial input:
Understanding the YAML Block
----------------------------
The ``training_set`` is divided into sections ``structures`` and ``targets``:
The ``training_set`` is divided into sections ``systems`` and ``targets``:

Structures Section
^^^^^^^^^^^^^^^^^^
Describes the structure data like positions and cell information.
Systems Section
^^^^^^^^^^^^^^^
Describes the system data like positions and cell information.

:param read_from: The file containing structure data.
:param read_from: The file containing system data.
:param file_format: The file format, guessed from the suffix if ``null`` or not
provided.
:param length_unit: The unit of lengths, optional but recommended for simulations.
@@ -93,7 +93,7 @@ Target section parameters include:

:param quantity: The target's quantity (e.g., ``energy``, ``dipole``). Currently only
``energy`` is supported.
:param read_from: The file for target data, defaults to the ``structures.read_from``
:param read_from: The file for target data, defaults to the ``systems.read_from``
file if not provided.
:param file_format: The file format, guessed from the suffix if not provided.
:param key: The key for reading from the file, defaulting to the target section's name
@@ -135,15 +135,15 @@ starting with a ``"- "`` (a dash and a space)
.. code-block:: yaml
training_set:
- structures:
- systems:
read_from: dataset_0.xyz
length_unit: angstrom
targets:
energy:
quantity: energy
key: my_energy_label0
unit: eV
- structures:
- systems:
read_from: dataset_1.xyz
length_unit: angstrom
targets:
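The bare-string shorthand used elsewhere in these docs (``systems: "file.xyz"``) is equivalent to the full dictionary form shown in this section. A short sketch of how such shorthand could be normalized; the helper name and handling are illustrative of the documented defaults, not the package's actual code:

```python
def expand_systems_section(section):
    # A bare string is shorthand for {"read_from": <string>}.
    if isinstance(section, str):
        section = {"read_from": section}
    # Documented defaults: file_format is guessed from the suffix when
    # null, and length_unit is optional (null) but recommended.
    section.setdefault("file_format", None)
    section.setdefault("length_unit", None)
    return section

# Both spellings normalize to the same configuration.
short = expand_systems_section("dataset.xyz")
full = expand_systems_section(
    {"read_from": "dataset.xyz", "file_format": None, "length_unit": None}
)
assert short == full
print(short)
```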
2 changes: 1 addition & 1 deletion docs/src/getting-started/override.rst
@@ -35,7 +35,7 @@ hyperparameters. The adjustments for ``num_epochs`` and ``cutoff`` look like this
num_epochs: 200
training_set:
structures: "qm9_reduced_100.xyz"
systems: "qm9_reduced_100.xyz"
targets:
energy:
key: "U0"
Expand Down
4 changes: 2 additions & 2 deletions docs/src/getting-started/usage.rst
@@ -34,7 +34,7 @@ The sub-command to start a model training is
metatensor-models train
To train a model you have to define your options. This includes the specific
architecture you want to use and the data including the training structures and target
architecture you want to use and the data including the training systems and target
values

The default model and training hyperparameter for each model are listed in their
@@ -67,7 +67,7 @@ The sub-command to evaluate an already trained model is
metatensor-models eval
Besides the trained `model`, you will also have to provide a file containing the
structure and possible target values for evaluation. The structure of this ``eval.yaml``
systems and possible target values for evaluation. The structure of this ``eval.yaml``
is exactly the same as for a dataset in the ``options.yaml`` file.

.. literalinclude:: ../../static/qm9/eval.yaml
2 changes: 1 addition & 1 deletion docs/static/qm9/eval.yaml
@@ -1,4 +1,4 @@
structures: "qm9_reduced_100.xyz" # file where the positions are stored
systems: "qm9_reduced_100.xyz" # file where the positions are stored
targets:
energy:
key: "U0" # name of the target value
4 changes: 2 additions & 2 deletions docs/static/qm9/options.yaml
@@ -2,10 +2,10 @@
architecture:
name: experimental.soap_bpnn

# Mandatory section defining the parameters for structure and target data of the
# Mandatory section defining the parameters for system and target data of the
# training set
training_set:
structures: "qm9_reduced_100.xyz" # file where the positions are stored
systems: "qm9_reduced_100.xyz" # file where the positions are stored
targets:
energy:
key: "U0" # name of the target value
2 changes: 1 addition & 1 deletion examples/alchemical_model/eval.yaml
@@ -1,4 +1,4 @@
structures: "alchemical_reduced_10.xyz" # file where the positions are stored
systems: "alchemical_reduced_10.xyz" # file where the positions are stored
targets:
energy:
key: "energy" # name of the target value
4 changes: 2 additions & 2 deletions examples/alchemical_model/options.yaml
@@ -4,10 +4,10 @@ architecture:
training:
num_epochs: 10

# Mandatory section defining the parameters for structure and target data of the
# Mandatory section defining the parameters for system and target data of the
# training set
training_set:
structures: "alchemical_reduced_10.xyz" # file where the positions are stored
systems: "alchemical_reduced_10.xyz" # file where the positions are stored
targets:
energy:
key: "energy" # name of the target value
4 changes: 2 additions & 2 deletions examples/ase/options.yaml
@@ -5,9 +5,9 @@ architecture:
num_epochs: 100
learning_rate: 0.01

# Section defining the parameters for structure and target data
# Section defining the parameters for system and target data
training_set:
structures: "ethanol_reduced_100.xyz"
systems: "ethanol_reduced_100.xyz"
targets:
energy:
key: "energy"
8 changes: 4 additions & 4 deletions examples/ase/run_ase.py
@@ -4,7 +4,7 @@
This tutorial demonstrates how to use an already trained and exported model to run an
ASE simulation of a single ethanol molecule in vacuum. We use a model that was trained
using the :ref:`architecture-soap-bpnn` architecture on 100 ethanol structures
using the :ref:`architecture-soap-bpnn` architecture on 100 ethanol systems
containing energies and forces. You can obtain the :download:`dataset file
<ethanol_reduced_100.xyz>` used in this example from our website. The dataset is a
subset of the `rMD17 dataset
@@ -148,8 +148,8 @@

# %%
#
# Inspect the structures
# ######################
# Inspect the systems
# ###################
#
# Even though the total energy is conserved, we also have to verify that the ethanol
# molecule is stable and the bonds did not break.
@@ -165,7 +165,7 @@
# As a final analysis we also calculate and plot the carbon-hydrogen radial distribution
# function (RDF) from the trajectory and compare this to the RDF from the training set.
#
# To use the RDF code from ase we first have to define a unit cell for our structures.
# To use the RDF code from ase we first have to define a unit cell for our systems.
# We choose a cubic one with a side length of 10 Å.

for atoms in training_frames:
2 changes: 1 addition & 1 deletion examples/basic_usage/usage.sh
@@ -13,7 +13,7 @@ metatensor-models train --help
metatensor-models eval model.pt eval.yaml

# The evaluation command predicts those properties the model was trained against; here
# "U0". The predictions together with the structures have been written in a file named
# "U0". The predictions together with the systems have been written in a file named
# ``output.xyz`` in the current directory. The written file starts with the following
# lines

40 changes: 19 additions & 21 deletions src/metatensor/models/cli/eval.py
@@ -9,7 +9,7 @@
from omegaconf import DictConfig, OmegaConf

from ..utils.compute_loss import compute_model_loss
from ..utils.data import collate_fn, read_structures, read_targets, write_predictions
from ..utils.data import collate_fn, read_systems, read_targets, write_predictions
from ..utils.errors import ArchitectureError
from ..utils.extract_targets import get_outputs_dict
from ..utils.info import finalize_aggregated_info, update_aggregated_info
@@ -63,18 +63,18 @@ def _add_eval_model_parser(subparser: argparse._SubParsersAction) -> None:

def _eval_targets(model, dataset: Union[_BaseDataset, torch.utils.data.Subset]) -> None:
"""Evaluate an exported model on a dataset and print the RMSEs for each target."""
# Attach neighbor lists to the structures:
# Attach neighbor lists to the systems:
requested_neighbor_lists = model.requested_neighbors_lists()
# working around https://github.com/lab-cosmo/metatensor/issues/521
# Desired:
# for structure, _ in dataset:
# attach_neighbor_lists(structure, requested_neighbors_lists)
# for system, _ in dataset:
# attach_neighbor_lists(system, requested_neighbors_lists)
# Current:
dataloader = torch.utils.data.DataLoader(
dataset, batch_size=1, collate_fn=collate_fn
)
for (structure,), _ in dataloader:
get_system_with_neighbors_lists(structure, requested_neighbor_lists)
for (system,), _ in dataloader:
get_system_with_neighbors_lists(system, requested_neighbor_lists)

# Extract all the possible outputs and their gradients from the dataset:
outputs_dict = get_outputs_dict([dataset])
@@ -103,8 +103,8 @@ def _eval_targets(model, dataset: Union[_BaseDataset, torch.utils.data.Subset])
# Compute the RMSEs:
aggregated_info: Dict[str, Tuple[float, int]] = {}
for batch in dataloader:
structures, targets = batch
_, info = compute_model_loss(loss_fn, model, structures, targets)
systems, targets = batch
_, info = compute_model_loss(loss_fn, model, systems, targets)
aggregated_info = update_aggregated_info(aggregated_info, info)
finalized_info = finalize_aggregated_info(aggregated_info)

@@ -182,45 +182,43 @@ def eval_model(
file_index_suffix = f"_{i}"
logger.info(f"Evaluate dataset{extra_log_message}")

eval_structures = read_structures(
filename=options["structures"]["read_from"],
fileformat=options["structures"]["file_format"],
eval_systems = read_systems(
filename=options["systems"]["read_from"],
fileformat=options["systems"]["file_format"],
)

# Predict targets
if hasattr(options, "targets"):
eval_targets = read_targets(options["targets"])
eval_dataset = Dataset(
structure=eval_structures, energy=eval_targets["energy"]
)
eval_dataset = Dataset(system=eval_systems, energy=eval_targets["energy"])
_eval_targets(model, eval_dataset)
else:
# TODO: batch this
# TODO: add forces/stresses/virials if requested
# Attach neighbors list to structures. This step is only required if no
# Attach neighbor lists to systems. This step is only required if no
targets are present. Otherwise, the neighbor lists have already been
# attached in `_eval_targets`.
eval_structures = [
eval_systems = [
get_system_with_neighbors_lists(
structure, model.requested_neighbors_lists()
system, model.requested_neighbors_lists()
)
for structure in eval_structures
for system in eval_systems
]

# Predict structures
# Predict systems
try:
# `length_unit` is only required for unit conversions in MD engines and
# superfluous here.
eval_options = ModelEvaluationOptions(
length_unit="", outputs=model.capabilities().outputs
)
predictions = model(eval_structures, eval_options, check_consistency=True)
predictions = model(eval_systems, eval_options, check_consistency=True)
except Exception as e:
raise ArchitectureError(e)

# TODO: adjust filename accordingly
write_predictions(
filename=f"{output.stem}{file_index_suffix}{output.suffix}",
predictions=predictions,
structures=eval_structures,
systems=eval_systems,
)