Commit

add in more edits from Brad
daico007 committed Jan 23, 2024
1 parent 64751c4 commit cff25e6
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions paper/paper.md
@@ -94,9 +94,9 @@ The Molecular Simulation Design Framework (MoSDeF) is a software suite tailored


The initialization of chemical/biomolecular systems comprises three key steps:
-1. Constructing structures: encompassing loading or creating molecules/structures that mirrors the phenomena under investigation.
-2. Parameterizing: interactional parameters are assigned to all particles and connections within the structures.
-3. Storing Structures and Output Generation: parameterized structures are stored, and output is generated in file formats compatible with various simulation software, including GROMACS, LAMMPS, HOOMD-Blue, Cassandra, and GOMC[@abraham2015gromacs; @thompson2022lammps; @anderson2010hoomd, @shah2017cassandra, @nejahi2021update].
+1. Constructing structures: Loading or creating molecules/structures that mirror the phenomena under investigation.
+2. Parameterizing: Assigning interaction parameters to all particles and connections within the structures.
+3. Storing structures and generating output: Storing parameterized structures and writing them to file formats compatible with various simulation software.
Each of these steps requires distinct routines and is therefore handled by a specialized library: mBuild [@klein2016hierarchical], Foyer [@klein2019formalizing], and GMSO, which is introduced in this work. mBuild functions as a molecular builder, equipped with extensive utilities for creating, loading, and manipulating the positions of atoms and molecules, along with managing their connectivity through bonds [@klein2016hierarchical]. Foyer parameterizes the created structures by identifying and assigning interaction parameters to each atom or group of atoms [@klein2019formalizing] and to their associated connections (e.g., bonds, angles, and dihedrals). This process matches the connectivity (bond graph) of the provided structure against the SMARTS grammar of the corresponding atom type, which defines the interaction parameters [@klein2019formalizing]. Using graph matching, rather than the traditional approach of matching via atom indices, allows for more flexible parameterization. This feature proves particularly advantageous in the study of functionalized polymers, whose structures consistently deviate slightly from the standard polymer [@summers2020mosdef; @quach2022high]. These utilities have been applied in various projects to explore a wide range of structures and applications [@thompson2019scalable; @summers2020mosdef; @quach2022high; @ma2022dynamics] and have been integrated into other scientific libraries [@albooyeh2023flowermd; @defever2021mosdef; @crawford2023mosdefgomc].
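
The contrast between index-based and connectivity-based typing can be illustrated with a minimal sketch. It is not Foyer's SMARTS matcher: the rule table, the type names, and the helper functions below are hypothetical, and each "rule" only looks at an atom's element and the elements of its direct neighbors. The point it shows is that the assigned types follow the bond graph, so reordering the atoms does not change the result.

```python
from collections import Counter

# Hypothetical rules (not Foyer's): (element, sorted neighbor elements) -> type name.
TYPE_RULES = {
    ("C", ("C", "H", "H", "H")): "alkane_CH3",
    ("C", ("C", "C", "H", "H")): "alkane_CH2",
    ("H", ("C",)): "alkane_H",
}


def build_adjacency(n_atoms, bonds):
    """Return a dict mapping each atom index to the indices it is bonded to."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def assign_atom_types(elements, bonds):
    """Type every atom from its local bonded environment, not from its index."""
    adjacency = build_adjacency(len(elements), bonds)
    types = []
    for index, element in enumerate(elements):
        environment = tuple(sorted(elements[j] for j in adjacency[index]))
        key = (element, environment)
        if key not in TYPE_RULES:
            raise KeyError(f"no rule matches atom {index}: {key}")
        types.append(TYPE_RULES[key])
    return types


# Propane with carbons listed first (indices 0-2), then hydrogens (3-10).
propane_elements = ["C", "C", "C"] + ["H"] * 8
propane_bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5),
                 (1, 6), (1, 7), (2, 8), (2, 9), (2, 10)]
propane_types = assign_atom_types(propane_elements, propane_bonds)

# The same molecule with hydrogens listed first (0-7), then carbons (8-10).
reordered_elements = ["H"] * 8 + ["C", "C", "C"]
reordered_bonds = [(8, 9), (9, 10), (8, 0), (8, 1), (8, 2),
                   (9, 3), (9, 4), (10, 5), (10, 6), (10, 7)]
reordered_types = assign_atom_types(reordered_elements, reordered_bonds)

# Both orderings yield the same multiset of atom types.
same_types = Counter(propane_types) == Counter(reordered_types)
```

An index-based scheme would need a separate lookup table for each atom ordering, whereas the environment-based rules apply unchanged to both.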


@@ -107,7 +107,9 @@ The parameterization step introduces additional information, requiring a more so
- Being extensible (to support new simulation models/engines/workflows)
Currently, existing data structures, such as ParmEd and OpenMM [@shirts2016lessons; @eastmann2017openmm], fulfill many of these functions and are widely adopted [add citations]. However, their underlying structures are tailored to specific subsets of simulation workflows, ecosystems, and force field equation forms, sacrificing generality and broad applicability. This limitation includes hard-coding of, and assumptions about, potential expressions and units. They lack the generality that MoSDeF and its users seek, such as the ability to define and store arbitrary potential expressions or unit systems. Unfortunately, integrating these new features into existing software would require a major overhaul that could disrupt existing simulation workflows and is not appealing to current project stakeholders.

-Hence, we develop the General Molecular Simulation Object (GMSO) library, which is a lightweight, extensible data structure encapsulating chemical/biomolecular systems and their associated interaction parameters, i.e., force fields, to cater to MoSDeF ecosystem. The library is designed to accommodate a wide range of chemical/biomolecular models, offering the capability to support arbitrary potential expressions and unit systems. Generalizing these potential (force field) expressions also allows users to enter the force field in its native form and units, minimizing user error when setting up the force field file while providing the ability to easily auto-convert the potential form and units to the molecular engine's required form. GMSO satisfies the broader community's need for a general, extensible, and reproducible method of setting up molecular simulations. In addition to core data classes, the library includes routines for interacting/converting to and from other ecosystems, including ParmEd and OpenMM, enhancing interoperability without reinventing functionalities. GMSO supports output to multiple molecular simulation engine-specific file formats, currently including GROMACS, LAMMPS, HOOMD-Blue, GOMC, Cassandra, with plans for future expansion. When integrated with other MoSDeF software, GMSO facilitates large-scale automated molecular screening for diverse molecules/structures or state points, applying correct force field parameters using SMARTS strings[@klein2016hierarchical, @klein2019formalizing]. Combining MoSDeF software in a unified workflow using a manager like Signac [@adorf2018simple] enables large-scale molecular screenings critical for developing new materials, chemicals, and drugs[@quach2022high, @thompson2019scalable].

+Hence, we developed the General Molecular Simulation Object (GMSO) library, a lightweight, extensible data structure that encapsulates chemical/biomolecular systems and their associated interaction parameters, i.e., force fields, to serve the MoSDeF ecosystem. The library is designed to accommodate a wide range of chemical/biomolecular models and supports arbitrary potential expressions and unit systems. Generalizing these potential (force field) expressions also allows users to enter the force field in its native form and units, minimizing user error when setting up the force field file, while providing the ability to auto-convert the potential form and units to those required by the molecular simulation engine. GMSO satisfies the broader community's need for a general, extensible, and reproducible method of setting up molecular simulations. In addition to core data classes, the library includes routines for converting to and from other ecosystems, including ParmEd and OpenMM, enhancing interoperability without reinventing functionality. GMSO supports output to multiple simulation engine-specific file formats, currently including GROMACS, LAMMPS, HOOMD-Blue, NAMD, Cassandra, and GOMC, with plans for future expansion [@abraham2015gromacs; @thompson2022lammps; @anderson2010hoomd; @phillips2020scalable; @shah2017cassandra; @nejahi2021update]. When integrated with other MoSDeF software, GMSO facilitates large-scale automated molecular screening for diverse molecules/structures or state points, applying correct force field parameters using SMARTS strings [@klein2016hierarchical; @klein2019formalizing]. Combining MoSDeF software in a unified workflow using a manager like Signac [@adorf2018simple] enables large-scale molecular screenings critical for developing new materials, chemicals, and drugs [@quach2022high; @thompson2019scalable].
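
The idea of storing a potential in its native form and units and converting on demand can be sketched in a few lines of standard-library Python. This is an illustration only, not GMSO's API: the `Potential` class, its method names, and the two small conversion tables are assumptions made for the example, and a real implementation would use a symbolic-math library and a units library rather than `eval` and lookup tables.

```python
import math

# Assumed conversion factors to reference units (kJ/mol and nm); illustrative subset.
ENERGY_TO_KJ_PER_MOL = {"kJ/mol": 1.0, "kcal/mol": 4.184}
LENGTH_TO_NM = {"nm": 1.0, "angstrom": 0.1}


class Potential:
    """Hypothetical container: an expression string plus unit-tagged parameters."""

    def __init__(self, expression, independent_variable, parameters):
        self.expression = expression                      # e.g. a Lennard-Jones form
        self.independent_variable = independent_variable  # e.g. "r"
        self.parameters = parameters                      # name -> (value, unit)

    def converted(self, energy_unit, length_unit):
        """Return a copy whose parameters are expressed in the requested units."""
        new_parameters = {}
        for name, (value, unit) in self.parameters.items():
            if unit in ENERGY_TO_KJ_PER_MOL:
                value = value * ENERGY_TO_KJ_PER_MOL[unit] / ENERGY_TO_KJ_PER_MOL[energy_unit]
                unit = energy_unit
            elif unit in LENGTH_TO_NM:
                value = value * LENGTH_TO_NM[unit] / LENGTH_TO_NM[length_unit]
                unit = length_unit
            new_parameters[name] = (value, unit)
        return Potential(self.expression, self.independent_variable, new_parameters)

    def evaluate(self, x):
        """Evaluate the expression at x, given in the parameters' length unit."""
        names = {name: value for name, (value, _) in self.parameters.items()}
        names[self.independent_variable] = x
        # eval is acceptable only because this is a toy with trusted input.
        return eval(self.expression, {"__builtins__": {}}, names)


# Force field entered in its native units (kcal/mol, angstrom).
lj_native = Potential(
    "4*epsilon*((sigma/r)**12 - (sigma/r)**6)",
    "r",
    {"epsilon": (0.25, "kcal/mol"), "sigma": (3.5, "angstrom")},
)

# Converted to the units a particular engine might require (kJ/mol, nm).
lj_converted = lj_native.converted("kJ/mol", "nm")
epsilon_kj, _ = lj_converted.parameters["epsilon"]
sigma_nm, _ = lj_converted.parameters["sigma"]

# At the minimum r = 2**(1/6) * sigma, the Lennard-Jones energy equals -epsilon.
energy_at_minimum = lj_converted.evaluate(2 ** (1 / 6) * sigma_nm)
```

Keeping the expression separate from the unit-tagged parameters means the same stored potential can be written out in whatever unit system a target engine expects, without editing the original force field entry.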


# Acknowledgements
This research was partially supported by the National Science Foundation OAC-1835713 and OAC-1835874. Atomfold also donated research and development time and computational resources for this research and software.
