Error Running LAMMPS-MACE in Parallel on CPU Using mpirun #843

Open
moonsheeeeeep opened this issue Mar 6, 2025 · 0 comments

Hello everyone,

I am trying to run LAMMPS with a MACE model in parallel using a CPU build. When I launch my script with mpirun -np 2 lmp < in.lammps, I get the following error:

LAMMPS (29 Aug 2024)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
using 1 OpenMP thread(s) per MPI task
Reading data file ...
...
RuntimeError: index 225 is out of bounds for dimension 0 with size 14

Interestingly, running the same simulation on a single process with mpirun -np 1 lmp < in.lammps works without any issues.

Details about the setup and issue:

The simulation runs smoothly with a single core (-np 1), but fails when trying to use multiple cores (-np 2).
The error seems to originate from within the TorchScript interpreter while processing the MACE model.
The error message reports an out-of-bounds index (index 225 for a dimension of size 14), which suggests a potential issue with data handling or distribution across MPI processes.
Given that single-core execution works correctly, it appears there might be a problem with how the MACE model is being handled in a multi-core environment. Is this a known limitation of the current LAMMPS-MACE integration? Or could there be a configuration step I am missing to enable proper parallel execution?
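
To make my suspicion concrete, here is a minimal sketch of the kind of mismatch I have in mind. It is not MACE or LAMMPS code: the function gather_by_index, the atom count of 226, and the index list are all hypothetical, and the idea that "size 14" is the number of atoms held by one MPI rank is only my assumption. It just shows how an index that is valid for the full system fails against a smaller per-process array.

```python
# Hypothetical illustration only: this is not MACE or LAMMPS code.

def gather_by_index(per_atom_values, indices):
    """Pick per-atom values by index, raising on an out-of-range index."""
    size = len(per_atom_values)
    picked = []
    for i in indices:
        if not 0 <= i < size:
            raise IndexError(
                f"index {i} is out of bounds for dimension 0 with size {size}"
            )
        picked.append(per_atom_values[i])
    return picked


n_global = 226  # assumed total atom count, chosen so that index 225 is valid
n_local = 14    # assumed atoms held by one rank, matching "size 14" in the log

global_values = [0.0] * n_global
local_values = [0.0] * n_local
edge_targets = [3, 225]  # hypothetical neighbour indices in global numbering

# One process holding every atom: all indices are in range.
picked_global = gather_by_index(global_values, edge_targets)

# One rank holding only its own atoms: the global index no longer fits.
try:
    gather_by_index(local_values, edge_targets)
    message = ""
except IndexError as err:
    message = str(err)

print(message)
```

If something like this is what happens, it would explain why -np 1 works and -np 2 does not, but I have not confirmed it in the actual code.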

Attached are all the files used in this computation example, including the input script (in.lammps), data files, and the MACE model.

Could anyone provide guidance or insights into resolving this issue? Any help would be greatly appreciated.

Thank you in advance!

Here is the content of my in.lammps file:

atom_style atomic
units real
boundary p p p
atom_modify map yes
neighbor 3.0 bin
neigh_modify every 1 delay 10 check yes
read_data lammps.data
mass 1 12 #C
mass 2 1 #H
mass 3 16 #O
pair_style mace no_domain_decomposition
pair_coeff * * mace02_com1_stagetwo_compiled.model-lammps.pt C H O
timestep 1
thermo 1
fix fix_nvt all nvt temp 273 273 100.0
compute myT all temp
thermo_style custom step temp
thermo_modify temp myT flush yes
dump dump_all all custom 10 traj.lammps id type element x y z vx vy vz fx fy fz
dump_modify dump_all format float %12.8f append yes element C H O
run 500
write_data out.data
