ATS-2095/ZEN-30304 - expanse/0.17.3/gpu/b - Missing LAMMPS GPU (example application) #56

Open
nwolter opened this issue Apr 11, 2023 · 31 comments

Comments


nwolter commented Apr 11, 2023

No description provided.


mkandes commented May 5, 2023

I'm still having runtime problems with the GPU-enabled version(s) of LAMMPS.

mkandes changed the title from "SDSC: PKG - expanse/0.17.3/gpu/a - Missing LAMMPS GPU (example application)" to "SDSC: PKG - expanse/0.17.3/gpu/b - Missing LAMMPS GPU (example application)" on May 5, 2023

mkandes commented Aug 2, 2023

We now have a user request for this. I changed the issue subject line to include the ACCESS ticket number.

mkandes changed the title from "SDSC: PKG - expanse/0.17.3/gpu/b - Missing LAMMPS GPU (example application)" to "ATS-2095 - expanse/0.17.3/gpu/b - Missing LAMMPS GPU (example application)" on Aug 2, 2023
mkandes changed the title from "ATS-2095 - expanse/0.17.3/gpu/b - Missing LAMMPS GPU (example application)" to "ATS-2095/ZEN-30304 - expanse/0.17.3/gpu/b - Missing LAMMPS GPU (example application)" on Sep 1, 2023

mkandes commented Sep 7, 2023

Reconfirming that we currently only have LAMMPS deployed in the expanse/0.17.3/cpu/b production instance.

[mkandes@login02 ~]$ module spider lammps

----------------------------------------------------------------------------
  lammps:
----------------------------------------------------------------------------
     Versions:
        lammps/20200721-kokkos
        lammps/20200721-openblas-plumed
        lammps/20200721-openblas
        lammps/20200721
     Other possible modules matches:
        lammps/20210310

----------------------------------------------------------------------------
  To find other possible module matches execute:

      $ module -r spider '.*lammps.*'

----------------------------------------------------------------------------
  For detailed information about a specific "lammps" package (including how to load the modules) use the module's full name. Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

[mkandes@login02 ~]$ module spider lammps/20210310

----------------------------------------------------------------------------
  lammps/20210310:
----------------------------------------------------------------------------
     Versions:
        lammps/20210310/jqd2pok-omp
        lammps/20210310/k7oi5an-omp

----------------------------------------------------------------------------
  For detailed information about a specific "lammps/20210310" package (including how to load the modules) use the module's full name. Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider lammps/20210310/k7oi5an-omp
----------------------------------------------------------------------------

 

[mkandes@login02 ~]$ module spider lammps/20210310/jqd2pok-omp

----------------------------------------------------------------------------
  lammps/20210310: lammps/20210310/jqd2pok-omp
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20210310/jqd2pok-omp" module is available to load.

      cpu/0.17.3b  gcc/10.2.0/npcyll4  openmpi/4.1.3/oq3qvsv
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.
      


 

[mkandes@login02 ~]$ module spider lammps/20210310/k7oi5an-omp

----------------------------------------------------------------------------
  lammps/20210310: lammps/20210310/k7oi5an-omp
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20210310/k7oi5an-omp" module is available to load.

      cpu/0.17.3b  aocc/3.2.0/io3s466  openmpi/4.1.3/xigazqd
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.
      


 

[mkandes@login02 ~]$


mkandes commented Sep 7, 2023

Documenting the complete spec for both versions currently available in expanse/0.17.3/cpu/b for comparison.
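
For reference, the flags in the query below are -l (hashes), -v (variants), -d (dependency tree), and -N (namespaces). Since the point here is comparison, a hedged sketch of diffing the two builds directly, with the hashes taken from the listing that follows:

# Compare the two installed lammps specs side by side (spack v0.17 syntax).
diff -u <(spack find -vdN lammps@20210310/k7oi5an) \
        <(spack find -vdN lammps@20210310/jqd2pok)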

[spack_cpu@login02 ~]$ spack find -lvdN lammps
==> 2 installed packages
-- linux-rocky8-zen2 / [email protected] -------------------------------
k7oi5an builtin.lammps@20210310+asphere~body+class2~colloid~compress~coreshell~cuda~cuda_mps~dipole~exceptions+ffmpeg+granular~ipo+jpeg~kim~kokkos+kspace~latte+lib+manybody~mc~meam~misc~mliap+molecule+mpi+mpiio~opencl+openmp+opt~peri+png~poems~python~qeq+replica+rigid~shock~snap~spin~srd~user-adios+user-atc~user-awpmd~user-bocs~user-cgsdk~user-colvars~user-diffraction~user-dpd~user-drude~user-eff~user-fep~user-h5md~user-lb~user-manifold~user-meamc~user-mesodpd~user-mesont~user-mgpt~user-misc~user-mofff~user-netcdf~user-omp~user-phonon~user-plumed~user-ptm~user-qtb~user-reaction~user-reaxc~user-sdpd~user-smd~user-smtbq~user-sph~user-tally~user-uef~user-yaff~voronoi build_type=RelWithDebInfo cuda_arch=none
6sfatsa     [email protected]+blas+cblas~ilp64+shared+static threads=none
kiytcz3         [email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3~ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4
pvpmqs4             [email protected]~debug~pic+shared
d4wiitw             [email protected]+libbsd
zpca44r                 [email protected]
tlqbjjl                     [email protected]
fwban2n             [email protected]
zcvyw4d                 [email protected]
vrkfn5j                     [email protected]~symlinks+termlib abi=none
a5ype3y             [email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz
wsl5g6s                 [email protected] libs=shared,static
6xxubr4                 [email protected]~python
pc3ghll                     [email protected]~pic libs=shared,static
4flcxn7                     [email protected]+optimize+pic+shared
zn3a5tk                 [email protected]
34w374u             [email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0
5h6inaj             [email protected]+column_metadata+fts~functions~rtree
bj7ksgi             [email protected]
t757st4     [email protected]~amd-app-opt~amd-fast-planner~amd-mpi-vader-limit~amd-top-n-planner~amd-trans~debug~mpi~openmp+shared~static~threads precision=double,float
qn3awdo         [email protected] patches=12f6edb0c6b270b8c8dba2ce17998c580db01182d871ee32b7b6e4129bd1d23a,1732115f651cff98989cb0215d8f64da5e0f7911ebf0c13b064920f088f2ffe1
cytiibn             [email protected]+cpanm+shared+threads
cthafj2                 [email protected]+cxx~docs+stl patches=b231fcc4d5cff05e5c3a4814f6a5af0e9a966428dc2176540d2c05aff41de522
f37pl2z     [email protected]~debug~ilp64+lapack2flame+shared+static threads=none
q3jagl7     [email protected]~X~avresample+bzlib~drawtext+gpl~libaom~libmp3lame~libopenjpeg~libopus~libsnappy~libspeex~libssh~libvorbis~libvpx~libwebp~libzmq~lzma~nonfree~openssl~sdl2+shared+version3
qlxg434         [email protected]~python
m7b5ua5         [email protected]
fjebzdx     [email protected]
uqemmdz     [email protected]
xigazqd     [email protected]~atomics~cuda~cxx~cxx_exceptions~gpfs~internal-hwloc~java+legacylaunchers+lustre~memchecker+pmi+pmix+romio~rsh~singularity+static+vt+wrapper-rpath cuda_arch=none fabrics=ucx schedulers=slurm
2ewlbuo         [email protected]~cairo~cuda~gl~libudev+libxml2~netloc~nvml~opencl+pci~rocm+shared
whgqok2             [email protected]
tax5liq         [email protected]~openssl
soqjoas         [email protected]
g44vo3y         [email protected] patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006,ff37630df599cfabf0740518b91ec8daaf18e8f288b19adaae5364dc1f6b2296
y33fcsl         [email protected]~docs+pmi_backwards_compatibility~restful
3r2kfj2         [email protected]~gtk~hdf5~hwloc~mariadb~pmix+readline~restd sysconfdir=PREFIX/etc
wla3unl         [email protected]~assertions~cm+cma~cuda+dc~debug+dm~gdrcopy+ib-hw-tm~java~knem~logging+mlx5-dv+optimizations~parameter_checking+pic+rc~rocm+thread_multiple+ud~xpmem cuda_arch=none
wbadl55             [email protected]~ipo build_type=RelWithDebInfo


-- linux-rocky8-zen2 / [email protected] -------------------------------
jqd2pok builtin.lammps@20210310+asphere+body+class2+colloid+compress+coreshell~cuda~cuda_mps+dipole~exceptions+ffmpeg+granular~ipo+jpeg+kim+kokkos+kspace~latte+lib+manybody+mc~meam+misc+mliap+molecule+mpi+mpiio~opencl+openmp+opt+peri+png+poems+python+qeq+replica+rigid+shock+snap+spin+srd~user-adios+user-atc+user-awpmd+user-bocs+user-cgsdk+user-colvars+user-diffraction+user-dpd+user-drude+user-eff+user-fep~user-h5md+user-lb+user-manifold+user-meamc+user-mesodpd+user-mesont+user-mgpt+user-misc+user-mofff~user-netcdf~user-omp+user-phonon+user-plumed+user-ptm+user-qtb+user-reaction+user-reaxc+user-sdpd+user-smd+user-smtbq+user-sph+user-tally+user-uef+user-yaff+voronoi build_type=RelWithDebInfo cuda_arch=none
5dsu2mu     [email protected]~ipo build_type=RelWithDebInfo
vac6d7f     [email protected]~X~avresample+bzlib~drawtext+gpl~libaom~libmp3lame~libopenjpeg~libopus~libsnappy~libspeex~libssh~libvorbis~libvpx~libwebp~libzmq~lzma~nonfree~openssl~sdl2+shared+version3
aokamdw         [email protected]~python
pulggjv         [email protected]~debug~pic+shared
zduoj2d         [email protected] libs=shared,static
ci3xr27         [email protected]
ws4iari         [email protected]+optimize+pic+shared
qogw3ss     [email protected]~mpi~openmp~pfft_patches precision=double,float
2y36ibi     [email protected]~ipo build_type=RelWithDebInfo
7pijxsl     [email protected]~aggressive_vectorization~compiler_warnings~cuda~cuda_constexpr~cuda_lambda~cuda_ldg_intrinsic~cuda_relocatable_device_code~cuda_uvm~debug~debug_bounds_check~debug_dualview_modify_check~deprecated_code~examples~explicit_instantiation~hpx~hpx_async_dispatch~hwloc~ipo~memkind~numactl~openmp+pic+profiling~profiling_load_print~pthread~qthread~rocm+serial+shared~sycl~tests~tuning~wrapper amdgpu_target=none build_type=RelWithDebInfo cuda_arch=none std=14
wkg4tg4     [email protected]
axgroxm     [email protected]
fgk2tlu     [email protected]~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none
oq3qvsv     [email protected]~atomics~cuda~cxx~cxx_exceptions~gpfs~internal-hwloc~java+legacylaunchers+lustre~memchecker+pmi+pmix+romio~rsh~singularity+static+vt+wrapper-rpath cuda_arch=none fabrics=ucx schedulers=slurm
7rqkdv4         [email protected]~cairo~cuda~gl~libudev+libxml2~netloc~nvml~opencl+pci~rocm+shared
ykynzrw             [email protected]
mgovjpj             [email protected]~python
paz7hxz                 [email protected]~pic libs=shared,static
5lhvslt             [email protected]~symlinks+termlib abi=none
bimlmtn         [email protected]~openssl
fy2cjdg         [email protected]
ckhyr5e         [email protected] patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006,ff37630df599cfabf0740518b91ec8daaf18e8f288b19adaae5364dc1f6b2296
dpvrfip         [email protected]~docs+pmi_backwards_compatibility~restful
4kvl3fd         [email protected]~gtk~hdf5~hwloc~mariadb~pmix+readline~restd sysconfdir=PREFIX/etc
dnpjjuc         [email protected]~assertions~cm+cma~cuda+dc~debug+dm~gdrcopy+ib-hw-tm~java~knem~logging+mlx5-dv+optimizations~parameter_checking+pic+rc~rocm+thread_multiple+ud~xpmem cuda_arch=none
xjr3cuj             [email protected]~ipo build_type=RelWithDebInfo
gbcpghx     [email protected]+gsl+mpi+shared arrayfire=none optional_modules=all
wtlsmyy         [email protected]~external-cblas
7zdjza7     [email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis+optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4
tawwsnw         [email protected]+libbsd
wblxldx             [email protected]
rgboqoh                 [email protected]
clf6bmr         [email protected]
clxlnwz             [email protected]
ey3k6dv         [email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz
e2brhcb             [email protected]
5oh6vxq         [email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0
v3ycaao         [email protected]~docs certs=system
fxmvvsx         [email protected]+column_metadata+fts~functions~rtree
vncpkij         [email protected]
ntgz4ab     [email protected]+pic

[spack_cpu@login02 ~]$


mkandes commented Sep 7, 2023

Again, reconfirming there is currently no LAMMPS package installed within expanse/0.17.3/gpu/b.

[spack_gpu@login01 ~]$ . /cm/shared/apps/spack/0.17.3/gpu/b/share/spack/setup-env.sh
[spack_gpu@login01 ~]$ spack find -lvdN lammps
==> No package matches the query: lammps
[spack_gpu@login01 ~]$


mkandes commented Sep 7, 2023

This is the last spec build-script configuration that was built and tested; it suffered from a persistent runtime error on the standard LJ benchmark. (A sketch of a runtime check against that benchmark follows the script below.)

[spack_gpu@exp-15-57 [email protected]]$ pwd
/cm/shared/apps/spack/0.17.3/gpu/b/etc/spack/sdsc/expanse/0.17.3/gpu/b/specs/[email protected]/[email protected]
[spack_gpu@exp-15-57 [email protected]]$ cat [email protected] 
#!/usr/bin/env bash

#SBATCH --job-name=lammps@20210310
#SBATCH --account=use300
#SBATCH --reservation=root_73
#SBATCH --partition=ind-gpu-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=93G
#SBATCH --gpus=1
#SBATCH --time=24:00:00
#SBATCH --output=%x.o%j.%N

declare -xr LOCAL_TIME="$(date +'%Y%m%dT%H%M%S%z')"
declare -xir UNIX_TIME="$(date +'%s')"

declare -xr LOCAL_SCRATCH_DIR="/scratch/${USER}/job_${SLURM_JOB_ID}"
declare -xr TMPDIR="${LOCAL_SCRATCH_DIR}"

declare -xr SYSTEM_NAME='expanse'

declare -xr SPACK_VERSION='0.17.3'
declare -xr SPACK_INSTANCE_NAME='gpu'
declare -xr SPACK_INSTANCE_VERSION='b'
declare -xr SPACK_INSTANCE_DIR="/cm/shared/apps/spack/${SPACK_VERSION}/${SPACK_INSTANCE_NAME}/${SPACK_INSTANCE_VERSION}"

declare -xr SLURM_JOB_SCRIPT="$(scontrol show job ${SLURM_JOB_ID} | awk -F= '/Command=/{print $2}')"
declare -xr SLURM_JOB_MD5SUM="$(md5sum ${SLURM_JOB_SCRIPT})"

declare -xr SCHEDULER_MODULE='slurm'
declare -xr COMPILER_MODULE='gcc/10.2.0'
declare -xr MPI_MODULE='openmpi/4.1.3'
declare -xr CUDA_MODULE='cuda/11.2.2'

echo "${UNIX_TIME} ${SLURM_JOB_ID} ${SLURM_JOB_MD5SUM} ${SLURM_JOB_DEPENDENCY}" 
echo ""

cat "${SLURM_JOB_SCRIPT}"

module purge
module load "${SCHEDULER_MODULE}"
. "${SPACK_INSTANCE_DIR}/share/spack/setup-env.sh"
module use "${SPACK_ROOT}/share/spack/lmod/linux-rocky8-x86_64/Core"
module load "${COMPILER_MODULE}"
module load "${MPI_MODULE}"
module load "${CUDA_MODULE}"
module list

# A conflict was triggered
#  condition(2667)
#  condition(2883)
#  condition(2884)
#  conflict("lammps",2883,2884)
#  no version satisfies the given constraints
#  root("lammps")
#  variant_condition(2667,"lammps","meam")
#  variant_set("lammps","meam","True")
#  version_satisfies("lammps","20181212:","20210310")
#  version_satisfies("lammps","20210310")

#1 error found in build log:
#     327    -- <<< FFT settings >>>
#     328    -- Primary FFT lib:  FFTW3
#     329    -- Using double precision FFTs
#     330    -- Using non-threaded FFTs
#     331    -- Kokkos FFT: cuFFT
#     332    -- Configuring done
#  >> 333    CMake Error: The following variables are used in this project, but 
#            they are set to NOTFOUND.
#     334    Please set them or make sure they are set and tested correctly in t
#            he CMake files:
#     335    CUDA_CUDA_LIBRARY (ADVANCED)
#     336        linked by target "nvc_get_devices" in directory /tmp/mkandes/sp
#            ack-stage/spack-stage-lammps-20210310-il34zpxttkmdye5sk2shvxlxolpi6
#            tux/spack-src/cmake
#     337        linked by target "gpu" in directory /tmp/mkandes/spack-stage/sp
#            ack-stage-lammps-20210310-il34zpxttkmdye5sk2shvxlxolpi6tux/spack-sr
#            c/cmake
#     338    
#     339    -- Generating done

# FIX: https://github.com/floydhub/dl-docker/pull/48
declare -xr CUDA_CUDA_LIBRARY='/cm/local/apps/cuda/libs/current/lib64'
declare -xr CMAKE_LIBRARY_PATH="${CUDA_CUDA_LIBRARY}"

# >> 6059    /home/mkandes/cm/shared/apps/spack/0.17.3/gpu/opt/spack/linux-rock
#             y8-cascadelake/gcc-10.2.0/kokkos-3.4.01-hkmc634lei4z23r7tvrhaag3ho
#             wgnixn/include/Cuda/Kokkos_Cuda_Parallel.hpp(464): error: calling 
#             a __host__ function("LAMMPS_NS::MinKokkos::force_clear()::[lambda(
#             int) (instance 1)]::operator ()(int) const") from a __device__ fun
#             ction("Kokkos::Impl::ParallelFor< ::LAMMPS_NS::MinKokkos::force_cl
#             ear()   ::[lambda(int) (instance 1)],  ::Kokkos::RangePolicy< ::Ko
#             kkos::Cuda > ,  ::Kokkos::Cuda> ::exec_range<void>  const") is not
#              allowed

#  condition(2731)
#  condition(2995)
#  condition(5719)
#  dependency_condition(2995,"lammps","python")
#  dependency_type(2995,"link")
#  hash("plumed","n2o2udniskgvoaacgn66fbladjkjtcai")
#  imposed_constraint("n2o2udniskgvoaacgn66fbladjkjtcai","hash","python","uasyy5n4yauliglzcgk27zmfa3ltehdy")
#  root("lammps")
#  variant_condition(2731,"lammps","python")
#  variant_condition(5719,"python","optimizations")
#  variant_set("lammps","python","True")
#  variant_set("python","optimizations","False")

# 2 errors found in build log:
#     53    -- Found CURL: /usr/lib64/libcurl.so (found version "7.61.1")
#     54    -- Checking for module 'libzstd>=1.4'
#     55    --   Found libzstd, version 1.4.4
#     56    -- Looking for C++ include cmath
#     57    -- Looking for C++ include cmath - found
#     58    -- Checking external potential C_10_10.mesocnt from https://download
#           .lammps.org/potentials
#  >> 59    CMake Error at Modules/LAMMPSUtils.cmake:101 (file):
#     60      file DOWNLOAD HASH mismatch
#     61    
#     62        for file: [/tmp/mkandes/spack-stage/spack-stage-lammps-20210310-
#           oi3mpq35ufuhx6mnichykaxhdhdrfhf4/spack-build-oi3mpq3/C_10_10.mesocnt
#           ]
#     63          expected hash: [028de73ec828b7830d762702eda571c1]
#     64            actual hash: [d41d8cd98f00b204e9800998ecf8427e]
#     65                 status: [6;"Couldn't resolve host name"]
#     66    
#     67    Call Stack (most recent call first):
#     68      CMakeLists.txt:428 (FetchPotentials)
#     69    
#     70    
#     71    -- Checking external potential TABTP_10_10.mesont from https://downl
#           oad.lammps.org/potentials
#  >> 72    CMake Error at Modules/LAMMPSUtils.cmake:101 (file):
#     73      file DOWNLOAD HASH mismatch
#     74    
#     75        for file: [/tmp/mkandes/spack-stage/spack-stage-lammps-20210310-
#           oi3mpq35ufuhx6mnichykaxhdhdrfhf4/spack-build-oi3mpq3/TABTP_10_10.mes
#           ont]
#     76          expected hash: [744a739da49ad5e78492c1fc9fd9f8c1]
#     77            actual hash: [d41d8cd98f00b204e9800998ecf8427e]
#     78                 status: [6;"Couldn't resolve host name"]

# >> 6059    /home/mkandes/cm/shared/apps/spack/0.17.3/gpu/opt/spack/linux-rock
#             y8-cascadelake/gcc-10.2.0/kokkos-3.4.01-owdzetzbm2fsbcpt3vdiv2lvvl
#             pxqwo7/include/Cuda/Kokkos_Cuda_Parallel.hpp(488): error: calling 
#             a __host__ function("LAMMPS_NS::MinKokkos::force_clear()::[lambda(
#             int) (instance 1)]::operator () const") from a __device__ function
#             ("Kokkos::Impl::ParallelFor< ::LAMMPS_NS::MinKokkos::force_clear()
#                ::[lambda(int) (instance 1)],  ::Kokkos::RangePolicy< ::Kokkos:
#             :Cuda > ,  ::Kokkos::Cuda> ::operator () const") is not allowed
#     6060    
#  >> 6061    /home/mkandes/cm/shared/apps/spack/0.17.3/gpu/opt/spack/linux-rock
#             y8-cascadelake/gcc-10.2.0/kokkos-3.4.01-owdzetzbm2fsbcpt3vdiv2lvvl
#             pxqwo7/include/Cuda/Kokkos_Cuda_Parallel.hpp(488): error: identifi
#             er "LAMMPS_NS::MinKokkos::force_clear()::[lambda(int) (instance 1)
#             ]::operator () const" is undefined in device code

declare -xr SPACK_PACKAGE='lammps@20210310'
declare -xr SPACK_COMPILER='[email protected]'
declare -xr SPACK_VARIANTS='+asphere +body +class2 +colloid +compress +coreshell +cuda cuda_arch=70 +dipole ~exceptions +ffmpeg +granular ~ipo +jpeg +kim ~kokkos +kspace +latte +lib +manybody +mc ~meam +misc +mliap +molecule +mpi +mpiio ~opencl +openmp +opt +peri +png +poems +python +qeq +replica +rigid +shock +snap +spin +srd ~user-adios ~user-atc ~user-awpmd ~user-bocs ~user-cgsdk ~user-colvars ~user-diffraction ~user-dpd +user-drude ~user-eff ~user-fep ~user-h5md ~user-lb ~user-manifold ~user-meamc ~user-mesodpd ~user-mesont ~user-mgpt ~user-misc ~user-mofff ~user-netcdf ~user-omp ~user-phonon ~user-plumed ~user-ptm ~user-qtb ~user-reaction ~user-reaxc ~user-sdpd ~user-smd ~user-smtbq ~user-sph ~user-tally ~user-uef ~user-yaff +voronoi'
declare -xr SPACK_DEPENDENCIES="^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~ilp64 threads=none) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~mpi ~openmp) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER})" #^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ^kokkos-nvcc-wrapper ~mpi) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} +mpi ^[email protected])" 
declare -xr SPACK_SPEC="${SPACK_PACKAGE} % ${SPACK_COMPILER} ${SPACK_VARIANTS} ${SPACK_DEPENDENCIES}"

printenv

spack config get compilers
spack config get config  
spack config get mirrors
spack config get modules
spack config get packages
spack config get repos
spack config get upstreams

time -p spack spec --long --namespaces --types lammps@20210310 % [email protected] +asphere +body +class2 +colloid +compress +coreshell +cuda cuda_arch=70 +dipole ~exceptions +ffmpeg +granular ~ipo +jpeg +kim ~kokkos +kspace +latte +lib +manybody +mc ~meam +misc +mliap +molecule +mpi +mpiio ~opencl +openmp +opt +peri +png +poems +python +qeq +replica +rigid +shock +snap +spin +srd ~user-adios ~user-atc ~user-awpmd ~user-bocs ~user-cgsdk ~user-colvars ~user-diffraction ~user-dpd +user-drude ~user-eff ~user-fep ~user-h5md ~user-lb ~user-manifold ~user-meamc ~user-mesodpd ~user-mesont ~user-mgpt ~user-misc ~user-mofff ~user-netcdf ~user-omp ~user-phonon ~user-plumed ~user-ptm ~user-qtb ~user-reaction ~user-reaxc ~user-sdpd ~user-smd ~user-smtbq ~user-sph ~user-tally ~user-uef ~user-yaff +voronoi "${SPACK_DEPENDENCIES}"
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack concretization failed.'
  exit 1
fi

time -p spack install --jobs "${SLURM_CPUS_PER_TASK}" --fail-fast --yes-to-all lammps@20210310 % [email protected] +asphere +body +class2 +colloid +compress +coreshell +cuda cuda_arch=70 +dipole ~exceptions +ffmpeg +granular ~ipo +jpeg +kim ~kokkos +kspace +latte +lib +manybody +mc ~meam +misc +mliap +molecule +mpi +mpiio ~opencl +openmp +opt +peri +png +poems +python +qeq +replica +rigid +shock +snap +spin +srd ~user-adios ~user-atc ~user-awpmd ~user-bocs ~user-cgsdk ~user-colvars ~user-diffraction ~user-dpd +user-drude ~user-eff ~user-fep ~user-h5md ~user-lb ~user-manifold ~user-meamc ~user-mesodpd ~user-mesont ~user-mgpt ~user-misc ~user-mofff ~user-netcdf ~user-omp ~user-phonon ~user-plumed ~user-ptm ~user-qtb ~user-reaction ~user-reaxc ~user-sdpd ~user-smd ~user-smtbq ~user-sph ~user-tally ~user-uef ~user-yaff +voronoi "${SPACK_DEPENDENCIES}"
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack install failed.'
  exit 1
fi

spack module lmod refresh --delete-tree -y

#sbatch --dependency="afterok:${SLURM_JOB_ID}" ''

sleep 30
[spack_gpu@exp-15-57 [email protected]]$
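
For reference, once a GPU-enabled lammps module is actually deployed in expanse/0.17.3/gpu/b, a minimal runtime sanity check against the standard LJ benchmark might look something like the sketch below. The partition, module names, and input path are assumptions (in.lj comes from the bench/ directory of the LAMMPS source tree and is assumed to be copied into the working directory).

#!/usr/bin/env bash

#SBATCH --job-name=lammps-lj-check
#SBATCH --account=use300
#SBATCH --partition=gpu-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=93G
#SBATCH --gpus=1
#SBATCH --time=00:30:00
#SBATCH --output=%x.o%j.%N

module purge
module load slurm
module load gpu/0.17.3b gcc/10.2.0 openmpi/4.1.3 cuda/11.2.2  # assumed module names
module load lammps                                            # hypothetical module once deployed
module list

# Run the stock LJ melt benchmark on one GPU.
# If the build uses the GPU package (+cuda ~kokkos, as in the script above):
srun --mpi=pmi2 lmp -sf gpu -pk gpu 1 -in in.lj
# If it instead uses KOKKOS (+kokkos), the equivalent would be:
# srun --mpi=pmi2 lmp -k on g 1 -sf kk -in in.lj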


mkandes commented Sep 7, 2023

TSCC2 has a deployed LAMMPS build.

[mkandes@login1 ~]$ module spider lammps

----------------------------------------------------------------------------
  lammps:
----------------------------------------------------------------------------
     Versions:
        lammps/20210310-poxcpz2
        lammps/20210310-pwvznqm

----------------------------------------------------------------------------
  For detailed information about a specific "lammps" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider lammps/20210310-pwvznqm
----------------------------------------------------------------------------

 

[mkandes@login1 ~]$ module spider lammps/20210310-poxcpz2

----------------------------------------------------------------------------
  lammps: lammps/20210310-poxcpz2
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20210310-poxcpz2" module is available to load.

      shared  cpu/0.17.3  gcc/10.2.0-2ml3m2l  mvapich2/2.3.7-txbe2wo
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.


 

[mkandes@login1 ~]$ module spider lammps/20210310-pwvznqm

----------------------------------------------------------------------------
  lammps: lammps/20210310-pwvznqm
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20210310-pwvznqm" module is available to load.

      shared  gpu/0.17.3  gcc/10.2.0-mqbpsxf  intel-mpi/2019.10.317-kgj4a7v
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.

[mkandes@login1 ~]$

Its complete spec is as follows:

[mkandes@login1 ~]$ . /cm/shared/apps/spack/0.17.3/gpu/share/spack/setup-env.sh 
[mkandes@login1 ~]$ spack --version
0.17.3
[mkandes@login1 ~]$ spack find -lvdN lammps
==> 1 installed package
-- linux-rocky9-broadwell / [email protected] --------------------------
pwvznqm sdsc.lammps@20210310+asphere+body+class2+colloid+compress+coreshell+cuda~cuda_mps+dipole~exceptions+ffmpeg+granular~ipo+jpeg+kim+kokkos+kspace~latte+lib+manybody+mc~meam+misc+mliap+molecule+mpi+mpiio~opencl+openmp+opt+peri+png+poems+python+qeq+replica+rigid+shock+snap+spin+srd~user-adios+user-atc+user-awpmd+user-bocs+user-cgsdk+user-colvars+user-diffraction+user-dpd+user-drude+user-eff+user-fep~user-h5md+user-lb+user-manifold+user-meamc+user-mesodpd+user-mesont+user-mgpt+user-misc+user-mofff~user-netcdf~user-omp+user-phonon+user-plumed+user-ptm+user-qtb+user-reaction+user-reaxc+user-sdpd+user-smd+user-smtbq+user-sph+user-tally+user-uef+user-yaff+voronoi build_type=RelWithDebInfo cuda_arch=60,75,80,86
bgeuj6j     [email protected]~dev
egosf32         [email protected]~python
r7u3li4             [email protected] libs=shared,static
bnvpjjz             [email protected]~pic libs=shared,static
uezvqlh             [email protected]+optimize+pic+shared
7bq2mgj     [email protected]~gssapi~ldap~libidn2~librtmp~libssh~libssh2~nghttp2 tls=openssl
pu3y3pq         [email protected]~docs certs=system
gs54bbj     [email protected]~ipo build_type=RelWithDebInfo
rmrqnwu     [email protected]~X~avresample+bzlib~drawtext+gpl~libaom~libmp3lame~libopenjpeg~libopus~libsnappy~libspeex~libssh~libvorbis~libvpx~libwebp~libzmq~lzma~nonfree~openssl~sdl2+shared+version3
ucsdh7i         [email protected]~python
dh4hxop         [email protected]~debug~pic+shared
x5xjlcl         [email protected]
ej2lqo7     [email protected]~mpi~openmp~pfft_patches precision=double,float
kgj4a7v     [email protected]
ag2yoae     [email protected]~ipo build_type=RelWithDebInfo
6i4zxz3     [email protected]~aggressive_vectorization~compiler_warnings+cuda+cuda_constexpr+cuda_lambda+cuda_ldg_intrinsic~cuda_relocatable_device_code~cuda_uvm~debug~debug_bounds_check~debug_dualview_modify_check~deprecated_code~examples~explicit_instantiation~hpx~hpx_async_dispatch~hwloc~ipo~memkind~numactl~openmp+pic+profiling~profiling_load_print~pthread~qthread~rocm+serial+shared~sycl~tests~tuning+wrapper amdgpu_target=none build_type=RelWithDebInfo cuda_arch=60 std=14
3y6tgvg         [email protected]+mpi
7rlpvrn     [email protected]
7tywqyc     [email protected]
xmdkrna     [email protected]~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none
atk3am5     [email protected]+gsl+mpi+shared arrayfire=none optional_modules=all
rl4qjqe         [email protected]~external-cblas
inx66sv     [email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis+optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4
ppow2vp         [email protected]+libbsd
aur3lqv             [email protected]
vmorxax                 [email protected]
6ukzrku         [email protected]
wndvzcw             [email protected]
ovtvq7m                 [email protected]~symlinks+termlib abi=none
r6ujx4y         [email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz
lo6rmhe             [email protected]
rxt5der         [email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0
v36qrof         [email protected]+column_metadata+fts~functions~rtree
nj7akxi         [email protected]
76sosec     [email protected]+pic

[mkandes@login1 ~]$


mkandes commented Sep 7, 2023

It is difficult to tell at a glance what the custom changes are in the tscc custom spack repo for each of these packages. We might as well compare them against the builtin repo to better understand the changes. (A sketch of a bulk comparison follows the diffs below.)

[mkandes@login1 lammps]$ diff package.py ../../../../builtin/packages/lammps/package.py 
8d7
< import os
25d23
< #   version('20210311', sha256='d1b4b3860f4bd6aed53e09956cfcf3ce3c3866b83e681898ee945a02bebe1fd9',url='file://{0}/lammps-11Mar2021.tar.gz'.format(os.getcwd()))
131d128
<     depends_on('curl')
247d243
< 
253,268d248
< 
<     @run_before('cmake')
< 
<     def add_potential_files(self):
<         copy(join_path(os.path.dirname(self.module.__file__),'C_10_10.mesocnt'),'potentials')
<         copy(join_path(os.path.dirname(self.module.__file__),'TABTP_10_10.mesont'),'potentials')
< 
<     @run_before('build')
< 
<     def remove_libs(self):
<         with working_dir(join_path('..','spack-build-'+self.spec.dag_hash(7))):
<              filter_file('OpenMP_pthread_LIBRARY:FILEPATH=/lib64/libpthread.a','OpenMP_pthread_LIBRARY:FILEPATH=/usr/lib64/libpthread.so.0','CMakeCache.txt')
<              filter_file('MPI_pthread_LIBRARY:FILEPATH=/lib64/libpthread.a','MPI_pthread_LIBRARY:FILEPATH=/usr/lib64/libpthread.so.0','CMakeCache.txt')
<              filter_file('MPI_dl_LIBRARY:FILEPATH=/lib64/libdl.a','MPI_dl_LIBRARY:FILEPATH=/usr/lib64/libdl.so.2','CMakeCache.txt')
<              filter_file('CUDA_rt_LIBRARY:FILEPATH=/usr/lib64/librt.a','CUDA_rt_LIBRARY:FILEPATH=/usr/lib64/librt.so.1','CMakeCache.txt')
<              filter_file('MPI_rt_LIBRARY:FILEPATH=/lib64/librt.a','MPI_rt_LIBRARY:FILEPATH=/usr/lib64/librt.so.1','CMakeCache.txt')
[mkandes@login1 lammps]$
[mkandes@login1 lammps]$ diff 660.patch ../../../../builtin/packages/lammps/660.patch
[mkandes@login1 lammps]$ diff C_10_10.mesocnt ../../../../builtin/packages/lammps/C_10_10.mesocnt
diff: ../../../../builtin/packages/lammps/C_10_10.mesocnt: No such file or directory
[mkandes@login1 lammps]$ diff TABTP_10_10.mesont ../../../../builtin/packages/lammps/TABTP_10_10.mesont
diff: ../../../../builtin/packages/lammps/TABTP_10_10.mesont: No such file or directory
[mkandes@login1 lammps]$ diff lib.patch ../../../../builtin/packages/lammps/lib.patch
[mkandes@login1 lammps]$ diff Makefile.inc ../../../../builtin/packages/lammps/Makefile.inc
[mkandes@login1 lammps]$
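
As a quicker follow-up to the per-file diffs above, every customization in the tscc repo could be surveyed in one pass by looping over its package directories and diffing each against the builtin counterpart. A minimal sketch, using the repo paths from the session above:

#!/usr/bin/env bash

# Compare each package in the custom tscc repo against the spack builtin copy.
TSCC_REPO='/cm/shared/apps/spack/0.17.3/gpu/var/spack/repos/sdsc/tscc/packages'
BUILTIN_REPO='/cm/shared/apps/spack/0.17.3/gpu/var/spack/repos/builtin/packages'

for pkg in "${TSCC_REPO}"/*/; do
  name="$(basename "${pkg}")"
  echo "=== ${name} ==="
  diff -ru --exclude='__pycache__' "${BUILTIN_REPO}/${name}" "${pkg}"
done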


mkandes commented Sep 8, 2023

[mkandes@login1 kokkos]$ pwd
/cm/shared/apps/spack/0.17.3/gpu/var/spack/repos/sdsc/tscc/packages/kokkos
[mkandes@login1 kokkos]$ diff package.py ../../../../builtin/packages/kokkos/package.py 
191d190
<     
211,212d209
<         with working_dir('cmake'): 
<             filter_file('SET\(LIBDL_DEFAULT On\)','SET(LIBDL_DEFAULT Off)','kokkos_tpls.cmake')
[mkandes@login1 kokkos]$


mkandes commented Sep 8, 2023

I find no difference between the builtin and tscc openblas packages.

[mkandes@login1 kokkos]$ cd ../openblas/
[mkandes@login1 openblas]$ ls
0001-use-usr-bin-env-perl.patch  openblas_icc_fortran2.patch
lapack-0.3.9-xerbl.patch         openblas_icc_fortran.patch
make.patch                       openblas_icc_openmp.patch
openblas0.2.19.diff              openblas_icc.patch
openblas-0.3.2-cmake.patch       package.py
openblas-0.3.8-darwin.patch      power8.patch
openblas_appleclang11.patch      __pycache__
openblas_fujitsu2.patch          test_cblas_dgemm.c
openblas_fujitsu.patch           test_cblas_dgemm.output
openblas_fujitsu_v0.3.11.patch
[mkandes@login1 openblas]$ ls ../../../../builtin/packages/openblas/
0001-use-usr-bin-env-perl.patch  openblas_icc_fortran2.patch
lapack-0.3.9-xerbl.patch         openblas_icc_fortran.patch
make.patch                       openblas_icc_openmp.patch
openblas0.2.19.diff              openblas_icc.patch
openblas-0.3.2-cmake.patch       package.py
openblas-0.3.8-darwin.patch      power8.patch
openblas_appleclang11.patch      __pycache__
openblas_fujitsu2.patch          test_cblas_dgemm.c
openblas_fujitsu.patch           test_cblas_dgemm.output
openblas_fujitsu_v0.3.11.patch
[mkandes@login1 openblas]$ diff 0001-use-usr-bin-env-perl.patch ../../../../builtin/packages/openblas/0001-use-usr-bin-env-perl.patch
[mkandes@login1 openblas]$ diff make.patch ../../../../builtin/packages/openblas/make.patch
[mkandes@login1 openblas]$ diff openblas0.2.19.diff ../../../../builtin/packages/openblas/openblas0.2.19.diff
[mkandes@login1 openblas]$ diff openblas-0.3.2-cmake.patch ../../../../builtin/packages/openblas/openblas-0.3.2-cmake.patch
[mkandes@login1 openblas]$ diff openblas-0.3.8-darwin.patch ../../../../builtin/packages/openblas/openblas-0.3.8-darwin.patch
[mkandes@login1 openblas]$ diff openblas-0.3.8-darwin.patch ../../../../builtin/packages/openblas/openblas_diff 
[mkandes@login1 openblas]$ diff openblas_appleclang11.patch ../../../../builtin/packages/openblas/openblas_appleclang11.patch
[mkandes@login1 openblas]$ diff openblas_fujitsu2.patch ../../../../builtin/packages/openblas/openblas_fujitsu2.patch
[mkandes@login1 openblas]$ diff openblas_fujitsu.patch ../../../../builtin/packages/openblas/openblas_fujitsu.patch
[mkandes@login1 openblas]$ diff openblas_fujitsu_v0.3.11.patch ../../../../builtin/packages/openblas/openblas_fujitsu_v0.3.11.patch
[mkandes@login1 openblas]$ diff openblas_icc_fortran2.patch ../../../../builtin/packages/openblas/openblas_icc_fortran2.patch
[mkandes@login1 openblas]$ diff openblas_icc_fortran.patch ../../../../builtin/packages/openblas/openblas_icc_fortran.patch
[mkandes@login1 openblas]$ diff openblas_icc_openmp.patch ../../../../builtin/packages/openblas/openblas_icc_openmp.patch
[mkandes@login1 openblas]$ diff openblas_icc.patch ../../../../builtin/packages/openblas/openblas_icc.patch
[mkandes@login1 openblas]$ diff package.py ../../../../builtin/packages/openblas/package.py
[mkandes@login1 openblas]$ diff power8.patch ../../../../builtin/packages/openblas/power8.patch
[mkandes@login1 openblas]$ diff test_cblas_dgemm.c ../../../../builtin/packages/openblas/test_cblas_dgemm.c
[mkandes@login1 openblas]$ diff test_cblas_dgemm.output ../../../../builtin/packages/openblas/test_cblas_dgemm.output
[mkandes@login1 openblas]$ diff lapack-0.3.9-xerbl.patch ../../../../builtin/packages/openblas/lapack-0.3.9-xerbl.patch
[mkandes@login1 openblas]$


mkandes commented Sep 8, 2023

[mkandes@login1 plumed]$ diff package.py ../../../../builtin/packages/plumed/package.py 
136,142d135
<     @run_before('build')
<     def filter_out(self):
<         filter_file('LIBS=-ldl -mt_mpi','LIBS=','Makefile.conf')
<         filter_file('-ldl',' ','Makefile.conf')
<         filter_file('-ldl -mt_mpi',' ','config.status')
< 
< 
[mkandes@login1 plumed]$


mkandes commented Sep 8, 2023

[mkandes@login1 gsl]$ pwd
/cm/shared/apps/spack/0.17.3/gpu/var/spack/repos/sdsc/tscc/packages/gsl
[mkandes@login1 gsl]$ ls
gsl-2.3-cblas.patch  gsl-2.6-cblas.patch  package.py  __pycache__  tests
[mkandes@login1 gsl]$ ls ../../../../builtin/packages/gsl
gsl-2.3-cblas.patch  gsl-2.6-cblas.patch  package.py  __pycache__
[mkandes@login1 gsl]$ diff gsl-2.3-cblas.patch ../../../../builtin/packages/gsl/gsl-2.3-cblas.patch
[mkandes@login1 gsl]$ diff gsl-2.6-cblas.patch ../../../../builtin/packages/gsl/gsl-2.6-cblas.patch
[mkandes@login1 gsl]$ diff package.py ../../../../builtin/packages/gsl/package.py
8d7
< import os
45,48d43
<     def setup_build_environment(self, env):
<         if '%intel' in self.spec:
<             env.set('CFLAGS','-fp-model=strict')
< 
64,82d58
< 
< #   @on_package_attributes(run_tests=True)
< #   @run_after('install')
< 
< #   def test(self):
< #       make('check')
< #       with open(join_path(os.path.dirname(self.module.__file__),'tests')) as f:
< #           tests = f.read().splitlines()
< #       num_tests=len(tests)
< #       good_tests = 0
< #       for test in tests:
< #           output=Executable(join_path(test,'test'))(output=str,error=str)
< #           if 'Completed' in output:
< #                good_tests += 1
< #       with open('/tmp/gsl.output','w') as fp:
< #           if good_tests == num_tests:
< #               print('PASSED',file=fp)
< #           else:
< #               print('FAILED',file=fp)
[mkandes@login1 gsl]$


mkandes commented Sep 8, 2023

Attempting to change lammps +cuda in expanse/0.17.3/gpu/b to use intel-mpi, as is currently in use on TSCC2. New versions of plumed and kokkos also need to be built and tested. (A hedged sketch of the intel-mpi spec change follows the scripts below.)

[spack_gpu@exp-15-57 [email protected]]$ ls -lahtr
total 814K
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.7K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.6K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack   38K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.8K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.1K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:35 [email protected]
-rw-r--r-- 1 spack_gpu spack   25K Apr 23 12:36 [email protected]
-rw-r--r-- 1 spack_gpu spack   25K Apr 23 12:37 [email protected]
-rw-r--r-- 1 spack_gpu spack   24K Apr 23 12:38 [email protected]
-rw-r--r-- 1 spack_gpu spack   26K Apr 23 12:40 [email protected]
-rw-r--r-- 1 spack_gpu spack   24K Apr 23 12:44 [email protected]
-rw-r--r-- 1 spack_gpu spack   31K Apr 23 12:47 [email protected]
-rw-r--r-- 1 spack_gpu spack   27K Apr 23 12:48 [email protected]
-rw-r--r-- 1 spack_gpu spack   28K Apr 23 12:53 [email protected]
-rw-r--r-- 1 spack_gpu spack   33K Apr 23 12:54 [email protected]
-rw-r--r-- 1 spack_gpu spack   29K Apr 23 12:56 [email protected]
-rw-r--r-- 1 spack_gpu spack   33K Apr 23 12:57 [email protected]
-rw-r--r-- 1 spack_gpu spack   29K Apr 23 12:59 [email protected]
-rw-r--r-- 1 spack_gpu spack   34K Apr 23 13:00 [email protected]
-rw-r--r-- 1 spack_gpu spack   40K Apr 23 13:03 [email protected]
-rw-r--r-- 1 spack_gpu spack   41K Apr 23 13:05 [email protected]
-rw-r--r-- 1 spack_gpu spack   26K Apr 23 13:06 [email protected]
-rw-r--r-- 1 spack_gpu spack   38K Apr 23 13:11 [email protected]
-rw-r--r-- 1 spack_gpu spack   31K Apr 23 13:13 [email protected]
-rw-r--r-- 1 spack_gpu spack   31K Apr 23 13:17 [email protected]
-rw-r--r-- 1 spack_gpu spack   37K Apr 23 13:19 [email protected]
-rw-r--r-- 1 spack_gpu spack   35K Apr 23 13:20 [email protected]
-rw-r--r-- 1 spack_gpu spack   56K Apr 23 13:21 [email protected]
-rw-r--r-- 1 spack_gpu use300 4.5K Aug 27 13:15 [email protected]
drwxr-sr-x 5 spack_gpu spack   176 Sep  7 12:33 ..
-rw-r--r-- 1 spack_gpu spack  9.1K Sep  7 20:03 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.4K Sep  7 20:08 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.6K Sep  7 20:08 [email protected]
drwxr-sr-x 2 spack_gpu spack    49 Sep  7 20:08 .
[spack_gpu@exp-15-57 [email protected]]$ cat [email protected] 
#!/usr/bin/env bash

#SBATCH [email protected]
#SBATCH --account=use300
#SBATCH --reservation=root_73
#SBATCH --partition=ind-gpu-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=93G
#SBATCH --gpus=1
#SBATCH --time=00:30:00
#SBATCH --output=%x.o%j.%N

declare -xr LOCAL_TIME="$(date +'%Y%m%dT%H%M%S%z')"
declare -xir UNIX_TIME="$(date +'%s')"

declare -xr LOCAL_SCRATCH_DIR="/scratch/${USER}/job_${SLURM_JOB_ID}"
declare -xr TMPDIR="${LOCAL_SCRATCH_DIR}"

declare -xr SYSTEM_NAME='expanse'

declare -xr SPACK_VERSION='0.17.3'
declare -xr SPACK_INSTANCE_NAME='gpu'
declare -xr SPACK_INSTANCE_VERSION='b'
declare -xr SPACK_INSTANCE_DIR="/cm/shared/apps/spack/${SPACK_VERSION}/${SPACK_INSTANCE_NAME}/${SPACK_INSTANCE_VERSION}"

declare -xr SLURM_JOB_SCRIPT="$(scontrol show job ${SLURM_JOB_ID} | awk -F= '/Command=/{print $2}')"
declare -xr SLURM_JOB_MD5SUM="$(md5sum ${SLURM_JOB_SCRIPT})"

declare -xr SCHEDULER_MODULE='slurm'

echo "${UNIX_TIME} ${SLURM_JOB_ID} ${SLURM_JOB_MD5SUM} ${SLURM_JOB_DEPENDENCY}" 
echo ""

cat "${SLURM_JOB_SCRIPT}"

module purge
module load "${SCHEDULER_MODULE}"
module list
. "${SPACK_INSTANCE_DIR}/share/spack/setup-env.sh"

declare -xr SPACK_PACKAGE='[email protected]'
declare -xr SPACK_COMPILER='[email protected]'
declare -xr SPACK_VARIANTS='+gsl +mpi +shared'
declare -xr SPACK_DEPENDENCIES="^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~ilp64 threads=none) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER})" #^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} +cuda cuda_arch=70,80)"
declare -xr SPACK_SPEC="${SPACK_PACKAGE} % ${SPACK_COMPILER} ${SPACK_VARIANTS} ${SPACK_DEPENDENCIES}"

printenv

spack config get compilers
spack config get config  
spack config get mirrors
spack config get modules
spack config get packages
spack config get repos
spack config get upstreams

time -p spack spec --long --namespaces --types --reuse $(echo "${SPACK_SPEC}")
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack concretization failed.'
fi

time -p spack install -v --jobs "${SLURM_CPUS_PER_TASK}" --fail-fast --yes-to-all --reuse $(echo "${SPACK_SPEC}")
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack install failed.'
  exit 1
fi

spack module lmod refresh --delete-tree -y

#sbatch --dependency="afterok:${SLURM_JOB_ID}" '[email protected]'

sleep 30
[spack_gpu@exp-15-57 [email protected]]$ cat [email protected] 
#!/usr/bin/env bash

#SBATCH [email protected]
#SBATCH --account=use300
#SBATCH --reservation=root_73
#SBATCH --partition=ind-gpu-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=93G
#SBATCH --gpus=1
#SBATCH --time=00:30:00
#SBATCH --output=%x.o%j.%N

declare -xr LOCAL_TIME="$(date +'%Y%m%dT%H%M%S%z')"
declare -xir UNIX_TIME="$(date +'%s')"

declare -xr LOCAL_SCRATCH_DIR="/scratch/${USER}/job_${SLURM_JOB_ID}"
declare -xr TMPDIR="${LOCAL_SCRATCH_DIR}"

declare -xr SYSTEM_NAME='expanse'

declare -xr SPACK_VERSION='0.17.3'
declare -xr SPACK_INSTANCE_NAME='gpu'
declare -xr SPACK_INSTANCE_VERSION='b'
declare -xr SPACK_INSTANCE_DIR="/cm/shared/apps/spack/${SPACK_VERSION}/${SPACK_INSTANCE_NAME}/${SPACK_INSTANCE_VERSION}"

declare -xr SLURM_JOB_SCRIPT="$(scontrol show job ${SLURM_JOB_ID} | awk -F= '/Command=/{print $2}')"
declare -xr SLURM_JOB_MD5SUM="$(md5sum ${SLURM_JOB_SCRIPT})"

declare -xr SCHEDULER_MODULE='slurm'

echo "${UNIX_TIME} ${SLURM_JOB_ID} ${SLURM_JOB_MD5SUM} ${SLURM_JOB_DEPENDENCY}" 
echo ""

cat "${SLURM_JOB_SCRIPT}"

module purge
module load "${SCHEDULER_MODULE}"
module list
. "${SPACK_INSTANCE_DIR}/share/spack/setup-env.sh"

# +cuda_relocatable_device_code not working; would you need cuda +dev for static libraries to be installed? or is this in driver packages?
# >> 12    CMake Error at cmake/kokkos_enable_options.cmake:124 (MESSAGE):
#     13      Relocatable device code requires static libraries.

# multiple cuda_arch not working together? does KOKKOS not support multiple cuda_arch in same build?
#>> 13    CMake Error at cmake/kokkos_arch.cmake:383 (MESSAGE):
#     14      Multiple GPU architectures given! Already have VOLTA70, but trying
#            to add
#     15      AMPERE80.  If you are re-running CMake, try clearing the cache and
#            running
#     16      again

declare -xr SPACK_PACKAGE='[email protected]'
declare -xr SPACK_COMPILER='[email protected]'
declare -xr SPACK_VARIANTS='~aggressive_vectorization ~compiler_warnings +cuda cuda_arch=70 +cuda_constexpr +cuda_lambda +cuda_ldg_intrinsic ~cuda_relocatable_device_code ~cuda_uvm ~debug ~debug_bounds_check ~debug_dualview_modify_check ~deprecated_code ~examples ~explicit_instantiation ~hpx ~hpx_async_dispatch ~hwloc ~ipo ~memkind ~numactl ~openmp +pic +profiling ~profiling_load_print ~pthread ~qthread ~rocm +serial +shared ~sycl ~tests ~tuning +wrapper'
declare -xr SPACK_DEPENDENCIES="^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER})"
declare -xr SPACK_SPEC="${SPACK_PACKAGE} % ${SPACK_COMPILER} ${SPACK_VARIANTS} ${SPACK_DEPENDENCIES}"

printenv

spack config get compilers  
spack config get config  
spack config get mirrors
spack config get modules
spack config get packages
spack config get repos
spack config get upstreams

time -p spack spec --long --namespaces --types --reuse $(echo "${SPACK_SPEC}")
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack concretization failed.'
fi

time -p spack install -v --jobs "${SLURM_CPUS_PER_TASK}" --fail-fast --yes-to-all --reuse $(echo "${SPACK_SPEC}")
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack install failed.'
  exit 1
fi

spack module lmod refresh --delete-tree -y

#sbatch --dependency="afterok:${SLURM_JOB_ID}" '[email protected]'

sleep 30
[spack_gpu@exp-15-57 [email protected]]$ cat [email protected] 
#!/usr/bin/env bash

#SBATCH --job-name=lammps@20210310
#SBATCH --account=use300
#SBATCH --reservation=root_73
#SBATCH --partition=ind-gpu-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=93G
#SBATCH --gpus=1
#SBATCH --time=24:00:00
#SBATCH --output=%x.o%j.%N

declare -xr LOCAL_TIME="$(date +'%Y%m%dT%H%M%S%z')"
declare -xir UNIX_TIME="$(date +'%s')"

declare -xr LOCAL_SCRATCH_DIR="/scratch/${USER}/job_${SLURM_JOB_ID}"
declare -xr TMPDIR="${LOCAL_SCRATCH_DIR}"

declare -xr SYSTEM_NAME='expanse'

declare -xr SPACK_VERSION='0.17.3'
declare -xr SPACK_INSTANCE_NAME='gpu'
declare -xr SPACK_INSTANCE_VERSION='b'
declare -xr SPACK_INSTANCE_DIR="/cm/shared/apps/spack/${SPACK_VERSION}/${SPACK_INSTANCE_NAME}/${SPACK_INSTANCE_VERSION}"

declare -xr SLURM_JOB_SCRIPT="$(scontrol show job ${SLURM_JOB_ID} | awk -F= '/Command=/{print $2}')"
declare -xr SLURM_JOB_MD5SUM="$(md5sum ${SLURM_JOB_SCRIPT})"

declare -xr SCHEDULER_MODULE='slurm'
declare -xr COMPILER_MODULE='gcc/10.2.0'
declare -xr MPI_MODULE='openmpi/4.1.3'
declare -xr CUDA_MODULE='cuda/11.2.2'

echo "${UNIX_TIME} ${SLURM_JOB_ID} ${SLURM_JOB_MD5SUM} ${SLURM_JOB_DEPENDENCY}" 
echo ""

cat "${SLURM_JOB_SCRIPT}"

module purge
module load "${SCHEDULER_MODULE}"
. "${SPACK_INSTANCE_DIR}/share/spack/setup-env.sh"
module use "${SPACK_ROOT}/share/spack/lmod/linux-rocky8-x86_64/Core"
module load "${COMPILER_MODULE}"
module load "${MPI_MODULE}"
module load "${CUDA_MODULE}"
module list

# A conflict was triggered
#  condition(2667)
#  condition(2883)
#  condition(2884)
#  conflict("lammps",2883,2884)
#  no version satisfies the given constraints
#  root("lammps")
#  variant_condition(2667,"lammps","meam")
#  variant_set("lammps","meam","True")
#  version_satisfies("lammps","20181212:","20210310")
#  version_satisfies("lammps","20210310")

#1 error found in build log:
#     327    -- <<< FFT settings >>>
#     328    -- Primary FFT lib:  FFTW3
#     329    -- Using double precision FFTs
#     330    -- Using non-threaded FFTs
#     331    -- Kokkos FFT: cuFFT
#     332    -- Configuring done
#  >> 333    CMake Error: The following variables are used in this project, but 
#            they are set to NOTFOUND.
#     334    Please set them or make sure they are set and tested correctly in t
#            he CMake files:
#     335    CUDA_CUDA_LIBRARY (ADVANCED)
#     336        linked by target "nvc_get_devices" in directory /tmp/mkandes/sp
#            ack-stage/spack-stage-lammps-20210310-il34zpxttkmdye5sk2shvxlxolpi6
#            tux/spack-src/cmake
#     337        linked by target "gpu" in directory /tmp/mkandes/spack-stage/sp
#            ack-stage-lammps-20210310-il34zpxttkmdye5sk2shvxlxolpi6tux/spack-sr
#            c/cmake
#     338    
#     339    -- Generating done

# FIX: https://github.com/floydhub/dl-docker/pull/48
declare -xr CUDA_CUDA_LIBRARY='/cm/local/apps/cuda/libs/current/lib64'
declare -xr CMAKE_LIBRARY_PATH="${CUDA_CUDA_LIBRARY}"

# >> 6059    /home/mkandes/cm/shared/apps/spack/0.17.3/gpu/opt/spack/linux-rock
#             y8-cascadelake/gcc-10.2.0/kokkos-3.4.01-hkmc634lei4z23r7tvrhaag3ho
#             wgnixn/include/Cuda/Kokkos_Cuda_Parallel.hpp(464): error: calling 
#             a __host__ function("LAMMPS_NS::MinKokkos::force_clear()::[lambda(
#             int) (instance 1)]::operator ()(int) const") from a __device__ fun
#             ction("Kokkos::Impl::ParallelFor< ::LAMMPS_NS::MinKokkos::force_cl
#             ear()   ::[lambda(int) (instance 1)],  ::Kokkos::RangePolicy< ::Ko
#             kkos::Cuda > ,  ::Kokkos::Cuda> ::exec_range<void>  const") is not
#              allowed

#  condition(2731)
#  condition(2995)
#  condition(5719)
#  dependency_condition(2995,"lammps","python")
#  dependency_type(2995,"link")
#  hash("plumed","n2o2udniskgvoaacgn66fbladjkjtcai")
#  imposed_constraint("n2o2udniskgvoaacgn66fbladjkjtcai","hash","python","uasyy5n4yauliglzcgk27zmfa3ltehdy")
#  root("lammps")
#  variant_condition(2731,"lammps","python")
#  variant_condition(5719,"python","optimizations")
#  variant_set("lammps","python","True")
#  variant_set("python","optimizations","False")

# 2 errors found in build log:
#     53    -- Found CURL: /usr/lib64/libcurl.so (found version "7.61.1")
#     54    -- Checking for module 'libzstd>=1.4'
#     55    --   Found libzstd, version 1.4.4
#     56    -- Looking for C++ include cmath
#     57    -- Looking for C++ include cmath - found
#     58    -- Checking external potential C_10_10.mesocnt from https://download
#           .lammps.org/potentials
#  >> 59    CMake Error at Modules/LAMMPSUtils.cmake:101 (file):
#     60      file DOWNLOAD HASH mismatch
#     61    
#     62        for file: [/tmp/mkandes/spack-stage/spack-stage-lammps-20210310-
#           oi3mpq35ufuhx6mnichykaxhdhdrfhf4/spack-build-oi3mpq3/C_10_10.mesocnt
#           ]
#     63          expected hash: [028de73ec828b7830d762702eda571c1]
#     64            actual hash: [d41d8cd98f00b204e9800998ecf8427e]
#     65                 status: [6;"Couldn't resolve host name"]
#     66    
#     67    Call Stack (most recent call first):
#     68      CMakeLists.txt:428 (FetchPotentials)
#     69    
#     70    
#     71    -- Checking external potential TABTP_10_10.mesont from https://downl
#           oad.lammps.org/potentials
#  >> 72    CMake Error at Modules/LAMMPSUtils.cmake:101 (file):
#     73      file DOWNLOAD HASH mismatch
#     74    
#     75        for file: [/tmp/mkandes/spack-stage/spack-stage-lammps-20210310-
#           oi3mpq35ufuhx6mnichykaxhdhdrfhf4/spack-build-oi3mpq3/TABTP_10_10.mes
#           ont]
#     76          expected hash: [744a739da49ad5e78492c1fc9fd9f8c1]
#     77            actual hash: [d41d8cd98f00b204e9800998ecf8427e]
#     78                 status: [6;"Couldn't resolve host name"]

# >> 6059    /home/mkandes/cm/shared/apps/spack/0.17.3/gpu/opt/spack/linux-rock
#             y8-cascadelake/gcc-10.2.0/kokkos-3.4.01-owdzetzbm2fsbcpt3vdiv2lvvl
#             pxqwo7/include/Cuda/Kokkos_Cuda_Parallel.hpp(488): error: calling 
#             a __host__ function("LAMMPS_NS::MinKokkos::force_clear()::[lambda(
#             int) (instance 1)]::operator () const") from a __device__ function
#             ("Kokkos::Impl::ParallelFor< ::LAMMPS_NS::MinKokkos::force_clear()
#                ::[lambda(int) (instance 1)],  ::Kokkos::RangePolicy< ::Kokkos:
#             :Cuda > ,  ::Kokkos::Cuda> ::operator () const") is not allowed
#     6060    
#  >> 6061    /home/mkandes/cm/shared/apps/spack/0.17.3/gpu/opt/spack/linux-rock
#             y8-cascadelake/gcc-10.2.0/kokkos-3.4.01-owdzetzbm2fsbcpt3vdiv2lvvl
#             pxqwo7/include/Cuda/Kokkos_Cuda_Parallel.hpp(488): error: identifi
#             er "LAMMPS_NS::MinKokkos::force_clear()::[lambda(int) (instance 1)
#             ]::operator () const" is undefined in device code

declare -xr SPACK_PACKAGE='lammps@20210310'
declare -xr SPACK_COMPILER='[email protected]'
declare -xr SPACK_VARIANTS='+asphere +body +class2 +colloid +compress +coreshell +cuda cuda_arch=70,80 +dipole ~exceptions +ffmpeg +granular ~ipo +jpeg +kim +kokkos +kspace ~latte +lib +manybody +mc ~meam +misc +mliap +molecule +mpi +mpiio ~opencl +openmp +opt +peri +png +poems +python +qeq +replica +rigid +shock +snap +spin +srd ~user-adios +user-atc +user-awpmd +user-bocs +user-cgsdk +user-colvars +user-diffraction +user-dpd +user-drude +user-eff +user-fep ~user-h5md +user-lb +user-manifold +user-meamc +user-mesodpd +user-mesont +user-mgpt +user-misc +user-mofff ~user-netcdf ~user-omp +user-phonon +user-plumed +user-ptm +user-qtb +user-reaction +user-reaxc +user-sdpd +user-smd +user-smtbq +user-sph +user-tally +user-uef +user-yaff +voronoi'
declare -xr SPACK_DEPENDENCIES="^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~ilp64 threads=none) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~mpi ~openmp) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ^kokkos-nvcc-wrapper +mpi) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} +mpi ^[email protected])" 
declare -xr SPACK_SPEC="${SPACK_PACKAGE} % ${SPACK_COMPILER} ${SPACK_VARIANTS} ${SPACK_DEPENDENCIES}"

printenv

spack config get compilers
spack config get config  
spack config get mirrors
spack config get modules
spack config get packages
spack config get repos
spack config get upstreams

time -p spack spec --long --namespaces --types --reuse $(echo "${SPACK_SPEC}")
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack concretization failed.'
  exit 1
fi

time -p spack install -v --jobs "${SLURM_CPUS_PER_TASK}" --fail-fast --yes-to-all --reuse $(echo "${SPACK_SPEC}")
if [[ "${?}" -ne 0 ]]; then
  echo 'ERROR: spack install failed.'
  exit 1
fi

spack module lmod refresh --delete-tree -y

#sbatch --dependency="afterok:${SLURM_JOB_ID}" ''

sleep 30
[spack_gpu@exp-15-57 [email protected]]$ pwd
/cm/shared/apps/spack/0.17.3/gpu/b/etc/spack/sdsc/expanse/0.17.3/gpu/b/specs/[email protected]/[email protected]
[spack_gpu@exp-15-57 [email protected]]$
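
A note on the hash-mismatch errors recorded at the top of this spec script: the reported actual hash, d41d8cd98f00b204e9800998ecf8427e, is just the MD5 checksum of empty input, which means the potential files were never downloaded at all; the compute node simply could not reach download.lammps.org ("Couldn't resolve host name"). This is easy to confirm from any shell:

$ echo -n '' | md5sum
d41d8cd98f00b204e9800998ecf8427e  -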

@mkandes (Member) commented Sep 8, 2023

Starting the build process with plumed.

[spack_gpu@exp-15-57 [email protected]]$ pwd
/cm/shared/apps/spack/0.17.3/gpu/b/etc/spack/sdsc/expanse/0.17.3/gpu/b/specs/[email protected]/[email protected]
[spack_gpu@exp-15-57 [email protected]]$ sbatch [email protected] 
Submitted batch job 25099884
[spack_gpu@exp-15-57 [email protected]]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          25099884 ind-gpu-s plumed@2 spack_gp  R       0:02      1 exp-15-57
          25099169 ind-gpu-s     bash spack_gp  R      55:50      1 exp-15-57
[spack_gpu@exp-15-57 [email protected]]$ 

@mkandes (Member) commented Sep 8, 2023

The plumed build appears to have been successful.

[spack_gpu@exp-15-57 [email protected]]$ tail -n 30 [email protected] 
/cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/libplumed.so
/cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/libplumedKernel.so
A vim plugin can be found here: /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/plumed/vim/
Copy it to /home/spack_gpu/.vim/ directory
Alternatively:
- Set this environment variable         : PLUMED_VIMPATH=/cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/plumed/vim
- Add the command 'let &runtimepath.=','.$PLUMED_VIMPATH' to your .vimrc file
From vim, you can use :set syntax=plumed to enable it
A python plugin can be found here: /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/plumed/python/
To use PLUMED through python either : 
- Add /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/plumed/python/ to your PYTHONPATH
- Execute the command python buildPythonInterface.py install in the plumed2/python directory
Plumed can be loaded in a python script using the command import plumed
WARNING: plumed executable will not run on this machine
WARNING: unless you invoke it as 'plumed --no-mpi'
WARNING: This is normal if this is the login node of a cluster.
WARNING: - to patch an MD code now use 'plumed --no-mpi patch'
WARNING:   (notice that MPI will be available anyway in the patched code)
WARNING: - all command line tools are available as 'plumed --no-mpi name-of-the-tool'
WARNING:   e.g. 'plumed --no-mpi driver'
WARNING:   (MPI will be disabled in this case)
make[2]: Leaving directory '/scratch/spack_gpu/job_25099884/spack-stage/spack-stage-plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/spack-src/src/lib'
make[1]: Leaving directory '/scratch/spack_gpu/job_25099884/spack-stage/spack-stage-plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/spack-src/src'
==> plumed: Successfully installed plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji
  Fetch: 3.57s.  Build: 12m 10.70s.  Total: 12m 14.27s.
[+] /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji
real 747.52
user 623.83
sys 115.40
==> Regenerating lmod module files
[spack_gpu@exp-15-57 [email protected]]$
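
If anyone wants to use this PLUMED install from vim or python before relying on the generated module file, the hooks mentioned in the log above can be wired up by hand using the paths from the install output (a convenience sketch only; the lmod module may already cover part of this):

export PLUMED_VIMPATH=/cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/plumed/vim
export PYTHONPATH=/cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/plumed-2.6.3-2ye26qbea5rvguurayn5h6elx4bo6jji/lib/plumed/python:${PYTHONPATH}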

@mkandes (Member) commented Sep 8, 2023

Next, running the kokkos +mpi spec build.

[spack_gpu@exp-15-57 [email protected]]$ sbatch [email protected] 
Submitted batch job 25100047
[spack_gpu@exp-15-57 [email protected]]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          25100047 ind-gpu-s kokkos@3 spack_gp  R       0:04      1 exp-15-57
          25099169 ind-gpu-s     bash spack_gp  R    1:11:38      1 exp-15-57
[spack_gpu@exp-15-57 [email protected]]$

@mkandes (Member) commented Sep 8, 2023

Build succeeded.

[spack_gpu@exp-15-57 [email protected]]$ tail -n 20 [email protected] 
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
CMake Warning:
  Manually-specified variables were not used by the project:

    SPACK_PACKAGE_INSTALL_DIR
    SPACK_PACKAGE_TEST_ROOT_DIR


-- Build files have been written to: /scratch/spack_gpu/job_25100047/spack-stage/spack-stage-kokkos-3.4.01-ynb455ej3kpkplv4fbbhkfsqgof7m6sc/spack-src
==> [2023-09-07-20:28:35.601058] Installing /scratch/spack_gpu/job_25100047/spack-stage/spack-stage-kokkos-3.4.01-ynb455ej3kpkplv4fbbhkfsqgof7m6sc/spack-src/scripts/spack_test/out to /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/kokkos-3.4.01-ynb455ej3kpkplv4fbbhkfsqgof7m6sc/.spack/test/scripts/spack_test/out
==> kokkos: Successfully installed kokkos-3.4.01-ynb455ej3kpkplv4fbbhkfsqgof7m6sc
  Fetch: 0.61s.  Build: 20.49s.  Total: 21.10s.
[+] /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/kokkos-3.4.01-ynb455ej3kpkplv4fbbhkfsqgof7m6sc
real 36.04
user 73.33
sys 30.57
==> Regenerating lmod module files
[spack_gpu@exp-15-57 [email protected]]$

@mkandes (Member) commented Sep 8, 2023

Finally, running the new lammps spec build script.

[spack_gpu@exp-15-57 [email protected]]$ sbatch [email protected] 
Submitted batch job 25100062
[spack_gpu@exp-15-57 [email protected]]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          25100062 ind-gpu-s lammps@2 spack_gp  R       0:03      1 exp-15-57
          25099169 ind-gpu-s     bash spack_gp  R    1:15:34      1 exp-15-57
[spack_gpu@exp-15-57 [email protected]]$

@mkandes (Member) commented Sep 8, 2023

The initial build failed. Investigating ...

[mkandes@login01 ~]$ cd /cm/shared/apps/spack/0.17.3/gpu/b/etc/spack/
defaults/ licenses/ sdsc/     
[mkandes@login01 ~]$ cd /cm/shared/apps/spack/0.17.3/gpu/b/etc/spack/sdsc/expanse/0.17.3/gpu/b/specs/[email protected]/[email protected]/
[mkandes@login01 [email protected]]$ ls -lahtr
total 19M
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.7K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.6K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack   38K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.4K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.8K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.1K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.2K Apr 23 12:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Apr 23 12:35 [email protected]
-rw-r--r-- 1 spack_gpu spack   25K Apr 23 12:36 [email protected]
-rw-r--r-- 1 spack_gpu spack   25K Apr 23 12:37 [email protected]
-rw-r--r-- 1 spack_gpu spack   24K Apr 23 12:38 [email protected]
-rw-r--r-- 1 spack_gpu spack   26K Apr 23 12:40 [email protected]
-rw-r--r-- 1 spack_gpu spack   24K Apr 23 12:44 [email protected]
-rw-r--r-- 1 spack_gpu spack   31K Apr 23 12:47 [email protected]
-rw-r--r-- 1 spack_gpu spack   27K Apr 23 12:48 [email protected]
-rw-r--r-- 1 spack_gpu spack   28K Apr 23 12:53 [email protected]
-rw-r--r-- 1 spack_gpu spack   33K Apr 23 12:54 [email protected]
-rw-r--r-- 1 spack_gpu spack   29K Apr 23 12:56 [email protected]
-rw-r--r-- 1 spack_gpu spack   33K Apr 23 12:57 [email protected]
-rw-r--r-- 1 spack_gpu spack   29K Apr 23 12:59 [email protected]
-rw-r--r-- 1 spack_gpu spack   34K Apr 23 13:00 [email protected]
-rw-r--r-- 1 spack_gpu spack   40K Apr 23 13:03 [email protected]
-rw-r--r-- 1 spack_gpu spack   41K Apr 23 13:05 [email protected]
-rw-r--r-- 1 spack_gpu spack   26K Apr 23 13:06 [email protected]
-rw-r--r-- 1 spack_gpu spack   38K Apr 23 13:11 [email protected]
-rw-r--r-- 1 spack_gpu spack   31K Apr 23 13:13 [email protected]
-rw-r--r-- 1 spack_gpu spack   31K Apr 23 13:17 [email protected]
-rw-r--r-- 1 spack_gpu spack   37K Apr 23 13:19 [email protected]
-rw-r--r-- 1 spack_gpu spack   35K Apr 23 13:20 [email protected]
-rw-r--r-- 1 spack_gpu spack   56K Apr 23 13:21 [email protected]
-rw-r--r-- 1 spack_gpu use300 4.5K Aug 27 13:15 [email protected]
drwxr-sr-x 5 spack_gpu spack   176 Sep  7 12:33 ..
-rw-r--r-- 1 spack_gpu spack  9.1K Sep  7 20:03 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.4K Sep  7 20:08 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.6K Sep  7 20:08 [email protected]
-rw-r--r-- 1 spack_gpu spack  325K Sep  7 20:25 [email protected]
-rw-r--r-- 1 spack_gpu spack  143K Sep  7 20:29 [email protected]
drwxr-sr-x 2 spack_gpu spack    52 Sep  7 20:31 .
-rw-r--r-- 1 spack_gpu spack   17M Sep  7 21:04 [email protected]
[mkandes@login01 [email protected]]$ tail -n 20 [email protected] 
             .0] Error 1
     6460    make[2]: Leaving directory '/scratch/spack_gpu/job_25100062/spack-
             stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz
             /spack-build-aoskuic'
  >> 6461    make[1]: *** [CMakeFiles/Makefile2:1121: CMakeFiles/lammps.dir/all
             ] Error 2
     6462    make[1]: Leaving directory '/scratch/spack_gpu/job_25100062/spack-
             stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz
             /spack-build-aoskuic'
     6463    make: *** [Makefile:139: all] Error 2

See build log for details:
  /scratch/spack_gpu/job_25100062/spack-stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz/spack-build-out.txt

==> Error: Terminating after first install failure: ProcessError: Command exited with status 2:
    'make' '-j10'
real 1945.89
user 6750.37
sys 11169.26
ERROR: spack install failed.
[mkandes@login01 [email protected]]$

@mkandes (Member) commented Sep 8, 2023

Is this related to the -mt_mpi issue from plumed that @jerrypgreenberg mitigated?

 /cm/local/apps/cuda/libs/current/lib64/libcuda.so -lgfortran -lquadmath 
g++: error: unrecognized command-line option '-mt_mpi'
make[2]: *** [CMakeFiles/lammps.dir/build.make:20104: liblammps.so.0] Error 1
make[2]: Leaving directory '/scratch/spack_gpu/job_25100062/spack-stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz/spack-build-aoskuic'
make[1]: *** [CMakeFiles/Makefile2:1121: CMakeFiles/lammps.dir/all] Error 2
make[1]: Leaving directory '/scratch/spack_gpu/job_25100062/spack-stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz/spack-build-aoskuic'
make: *** [Makefile:139: all] Error 2
==> Error: ProcessError: Command exited with status 2:
    'make' '-j10'

3 errors found in build log:
...
             2psofa3wr2zumqrnh4je2f7ze3mx/lib64/libcudart_static.a -ldl -lpthre
             ad /usr/lib64/librt.so /cm/local/apps/cuda/libs/current/lib64/libc
             uda.so -lgfortran -lquadmath
  >> 6458    g++: error: unrecognized command-line option '-mt_mpi'
  >> 6459    make[2]: *** [CMakeFiles/lammps.dir/build.make:20104: liblammps.so
             .0] Error 1
     6460    make[2]: Leaving directory '/scratch/spack_gpu/job_25100062/spack-
             stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz
             /spack-build-aoskuic'
  >> 6461    make[1]: *** [CMakeFiles/Makefile2:1121: CMakeFiles/lammps.dir/all
             ] Error 2
     6462    make[1]: Leaving directory '/scratch/spack_gpu/job_25100062/spack-
             stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz
             /spack-build-aoskuic'
     6463    make: *** [Makefile:139: all] Error 2

See build log for details:
  /scratch/spack_gpu/job_25100062/spack-stage/spack-stage-lammps-20210310-aoskuicagkrsuphtdu64yi7fkppxwskz/spack-build-out.txt

==> Error: Terminating after first install failure: ProcessError: Command exited with status 2:
    'make' '-j10'
real 1945.89
user 6750.37
sys 11169.26
ERROR: spack install failed.

@mkandes (Member) commented Sep 8, 2023

The answer appears to be yes: removing the plumed dependency allowed the build to complete successfully.

[spack_gpu@exp-15-57 [email protected]]$ tail -n 20 [email protected]
copying /scratch/spack_gpu/job_25105740/spack-stage/spack-stage-lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/spack-build-a6zspha/python/lib/lammps/__init__.py -> /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/constants.py to constants.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/formats.py to formats.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/pylammps.py to pylammps.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/core.py to core.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/data.py to data.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/mliap/loader.py to loader.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/mliap/pytorch.py to pytorch.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/mliap/__init__.py to __init__.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/numpy_wrapper.py to numpy_wrapper.cpython-38.pyc
byte-compiling /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps/__init__.py to __init__.cpython-38.pyc
running install_egg_info
Writing /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i/lib/python3.8/site-packages/lammps-10Mar2021-py3.8.egg-info
==> lammps: Successfully installed lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i
  Fetch: 0.31s.  Build: 1h 3m 24.62s.  Total: 1h 3m 24.93s.
[+] /cm/shared/apps/spack/0.17.3/gpu/b/opt/spack/linux-rocky8-cascadelake/gcc-10.2.0/lammps-20210310-a6zsphabx6kl4hqccibvhxc2axcs5n5i
real 3820.70
user 12714.66
sys 20560.06
==> Regenerating lmod module files
[spack_gpu@exp-15-57 [email protected]]$ cat [email protected] | grep plumed
#  hash("plumed","n2o2udniskgvoaacgn66fbladjkjtcai")
declare -xr SPACK_VARIANTS='+asphere +body +class2 +colloid +compress +coreshell +cuda cuda_arch=70,80 +dipole ~exceptions +ffmpeg +granular ~ipo +jpeg +kim +kokkos +kspace ~latte +lib +manybody +mc ~meam +misc +mliap +molecule +mpi +mpiio ~opencl +openmp +opt +peri +png +poems +python +qeq +replica +rigid +shock +snap +spin +srd ~user-adios +user-atc +user-awpmd +user-bocs +user-cgsdk +user-colvars +user-diffraction +user-dpd +user-drude +user-eff +user-fep ~user-h5md +user-lb +user-manifold +user-meamc +user-mesodpd +user-mesont +user-mgpt +user-misc +user-mofff ~user-netcdf ~user-omp +user-phonon ~user-plumed +user-ptm +user-qtb +user-reaction +user-reaxc +user-sdpd +user-smd +user-smtbq +user-sph +user-tally +user-uef +user-yaff +voronoi'
declare -xr SPACK_DEPENDENCIES="^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~ilp64 threads=none) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ~mpi ~openmp) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER}) ^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} ^kokkos-nvcc-wrapper +mpi)" #^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER} +mpi ^[email protected])" 
[spack_gpu@exp-15-57 [email protected]]$

@mkandes (Member) commented Sep 8, 2023

The build is now ready for testing.

[mkandes@login01 ~]$ module load gpu
[mkandes@login01 ~]$ module load gcc/10.2.0
[mkandes@login01 ~]$ module load intel-mpi/2019.10.317
[mkandes@login01 ~]$ module load lammps/20210310
[mkandes@login01 ~]$ module list

Currently Loaded Modules:
  1) shared                              9) eigen/3.4.0/aoih524
  2) slurm/expanse/21.08.8              10) ffmpeg/4.3.2/w7ehiyr
  3) sdsc/1.0                           11) fftw/3.3.10/7ahyh5v
  4) DefaultModules                     12) kokkos/3.4.01/ynb455e
  5) gpu/0.17.3b                   (g)  13) openblas/0.3.18/lsmegf6
  6) gcc/10.2.0/i62tgso                 14) python/3.8.12/uasyy5n
  7) intel-mpi/2019.10.317/jhyxn2g      15) lammps/20210310/a6zspha-omp
  8) cuda/11.2.2/blza2ps

  Where:
   g:  built natively for Intel Skylake

 

[mkandes@login01 ~]$
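
For reference, a minimal batch script to exercise this build might look like the following. This is only a sketch: the partition, account, resource sizes, and the in.lj input deck are placeholders, and -sf gpu / -pk gpu are the standard LAMMPS command-line switches for enabling the GPU package.

#!/usr/bin/env bash
#SBATCH --job-name=lammps-gpu-test
#SBATCH --partition=gpu-shared      # placeholder; use your allocation's GPU partition
#SBATCH --account=abc123            # placeholder allocation
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --gpus=4
#SBATCH --time=00:30:00

module load gpu
module load gcc/10.2.0
module load intel-mpi/2019.10.317
module load lammps/20210310

# One MPI rank per GPU; in.lj is a placeholder Lennard-Jones input deck.
srun --mpi=pmi2 lmp -sf gpu -pk gpu 4 -in in.lj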

@mkandes (Member) commented Sep 8, 2023

For reference, here is the previous runtime error.

LAMMPS (10 Mar 2021)
  using 1 OpenMP thread(s) per MPI task
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
Created orthogonal box = (0.0000000 0.0000000 0.0000000) to (134.36770 67.183848 67.183848)
  2 by 1 by 2 MPI processor grid
Created 512000 atoms
  create_atoms CPU = 0.007 seconds

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Your simulation uses code contributions which should be cited:
- GPU package (short-range, long-range and three-body potentials):
The log file lists these citations in BibTeX format.

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE


--------------------------------------------------------------------------
- Using acceleration for lj/cut:
-  with 1 proc(s) per device.
-  Horizontal vector operations: ENABLED
-  Shared memory system: No
--------------------------------------------------------------------------
Device 0: Tesla V100-SXM2-32GB, 80 CUs, 31/32 GB, 1.5 GHZ (Mixed Precision)
Device 1: Tesla V100-SXM2-32GB, 80 CUs, 1.5 GHZ (Mixed Precision)
Device 2: Tesla V100-SXM2-32GB, 80 CUs, 1.5 GHZ (Mixed Precision)
Device 3: Tesla V100-SXM2-32GB, 80 CUs, 1.5 GHZ (Mixed Precision)
--------------------------------------------------------------------------

Initializing Device and compiling on process 0...Done.
Initializing Devices 0-3 on core 0...Done.

Setting up Verlet run ...
  Unit style    : lj
  Current step  : 0
  Time step     : 0.005
Per MPI rank memory allocation (min/avg/max) = 28.61 | 28.61 | 28.61 Mbytes
Step Temp E_pair E_mol TotEng Press 
       0         1.44          inf            0          inf          nan 
      10          nan            0            0          nan          nan 
Loop time of 0.034794 on 4 procs for 10 steps with 512000 atoms

Performance: 124159.262 tau/day, 287.406 timesteps/s
88.6% CPU use with 4 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.012956   | 0.013701   | 0.014315   |   0.5 | 39.38
Neigh   | 0          | 0          | 0          |   0.0 |  0.00
Comm    | 0.0094468  | 0.010145   | 0.011195   |   0.7 | 29.16
Output  | 0.00039474 | 0.00046466 | 0.00055555 |   0.0 |  1.34
Modify  | 0.0075273  | 0.0078619  | 0.0081865  |   0.3 | 22.60
Other   |            | 0.002621   |            |       |  7.53

Nlocal:        128000.0 ave      128000 max      128000 min
Histogram: 4 0 0 0 0 0 0 0 0 0
Nghost:        49871.0 ave       49871 max       49871 min
Histogram: 4 0 0 0 0 0 0 0 0 0
Neighs:         0.00000 ave           0 max           0 min
Histogram: 4 0 0 0 0 0 0 0 0 0

Total # of neighbors = 0
Ave neighs/atom = 0.0000000
Neighbor list builds = 0
Dangerous builds not checked


---------------------------------------------------------------------
      Device Time Info (average): 
---------------------------------------------------------------------
Data Transfer:   0.0062 s.
Neighbor copy:   0.0000 s.
Neighbor build:  0.0015 s.
Force calc:      0.0012 s.
Device Overhead: 0.0004 s.
Average split:   1.0000.
Lanes / atom:    4.
Vector width:    32.
Max Mem / Proc:  178.54 MB.
CPU Cast/Pack:   0.0131 s.
CPU Driver_Time: 0.0002 s.
CPU Idle_Time:   0.0017 s.
---------------------------------------------------------------------


--------------------------------------------------------------------------
- Using acceleration for lj/cut:
-  with 1 proc(s) per device.
-  Horizontal vector operations: ENABLED
-  Shared memory system: No
--------------------------------------------------------------------------
Device 0: Tesla V100-SXM2-32GB, 80 CUs, 31/32 GB, 1.5 GHZ (Mixed Precision)
Device 1: Tesla V100-SXM2-32GB, 80 CUs, 1.5 GHZ (Mixed Precision)
Device 2: Tesla V100-SXM2-32GB, 80 CUs, 1.5 GHZ (Mixed Precision)
Device 3: Tesla V100-SXM2-32GB, 80 CUs, 1.5 GHZ (Mixed Precision)
--------------------------------------------------------------------------

Initializing Device and compiling on process 0...Done.
Initializing Devices 0-3 on core 0...Done.

Setting up Verlet run ...
  Unit style    : lj
  Current step  : 10
  Time step     : 0.005
ERROR on proc 0: Non-numeric atom coords - simulation unstable (src/src/domain.cpp:548)
ERROR on proc 2: Non-numeric atom coords - simulation unstable (src/src/domain.cpp:548)
ERROR on proc 1: Non-numeric atom coords - simulation unstable (src/src/domain.cpp:548)
ERROR on proc 3: Non-numeric atom coords - simulation unstable (src/src/domain.cpp:548)
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[exp-15-58:3985063] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
[exp-15-58:3985063] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
real 3.59
user 0.02
sys 0.22
[mkandes@login01 lammps]$
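
The immediate inf/nan energies together with "Total # of neighbors = 0" point at the GPU-side neighbor build producing nothing. A common first triage step (a suggestion only, not a confirmed fix) is to rerun the same input with neighbor lists built on the host instead of on the device, e.g.:

srun --mpi=pmi2 lmp -sf gpu -pk gpu 4 neigh no -in in.lj

(in.lj again standing in for the same input deck). If that runs cleanly, the problem is isolated to the GPU neighbor build; the mixed-precision GPU library reported above would be the next variable worth checking.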

@mkandes (Member) commented Sep 22, 2023

Built and deployed a new install of lammps@20210310 within the new [email protected] package dependency chain added to expanse/0.17.3/gpu/b to support amber@22.

...
drwxr-sr-x 3 spack_gpu spack    65 Sep 20 08:18 ..
-rw-r--r-- 1 spack_gpu use300 4.0K Sep 20 18:13 [email protected]
-rw-r--r-- 1 spack_gpu spack   24M Sep 20 18:51 [email protected]
-rw-r--r-- 1 spack_gpu spack  2.3K Sep 21 14:21 [email protected]
-rw-r--r-- 1 spack_gpu spack  158K Sep 21 14:24 [email protected]
-rw-r--r-- 1 spack_gpu spack  3.3K Sep 21 14:25 [email protected]
-rw-r--r-- 1 spack_gpu spack  474K Sep 21 14:34 [email protected]
-rw-r--r-- 1 spack_gpu spack  8.8K Sep 22 08:45 [email protected]
drwxr-sr-x 2 spack_gpu spack    54 Sep 22 08:45 .
-rw-r--r-- 1 spack_gpu spack   34M Sep 22 09:52 [email protected]
[spack_gpu@exp-15-57 [email protected]]$ pwd
/cm/shared/apps/spack/0.17.3/gpu/b/etc/spack/sdsc/expanse/0.17.3/gpu/b/specs/[email protected]/[email protected]
[spack_gpu@exp-15-57 [email protected]]$

@mkandes (Member) commented Sep 22, 2023

@nwolter - Ready for testing.

@mkandes (Member) commented Sep 22, 2023

[mkandes@login01 ~]$ module spider lammps/20210310/ytjmvfx-omp

----------------------------------------------------------------------------
  lammps/20210310: lammps/20210310/ytjmvfx-omp
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20210310/ytjmvfx-omp" module is available to load.

      gpu/0.17.3b  gcc/8.4.0/xiuwkua  openmpi/4.1.3/v2ei3ge
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.


 

[mkandes@login01 ~]$

@nwolter (Author) commented Sep 22, 2023

Interesting. The spider command does not show the version you listed above when I run it on the Expanse login nodes. Is the openmpi version you show above not available?

]$ module spider  lammps/20210310

------------------------------------------------------------------------------------------------------------------
  lammps/20210310:
------------------------------------------------------------------------------------------------------------------
     Versions:
        lammps/20210310/a6zspha-omp
        lammps/20210310/jqd2pok-omp
        lammps/20210310/k7oi5an-omp

gpu]$ module spider lammps/20210310/a6zspha-omp


lammps/20210310: lammps/20210310/a6zspha-omp

You will need to load all module(s) on any one of the lines below before the "lammps/20210310/a6zspha-omp" module is available to load.

  gpu/0.17.3b  gcc/10.2.0/i62tgso  intel-mpi/2019.10.317/jhyxn2g

@nwolter (Author) commented Sep 22, 2023

To fix the issue I deleted the lmod.d cache, but here is the official explanation:

What if your site adds a new modulefile to the site’s $MODULEPATH but you are unable to see it with module avail?

It is likely that your site is having a spider cache issue. If you see different results from the following commands, then that is the problem:

$ module --ignore_cache avail
$ module avail

If you see a difference between the above two commands, delete (if it exists) the user’s spider cache:

$ rm -rf ~/.cache/lmod ~/.lmod.d/cache
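
Putting the check and the fix together in one place (Lmod writes module avail to stderr, hence the redirects; this is just a convenience sketch of the steps quoted above):

$ diff <(module --ignore_cache avail 2>&1) <(module avail 2>&1)
$ rm -rf ~/.cache/lmod ~/.lmod.d/cache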

@mkandes (Member) commented Sep 25, 2023

The build spec script and its standard output have been committed back to the deployment branch.

c1ff2f6

@mkandes (Member) commented Oct 27, 2023

@nwolter @mahidhar - We still need to create an updated example for expanse/0.17.3/gpu/b.

[mkandes@login01 lammps]$ pwd
/cm/shared/examples/sdsc/lammps
[mkandes@login01 lammps]$ ls -lahtr
total 0
drwxrwxr-x  2 mahidhar use300 10 May 16 10:24 cpu_stack_0.15.4
drwxrwxr-x  2 mahidhar use300  3 May 16 10:24 gpu_stack_0.15.4
drwxrwxr-x 47 root     use300 45 Aug  1 14:11 ..
drwxrwsr-x  2 mahidhar use300  7 Sep 15 17:13 cpu_stack_0.17.3b
drwxrwxr-x  5 mahidhar use300  3 Sep 15 17:15 .
[mkandes@login01 lammps]$

We have this version ...

[mkandes@login01 lammps]$ module spider lammps/20210310/ytjmvfx-omp

----------------------------------------------------------------------------
  lammps/20210310: lammps/20210310/ytjmvfx-omp
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20210310/ytjmvfx-omp" module is available to load.

      gpu/0.17.3b  gcc/8.4.0/xiuwkua  openmpi/4.1.3/v2ei3ge
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.

[mkandes@login01 lammps]$

... and this version now.

[mkandes@login01 lammps]$ module spider lammps/20230802

----------------------------------------------------------------------------
  lammps/20230802: lammps/20230802/7e7qu7z-omp
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "lammps/20230802/7e7qu7z-omp" module is available to load.

      gpu/0.17.3b  gcc/10.2.0/i62tgso  openmpi/4.1.3/gzzscfu
 
    Help:
      LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
      Simulator. This package uses patch releases, not stable release. See
      https://github.com/spack/spack/pull/5342 for a detailed discussion.

[mkandes@login01 lammps]$

I think the only major difference is that PLUMED is supported in lammps/20210310/ytjmvfx-omp but not in lammps/20230802/7e7qu7z-omp.

[mkandes@login01 ~]$ source activate-shared-spack-instance.sh 
[mkandes@login01 ~]$ spack find -lvd lammps
==> 2 installed packages
-- linux-rocky8-cascadelake / [email protected] ------------------------
7e7qu7z lammps@20230802+asphere+body+class2+colloid+compress+coreshell+cuda~cuda_mps+dipole~exceptions~ffmpeg+granular~ipo+jpeg+kim+kokkos+kspace~latte+lib+manybody+mc~meam+misc+mliap+molecule+mpi+mpiio~opencl+openmp+opt+peri+png+poems+python+qeq+replica+rigid+shock+snap+spin+srd~user-adios+user-atc+user-awpmd+user-bocs+user-cgsdk+user-colvars+user-diffraction+user-dpd+user-drude+user-eff+user-fep~user-h5md+user-lb+user-manifold+user-meamc+user-mesodpd+user-mesont+user-mgpt+user-misc+user-mofff~user-netcdf+user-omp+user-phonon~user-plumed+user-ptm+user-qtb+user-reaction+user-reaxc+user-sdpd+user-smd+user-smtbq+user-sph+user-tally+user-uef+user-yaff+voronoi build_type=RelWithDebInfo cuda_arch=70
blza2ps     [email protected]~dev
2q4yola         [email protected]~python
5a3xt3s             [email protected] libs=shared,static
5xho2dj             [email protected]~pic libs=shared,static
2c5fvip             [email protected]+optimize+pic+shared
aoih524     [email protected]~ipo build_type=RelWithDebInfo
7ahyh5v     [email protected]~mpi~openmp~pfft_patches precision=double,float
eqgx2bp     [email protected]~ipo build_type=RelWithDebInfo
phogmfw     [email protected]~aggressive_vectorization~compiler_warnings+cuda+cuda_constexpr+cuda_lambda+cuda_ldg_intrinsic~cuda_relocatable_device_code~cuda_uvm~debug~debug_bounds_check~debug_dualview_modify_check~deprecated_code~examples~explicit_instantiation~hpx~hpx_async_dispatch~hwloc~ipo~memkind~numactl~openmp+pic+profiling~profiling_load_print~pthread~qthread~rocm+serial+shared~sycl~tests~tuning+wrapper amdgpu_target=none build_type=RelWithDebInfo cuda_arch=70 std=14
7urw4af         [email protected]+mpi
gzzscfu             [email protected]~atomics+cuda~cxx~cxx_exceptions~gpfs~internal-hwloc~java+legacylaunchers+lustre~memchecker+pmi+pmix+romio~rsh~singularity+static+vt+wrapper-rpath cuda_arch=70,80 fabrics=ucx schedulers=slurm
okiyq35                 [email protected]~cairo+cuda~gl~libudev+libxml2~netloc~nvml~opencl+pci~rocm+shared
x3u2rw4                     [email protected]
5jrknc3                     [email protected]~symlinks+termlib abi=none
ne2joyw                 [email protected]~openssl
73aggpy                 [email protected]
34rinp4                 [email protected] patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006,ff37630df599cfabf0740518b91ec8daaf18e8f288b19adaae5364dc1f6b2296
43uenl2                 [email protected]~docs+pmi_backwards_compatibility~restful
nflzb3l                 [email protected]~gtk~hdf5~hwloc~mariadb~pmix+readline~restd sysconfdir=PREFIX/etc
msro2p7                 [email protected]~assertions~cm+cma+cuda+dc~debug+dm+gdrcopy+ib-hw-tm~java~knem~logging+mlx5-dv+optimizations~parameter_checking+pic+rc~rocm+thread_multiple+ud~xpmem cuda_arch=70,80
nefrbxp                     [email protected]
k7hbyv2                     [email protected]~ipo build_type=RelWithDebInfo
uwaulm2     [email protected]
a7krc3b     [email protected]
lsmegf6     [email protected]~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none
uasyy5n     [email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis+optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4
nlagguk         [email protected]~debug~pic+shared
5hqjubn         [email protected]+libbsd
c6r2tgf             [email protected]
v3ceevj                 [email protected]
zlcmcv5         [email protected]
57jahak             [email protected]
4mzhvmy         [email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz
vlxqcge             [email protected]
zvema6a         [email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0
47xsvyk         [email protected]~docs certs=system
uqy7ybg         [email protected]+column_metadata+fts~functions~rtree
fjihfsz         [email protected]
w3k24sh     [email protected]+pic


-- linux-rocky8-skylake_avx512 / [email protected] ----------------------
ytjmvfx lammps@20210310+asphere+body+class2+colloid+compress+coreshell+cuda~cuda_mps+dipole~exceptions~ffmpeg+granular~ipo+jpeg+kim+kokkos+kspace~latte+lib+manybody+mc~meam+misc+mliap+molecule+mpi+mpiio~opencl+openmp+opt+peri+png+poems+python+qeq+replica+rigid+shock+snap+spin+srd~user-adios+user-atc+user-awpmd+user-bocs+user-cgsdk+user-colvars+user-diffraction+user-dpd+user-drude+user-eff+user-fep~user-h5md+user-lb+user-manifold+user-meamc+user-mesodpd+user-mesont+user-mgpt+user-misc+user-mofff~user-netcdf~user-omp+user-phonon+user-plumed+user-ptm+user-qtb+user-reaction+user-reaxc+user-sdpd+user-smd+user-smtbq+user-sph+user-tally+user-uef+user-yaff+voronoi build_type=RelWithDebInfo cuda_arch=70
yx5cxnu     [email protected]~dev
cctghrh         [email protected]~python
ksynfmj             [email protected] libs=shared,static
udmvado             [email protected]~pic libs=shared,static
7apv7tj             [email protected]+optimize+pic+shared
ygkbggs     [email protected]~ipo build_type=RelWithDebInfo
qfar473     [email protected]~mpi~openmp~pfft_patches precision=double,float
iiwtfrq     [email protected]~ipo build_type=RelWithDebInfo
jh4pw54     [email protected]~aggressive_vectorization~compiler_warnings+cuda+cuda_constexpr+cuda_lambda+cuda_ldg_intrinsic~cuda_relocatable_device_code~cuda_uvm~debug~debug_bounds_check~debug_dualview_modify_check~deprecated_code~examples~explicit_instantiation~hpx~hpx_async_dispatch~hwloc~ipo~memkind~numactl~openmp+pic+profiling~profiling_load_print~pthread~qthread~rocm+serial+shared~sycl~tests~tuning+wrapper amdgpu_target=none build_type=RelWithDebInfo cuda_arch=70 std=14
hxjlokx         [email protected]+mpi
v2ei3ge             [email protected]~atomics+cuda~cxx~cxx_exceptions~gpfs~internal-hwloc~java+legacylaunchers+lustre~memchecker+pmi+pmix+romio~rsh~singularity+static+vt+wrapper-rpath cuda_arch=70 fabrics=ucx schedulers=slurm
7evhgqy                 [email protected]~cairo+cuda~gl~libudev+libxml2~netloc~nvml~opencl+pci~rocm+shared
dusortv                     [email protected]
fj4m6bg                     [email protected]~symlinks+termlib abi=none
wgoayfz                 [email protected]+openssl
um4do2r                     [email protected]~docs certs=system
dfynryt                 [email protected]
gpjef7m                 [email protected] patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006,ff37630df599cfabf0740518b91ec8daaf18e8f288b19adaae5364dc1f6b2296
gl4mdry                 [email protected]~docs+pmi_backwards_compatibility~restful
mctz53r                 [email protected]~gtk~hdf5~hwloc~mariadb~pmix+readline~restd sysconfdir=PREFIX/etc
fkguskk                 [email protected]~assertions~cm+cma+cuda+dc~debug+dm+gdrcopy+ib-hw-tm~java~knem~logging+mlx5-dv+optimizations~parameter_checking+pic+rc~rocm+thread_multiple+ud~xpmem cuda_arch=70
ntvmzwk                     [email protected]
53wlz3i                     [email protected]~ipo build_type=RelWithDebInfo
zne4hoj     [email protected]
3trf423     [email protected]
n4ay3co     [email protected]~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none
5u64glq     [email protected]+gsl+mpi+shared arrayfire=none optional_modules=all
gfpesxa         [email protected]~external-cblas
nmsf2rf     [email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4
zg6j6o4         [email protected]~debug~pic+shared
2pgh5u7         [email protected]+libbsd
efhliq2             [email protected]
wvibiuk                 [email protected]
glugi6a         [email protected]
6wqnjhx             [email protected]
iap64mm         [email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz
p7jtkdm             [email protected]
w6bcjqa         [email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0
mezd2bx         [email protected]+column_metadata+fts~functions~rtree
fajrrs3         [email protected]
4pcy3u7     [email protected]+pic

[mkandes@login01 ~]$
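
As a starting point for that updated example, the newer build would be loaded and launched roughly like this (a sketch only; the input deck, rank/GPU counts, and launch flags still need to be firmed up when the example is actually written):

module load gpu/0.17.3b
module load gcc/10.2.0/i62tgso
module load openmpi/4.1.3/gzzscfu
module load lammps/20230802/7e7qu7z-omp

srun --mpi=pmi2 -n 4 lmp -sf gpu -pk gpu 4 -in in.lj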

@nwolter (Author) commented Oct 27, 2023

I'll get on this.
