One of the checks for sundials doesn't support sundials built with MPI support #1843

Open
rdbisme opened this issue Jan 26, 2025 · 11 comments

Comments

@rdbisme

rdbisme commented Jan 26, 2025

Problem description
I am the maintainer of the cantera-git AUR package, and I was trying to restore support for using the system-installed SUNDIALS. If I add sundials rather than sundials-seq as a dependency, your check here fails, complaining about missing Open MPI symbols.

I looked at SCons' CheckLibWithHeader. It doesn't support providing extra_libs, while the underlying CheckLib it uses does. I think you need to pass -lmpi to the compiler command...

System information

  • Cantera version: 9c3a57c
  • OS: Archlinux
  • Python/MATLAB/other software versions:

@rdbisme
Author

rdbisme commented Jan 26, 2025

If anyone more proficient than me in scons knows how to fix that... I need to pass -lmpi to CheckLibWithHeader.
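
One possible workaround, as long as CheckLibWithHeader has no extra_libs argument, might be to add the extra library to the environment's LIBS only for the duration of the check. The following is a minimal sketch, not Cantera's actual SConstruct code: `temporary_libs` is a hypothetical helper name, and it only assumes that the environment offers dict-style access to a `LIBS` entry. A plain dict stands in for `conf.env` so the snippet runs without SCons.

```python
from contextlib import contextmanager

_MISSING = object()  # sentinel: LIBS was not set at all


@contextmanager
def temporary_libs(env, extra_libs):
    """Temporarily append extra_libs to env["LIBS"]; restore the old value on exit.

    Hypothetical helper: `env` is anything with dict-style access,
    such as a plain dict or (by assumption) `conf.env` in a configure context.
    """
    original = env.get("LIBS", _MISSING)
    if original is _MISSING:
        current = []
    elif isinstance(original, str):
        current = original.split()
    else:
        current = list(original)
    # Append only libraries that are not already present.
    env["LIBS"] = current + [lib for lib in extra_libs if lib not in current]
    try:
        yield env
    finally:
        if original is _MISSING:
            del env["LIBS"]
        else:
            env["LIBS"] = original


# Stand-in for conf.env; in the failing check, LIBS already contained "omp".
env = {"LIBS": ["omp"]}
with temporary_libs(env, ["mpi"]):
    libs_during_check = list(env["LIBS"])  # the configure check would run here
libs_after_check = env["LIBS"]
```

In an SConstruct, the `with` block would wrap the existing `conf.CheckLibWithHeader(...)` call for sundials_core, so that the test link line should also carry `-lmpi`. Whether this is the right place to fix the problem is a separate question.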

@rdbisme
Author

rdbisme commented Jan 26, 2025

I've created this: SCons/scons#4676

@speth
Member

speth commented Jan 26, 2025

Isn't this an issue with the AUR [1] build of SUNDIALS? If the sundials_core library has additional dependencies, those should be indicated by the sundials_core library. Downstream users like Cantera shouldn't have to explicitly link to all the dependencies of their first-tier dependencies. For instance, on Ubuntu, the libsundials_nvecmpimanyvector.so library indicates linkage to the MPI library:

$  ldd /usr/lib/x86_64-linux-gnu/libsundials_nvecmpimanyvector.so
        linux-vdso.so.1 (0x00007fff331a1000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f24d9509000)
        libmpi.so.40 => /lib/x86_64-linux-gnu/libmpi.so.40 (0x00007f24d93d7000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f24d91b7000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f24d9624000)
        libopen-rte.so.40 => /lib/x86_64-linux-gnu/libopen-rte.so.40 (0x00007f24d90fb000)
        libopen-pal.so.40 => /lib/x86_64-linux-gnu/libopen-pal.so.40 (0x00007f24d9045000)
        libhwloc.so.15 => /lib/x86_64-linux-gnu/libhwloc.so.15 (0x00007f24d8fe1000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f24d8fc3000)
        libevent_core-2.1.so.7 => /lib/x86_64-linux-gnu/libevent_core-2.1.so.7 (0x00007f24d8f8e000)
        libevent_pthreads-2.1.so.7 => /lib/x86_64-linux-gnu/libevent_pthreads-2.1.so.7 (0x00007f24d8f89000)
        libudev.so.1 => /lib/x86_64-linux-gnu/libudev.so.1 (0x00007f24d8f44000)
        libcap.so.2 => /lib/x86_64-linux-gnu/libcap.so.2 (0x00007f24d8f35000)

[1] https://aur.archlinux.org/ for anyone else like me who had to look this up

@rdbisme
Author

rdbisme commented Jan 26, 2025

Mmmh, I think you still need to link against all dynamic libraries. There's no implicit transitive dependency management? https://stackoverflow.com/q/15923888

This is what it looks like on arch:

[arch@b3a972e23fcb ~]$ ldd /usr/lib/libsundials_core.so
        linux-vdso.so.1 (0x00007ffee62cc000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fec007c2000)
        libmpi.so.40 => /usr/lib/libmpi.so.40 (0x00007fec0045f000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fec0026e000)
        /usr/lib64/ld-linux-x86-64.so.2 (0x00007fec008c9000)
        libopen-pal.so.80 => /usr/lib/libopen-pal.so.80 (0x00007fec00195000)
        libfabric.so.1 => /usr/lib/libfabric.so.1 (0x00007febffd58000)
        libucp.so.0 => /usr/lib/libucp.so.0 (0x00007febffc03000)
        libucs.so.0 => /usr/lib/libucs.so.0 (0x00007febffa47000)
        libevent_core-2.1.so.7 => /usr/lib/libevent_core-2.1.so.7 (0x00007febffa15000)
        libevent_pthreads-2.1.so.7 => /usr/lib/libevent_pthreads-2.1.so.7 (0x00007febffa10000)
        libhwloc.so.15 => /usr/lib/libhwloc.so.15 (0x00007febff9ac000)
        libpmix.so.2 => /usr/lib/libpmix.so.2 (0x00007febff79c000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007febff76e000)
        libucm.so.0 => /usr/lib/libucm.so.0 (0x00007febff74f000)
        libuct.so.0 => /usr/lib/libuct.so.0 (0x00007febff708000)
        libnuma.so.1 => /usr/lib/libnuma.so.1 (0x00007febff6f2000)
        libsframe.so.1 => /usr/lib/libsframe.so.1 (0x00007febff6ea000)
        libz.so.1 => /usr/lib/libz.so.1 (0x00007febff6d1000)
        libzstd.so.1 => /usr/lib/libzstd.so.1 (0x00007febff5f0000)
        libudev.so.1 => /usr/lib/libudev.so.1 (0x00007febff5a9000)
        libcap.so.2 => /usr/lib/libcap.so.2 (0x00007febff59d000)

@speth
Member

speth commented Jan 26, 2025

That output shows that the sundials_core library does link to libmpi, so there shouldn't be any missing symbols from libmpi indicated when linking Cantera. The system definitely does resolve transitive dependencies among shared libraries -- that's exactly what's being shown by the ldd output. Otherwise, you'd have to list all of those other libraries when linking to libcantera, which clearly isn't the requirement.
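
Checking `ldd` output for a given dependency can be scripted. As a rough sketch (the helper name `parse_ldd` is made up for illustration, and the sample text is an excerpt of the Arch output posted above), the following parses `ldd` output and reports whether libmpi is among the resolved dependencies. Note that this only reflects what the dynamic loader resolves at run time.

```python
import re

# Matches lines such as "libmpi.so.40 => /usr/lib/libmpi.so.40 (0x...)".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-f]+\)\s*$")


def parse_ldd(output):
    """Map each resolved library name in `ldd` output to its path."""
    deps = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            deps[match.group(1)] = match.group(2)
    return deps


# Excerpt of the ldd output for /usr/lib/libsundials_core.so on Arch.
sample = """\
        linux-vdso.so.1 (0x00007ffee62cc000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fec007c2000)
        libmpi.so.40 => /usr/lib/libmpi.so.40 (0x00007fec0045f000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fec0026e000)
        /usr/lib64/ld-linux-x86-64.so.2 (0x00007fec008c9000)
"""

deps = parse_ldd(sample)
links_mpi = any(name.startswith("libmpi.so") for name in deps)
```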

Please share the full output of the error you're getting after running scons build --config=force, including the contents of the config.log as this is where the output from the configure checks will be.

@rdbisme
Author

rdbisme commented Jan 26, 2025

As a reference, can you give me the output of objdump -x /usr/lib/x86_64-linux-gnu/libsundials_nvecmpimanyvector.so | grep 'R.*PATH' from your system?

I just want to see whether the Ubuntu build sets an rpath for it.

@rdbisme
Author

rdbisme commented Jan 26, 2025

That output shows that the sundials_core library does link to libmpi, so there shouldn't be any missing symbols from libmpi indicated when linking Cantera.

Well, the compiler call is missing -lmpi, so I believe it's normal that it complains about missing symbols.

@rdbisme
Author

rdbisme commented Jan 26, 2025

scons: Reading SConscript files ...
SCons 4.8.1 is using the following Python interpreter:
    /usr/bin/python (Python 3.13)
INFO: Building Cantera from git commit '9c3a57c8b'
INFO: Compiling on 'Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz'
INFO: Configuration variables read from 'cantera.conf' and command line:
    prefix = '/home/arch/cantera/pkg/cantera-git/usr'
    system_eigen = 'y'
    system_sundials = 'y'
    sundials_include = '/usr/include/sundials'
    googletest = 'system'
    debug = False
    extra_inc_dirs = '/usr/include/eigen3'

Checking for C++ header file cmath... yes
Checking whether __clang__ is declared... no
INFO: Using system installation of fmt library.
INFO: Using fmt version 11.1.2
Checking for YAML::Node().Mark()... yes
INFO: Using system installation of yaml-cpp library.
Checking for C++ header file gtest/gtest.h... yes
Checking for C++ header file gmock/gmock.h... yes
INFO: Using system installation of Googletest
Checking for C++ header file eigen3/Eigen/Dense... yes
INFO: Using system installation of Eigen.
INFO: Found Eigen version 3.4.0
Checking whether __GLIBCXX__ is declared... yes
Checking for C++ library iomp5... no
Checking for C++ library omp... yes
INFO: Found Boost version 1.86
Checking for C library mkl_rt... no
Checking for C library openblas... no
Checking for C library lapack... yes
Checking for C library blas... yes
Checking for double x; log(x) in C library None... no
WARNING: Sundials version 7.2.1 has not been tested.
Checking for SUNContext ctx; SUNContext_Create(SUN_COMM_NULL, &ctx) in C++ library sundials_core... no
ERROR: Could not link to the Sundials library. Did you set the include/library paths?

See 'config.log' for details.
scons: Configure: Checking for SUNContext ctx; SUNContext_Create(SUN_COMM_NULL, &ctx) in C++ library sundials_core...
.sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.cpp <-
  |
  |
  |#include "cvodes/cvodes.h"
  |
  |int main(void) {
  |  SUNContext ctx; SUNContext_Create(SUN_COMM_NULL, &ctx);
  |return 0;
  |}
  |
g++ -o .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.o -c -isystem /usr/include/eigen3 -isystem /usr/include/sundials -std=c++17 -pthread -O3 -Wno-inline -DNDEBUG .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.cpp
g++ -o .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0_56a84ea5a960b9eb4ddf93eb378e815d -pthread .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.o -lomp -lsundials_core
/usr/sbin/ld: .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.o: undefined reference to symbol 'ompi_mpi_comm_null'
/usr/sbin/ld: /usr/lib/libmpi.so.40: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
scons: Configure: no

This compiler line:

g++ -o .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0_56a84ea5a960b9eb4ddf93eb378e815d -pthread .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.o -lomp -lsundials_core

should be instead (I believe):

g++ -o .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0_56a84ea5a960b9eb4ddf93eb378e815d -pthread .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.o -lomp -lsundials_core -lmpi
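
For context, GNU ld prints "DSO missing from command line" when an object file on the command line references a symbol that is defined only in a shared library reached indirectly, through another library's dependencies. Here the test program itself references `ompi_mpi_comm_null`, presumably because `SUN_COMM_NULL` expands to an MPI constant when SUNDIALS is built with MPI. A small sketch (the function name `missing_dsos` is hypothetical) that extracts the library to add from such output:

```python
import re

# Captures the library path from ld's "DSO missing from command line" message.
_DSO_MISSING = re.compile(
    r"ld: (\S+): error adding symbols: DSO missing from command line"
)


def missing_dsos(linker_output):
    """Return a -l flag for every library ld reports as a missing DSO."""
    flags = []
    for path in _DSO_MISSING.findall(linker_output):
        filename = path.rsplit("/", 1)[-1]  # e.g. "libmpi.so.40"
        match = re.match(r"lib(.+?)\.so(\.|$)", filename)
        if match:
            flag = "-l" + match.group(1)
            if flag not in flags:
                flags.append(flag)
    return flags


# Linker messages copied from the config.log above.
log = """\
/usr/sbin/ld: .sconf_temp/conftest_ce2dd1e38acb22ad88224c4d2c5f1d43_0.o: undefined reference to symbol 'ompi_mpi_comm_null'
/usr/sbin/ld: /usr/lib/libmpi.so.40: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
"""

suggested_flags = missing_dsos(log)
```

For the log above this yields `-lmpi`, matching the proposed link line.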

@speth
Member

speth commented Jan 27, 2025

I was able to replicate this on Ubuntu 25.04 (pre-release), which has SUNDIALS 7.1.1. LLNL/sundials#464 seems to suggest that the MPI dependency has become a bit more pervasive in SUNDIALS 7.x. I think this is a bit unfortunate, as ideally software that uses only the non-MPI parts of SUNDIALS ought not to need to know anything about whether or not SUNDIALS was built with MPI support.

I think the usual recommended practice for compiling once MPI dependencies have been dragged in is to use the mpicc and mpicxx wrappers to drive compilation, which you can do for Cantera by compiling with

scons build CC=mpicc CXX=mpicxx env_vars=all

This worked for me on the development Ubuntu version, where I also ran into issues with finding the mpi.h header file, which is otherwise installed in a location that isn't on the default compiler search path.

@rdbisme
Author

rdbisme commented Jan 27, 2025

I can always depend on sundials-seq on Arch, which I tested and which fixes the problem. Also, I believe that if this lands, I can probably fix the check too.

So Cantera doesn't support MPI?

@speth
Member

speth commented Jan 27, 2025

Is it possible to have sundials-seq and sundials (with MPI) installed simultaneously? If so, then depending on the former would be ideal.

No, Cantera does not make use of MPI itself. The typical use within MPI applications is that each process creates and uses independent Cantera objects for calculations across an array of points.
