Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Update meson.yml #180

Merged
merged 1 commit into from
Jan 16, 2024
Merged

[CI] Update meson.yml #180

merged 1 commit into from
Jan 16, 2024

Conversation

amontoison
Copy link
Member

No description provided.

@amontoison
Copy link
Member Author

@jfowkes
Copy link
Contributor

jfowkes commented Jan 11, 2024

@amontoison I would suggest that you do what the Makefile build system currently does and only use nvcc/nvc++ for building the CUDA sources as opposed to all of SPRAL.

@jfowkes jfowkes added the ci label Jan 11, 2024
@jfowkes jfowkes self-requested a review January 11, 2024 09:45
@mjacobse
Copy link
Collaborator

We are talking about

/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/share/llvm/bin/llc: error: /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/share/llvm/bin/llc: /tmp/nvc++ew2moH1KAcc.ll:10314:148: error: use of undefined value '%dblk'
        invoke void  @_ZN5spral5ssids3cpu17ldlt_app_internal5BlockIdLi32ENS1_14BuddyAllocatorIiSaIdEEEE6backupINS2_10CopyBackupIdNS4_IdS5_EEEEEEvRT_ (ptr %dblk, ptr  %22) mustprogress

right? This looks like a compiler bug to me. The error is raised by llc which compiles LLVMs intermediate representation to assembly, so long after dealing with the C++ code itself. Since the error appears to be about the hidden this argument for the member function Block<...>::backup<...>, perhaps using a free-function instead would be a workaround. But ideally I think this should be fixed in the compiler.

@jfowkes
Copy link
Contributor

jfowkes commented Jan 11, 2024

Thanks @mjacobse, agreed let's just use nvcc for the CUDA until this gets fixed.

@jfowkes
Copy link
Contributor

jfowkes commented Jan 11, 2024

Further to the above, Nick gets an NVFORTRAN-F-0000-Internal compiler error when compiling src/ssids/gpu/factor.f90. Googling this suggests that Nvidia have quite a lot of work to do to bring nvfortran up to spec (it's based on the old pgfortran compiler).

@amontoison
Copy link
Member Author

amontoison commented Jan 11, 2024

We are talking about

/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/share/llvm/bin/llc: error: /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/compilers/share/llvm/bin/llc: /tmp/nvc++ew2moH1KAcc.ll:10314:148: error: use of undefined value '%dblk'
        invoke void  @_ZN5spral5ssids3cpu17ldlt_app_internal5BlockIdLi32ENS1_14BuddyAllocatorIiSaIdEEEE6backupINS2_10CopyBackupIdNS4_IdS5_EEEEEEvRT_ (ptr %dblk, ptr  %22) mustprogress

right? This looks like a compiler bug to me. The error is raised by llc which compiles LLVMs intermediate representation to assembly, so long after dealing with the C++ code itself. Since the error appears to be about the hidden this argument for the member function Block<...>::backup<...>, perhaps using a free-function instead would be a workaround. But ideally I think this should be fixed in the compiler.

We also have an internal error in the same file for the icpc compiler on all platform.
icpx will not be released on Mac so it will never be fixed in the compiler.
If it's not a lot of work we should try to use the free-function.

@jfowkes
Copy link
Contributor

jfowkes commented Jan 12, 2024

Again that's an issue with icpc which has been deprecated and dropped from all the latest Intel compiler releases. There is no point doing lots of work just to support a deprecated compiler. Intel is no longer an option on macs unfortunately.

@amontoison amontoison force-pushed the nvidia-hpc branch 3 times, most recently from b68450e to 60b5817 Compare January 12, 2024 20:02
@amontoison amontoison changed the title [CI] Add NVIDIA-HPC compilers [CI] Update meson.yml Jan 12, 2024
@amontoison
Copy link
Member Author

@jfowkes I commented the failing platforms for CI.
If one day something is fixed upstream, we will be able to easily test the new version of the compilers.

I also added the OpenMP flag -mp for NVIDIA compilers in the main meson.build.

@jfowkes jfowkes added build-system and removed ci labels Jan 15, 2024
@jfowkes
Copy link
Contributor

jfowkes commented Jan 15, 2024

@amontoison looks great, are you happy for this to be merged? I am.

@jfowkes jfowkes removed their assignment Jan 15, 2024
@amontoison
Copy link
Member Author

@jfowkes Yes, we can merge the PR.

@jfowkes
Copy link
Contributor

jfowkes commented Jan 15, 2024

@amontoison great could you rebase master onto your branch just so we can see if it works with the other changes that you've made to meson.build in the other PR?

@amontoison
Copy link
Member Author

done 👍

@jfowkes jfowkes merged commit 8187c9e into ralna:master Jan 16, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants