Flip rrtmgp to default to kokkos #2987

Merged: 5 commits merged on Sep 24, 2024

@jgfouca
Member

jgfouca commented Sep 6, 2024

I had to reinstate some of the copies that were removed earlier because RRTMGP performs so much better with LayoutLeft. You can force LayoutRight, but that option is off by default.
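
For readers unfamiliar with the two layouts: in Kokkos, LayoutLeft gives the leftmost index unit stride (column-major, as in Fortran), while LayoutRight gives the rightmost index unit stride (row-major, as in C). The sketch below is plain C++ with no Kokkos dependency and is not code from this PR; the helper names (`offset_left`, `offset_right`, `right_to_left`) are hypothetical. It only illustrates, for an assumed 2D array, the kind of layout-converting copy the description refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat offset of element (i, j) in an nrow x ncol array stored LayoutLeft-style:
// the leftmost index i has stride 1 (column-major).
inline std::size_t offset_left(std::size_t i, std::size_t j,
                               std::size_t nrow, std::size_t ncol) {
  assert(i < nrow && j < ncol);
  (void)ncol;  // only used by the bounds check above
  return i + j * nrow;
}

// Flat offset of element (i, j) stored LayoutRight-style:
// the rightmost index j has stride 1 (row-major).
inline std::size_t offset_right(std::size_t i, std::size_t j,
                                std::size_t nrow, std::size_t ncol) {
  assert(i < nrow && j < ncol);
  (void)nrow;  // only used by the bounds check above
  return i * ncol + j;
}

// Copy a row-major buffer into a new column-major buffer.
// This is the extra copy one pays to hand data to a kernel that
// prefers the other layout (hypothetical helper, for illustration only).
std::vector<double> right_to_left(const std::vector<double>& src,
                                  std::size_t nrow, std::size_t ncol) {
  assert(src.size() == nrow * ncol);
  std::vector<double> dst(src.size());
  for (std::size_t i = 0; i < nrow; ++i) {
    for (std::size_t j = 0; j < ncol; ++j) {
      dst[offset_left(i, j, nrow, ncol)] = src[offset_right(i, j, nrow, ncol)];
    }
  }
  return dst;
}
```

For a 2x3 row-major buffer {1, 2, 3, 4, 5, 6}, `right_to_left` returns {1, 4, 2, 5, 3, 6}. The copy costs time and memory, which is the trade-off against running the kernels in their faster layout.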

…_to_default_kokkos

* origin/master: (75 commits)
  Temporarily point to a COSP submodule branch with a bug fix
  Fix frontier builds
  Bump actions/setup-python from 5.1.1 to 5.2.0
  EAMxx: Fix PRIx64 usage to include width 16.
  Removes preprocess function from hpp and a calls from cpp interface files
  Replaces preprocess interface code
  Fixes path for files and change c-style strings to std::strings
  EAMxx: use portable format for unsigned std::int64_t in printf
  Adds units for mass and number ofr constituent fluxes
  Update shell_commands to remove spa
  Fixes namelist to enable fine grid simulations
  clarify comment for cmake option
  add more testmods and better comments
  fix typo in test mod
  add tests for small kernels
  options to build sk p3 and shoc separately
  Edits and adds some comments to provide clarity and TODOs
  Removes debug print statements
  Adds ne30pg2 emissions files and mapping files for fine resolutions
  Fixes multi process test a typo in aci interface
  ...
@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5818
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2987
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA c346b79
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 894dc7c
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6042
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2987
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA c346b79
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 894dc7c
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: jgfouca/flip_rrtmgp_to_default_kokkos
  • SHA: c346b79
  • Mode: TEST_REPO

Pull Request Author: jgfouca

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5818
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2987
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA c346b79
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 894dc7c
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6042
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2987
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA c346b79
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 894dc7c
TEST_REPO_ALIAS SCREAM

SCREAM_PullRequest_Autotester_Mappy # 5818 PASSED (last 100 lines of console output)

Starting MODEL_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 16 procs
Finished MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 in 495.641039 seconds (PASS)
Starting RUN for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 in 9.171185 seconds (PEND). [COMPLETED 7 of 17]
Starting MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 with 16 procs
Finished MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 in 530.462567 seconds (PASS)
Starting RUN for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 in 3.240573 seconds (PEND). [COMPLETED 8 of 17]
Starting MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 16 procs
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu in 430.724529 seconds (PASS)
Starting RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu with 1 proc on interactive node and 16 procs on compute nodes
Finished RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu in 9.282349 seconds (PEND). [COMPLETED 9 of 17]
Starting MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 with 16 procs
Finished MODEL_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 777.490469 seconds (PASS)
Starting RUN for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 3.686192 seconds (PEND). [COMPLETED 10 of 17]
Starting MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble with 16 procs
Finished MODEL_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 429.282366 seconds (PASS)
Starting RUN for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 1.114268 seconds (PEND). [COMPLETED 11 of 17]
Starting MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 with 16 procs
Finished MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 in 450.573943 seconds (PASS)
Starting RUN for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 in 5.146020 seconds (PEND). [COMPLETED 12 of 17]
Starting MODEL_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 16 procs
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 in 367.482332 seconds (PASS)
Starting RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 with 1 proc on interactive node and 16 procs on compute nodes
Finished RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 in 1.041813 seconds (PEND). [COMPLETED 13 of 17]
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble in 342.100642 seconds (PASS)
Starting RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble with 1 proc on interactive node and 16 procs on compute nodes
Finished RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble in 0.900127 seconds (PEND). [COMPLETED 14 of 17]
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 in 354.837311 seconds (PASS)
Starting RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 with 1 proc on interactive node and 16 procs on compute nodes
Finished RUN for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 in 1.241946 seconds (PEND). [COMPLETED 15 of 17]
Finished MODEL_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 273.706325 seconds (PASS)
Starting RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 0.965112 seconds (PEND). [COMPLETED 16 of 17]
Finished MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 703.310567 seconds (PASS)
Starting RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 0.897752 seconds (PEND). [COMPLETED 17 of 17]
Waiting for tests to finish
PASS ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240906_154223_vobunt
PASS ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20240906_154223_vobunt
PASS ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20240906_154223_vobunt
PASS ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20240906_154223_vobunt
PASS ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5.C.20240906_154223_vobunt
PASS ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20240906_154223_vobunt
PASS ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20240906_154223_vobunt
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20240906_154223_vobunt
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20240906_154223_vobunt
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01.C.20240906_154223_vobunt
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20240906_154223_vobunt
PASS PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240906_154223_vobunt
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20240906_154223_vobunt
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep.C.20240906_154223_vobunt
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20240906_154223_vobunt
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav.C.20240906_154223_vobunt
PASS SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20240906_154223_vobunt
test-scheduler took 1600.5983440876007 seconds'
+ [[ 0 != 0 ]]
+ set +x
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 7039 killed;
[ssh-agent] Stopped.
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

We're having issues with some test-launcher jobs hanging forever. So let's make sure we clean all pending test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins926845577214247390.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: SUCCESS

SCREAM_PullRequest_Autotester_Weaver # 6042 FAILED (last 100 lines of console output)

p3_sk                      =  26.63 sec*proc (2 tests)
physics                    = 190.64 sec*proc (27 tests)
remap                      =   3.42 sec*proc (1 test)
rrtmgp                     =  50.85 sec*proc (11 tests)
shoc                       =  62.53 sec*proc (13 tests)
spa                        =   9.49 sec*proc (4 tests)
surface_coupling           =   5.18 sec*proc (1 test)

Total Test time (real) = 773.11 sec

The following tests FAILED:
149 - homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp (Failed)
CMake Error at /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/ctest_script.cmake:76 (message):
Test had fails

===============================================================================
Testing '''c346b79fe9162b73d897e9ace9e3e6d1a8d05935''' for test '''full_sp_debug'''

RUN: taskset -c 52-103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/full_sp_debug

Testing '''c346b79fe9162b73d897e9ace9e3e6d1a8d05935''' for test '''full_debug'''

RUN: taskset -c 0-51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/full_debug

Testing '''c346b79fe9162b73d897e9ace9e3e6d1a8d05935''' for test '''release'''

RUN: taskset -c 104-155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx/ctest-build/release
Build type full_debug failed at testing time. Here'''s a list of failed tests:
149:homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp

Build type release failed at testing time. Here'''s a list of failed tests:
148:homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx
weaver failed'

  • [[ 1 == 0 ]]
  • [[ weaver == \m\a\p\p\y ]]
  • set +x
    ######################################################
    FAILS DETECTED:
    SCREAM STANDALONE TESTING FAILED!
    Build type full_debug failed at testing time. Here's a list of failed tests:
    149:homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp

Build type release failed at testing time. Here's a list of failed tests:
148:homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6042/scream/components/eamxx
weaver failed
######################################################
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins16294247700438645949.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: FAILURE

@AaronDonahue
Contributor

@jgfouca, did you want to add labels to this PR so it can automerge once testing is complete?

@jgfouca
Member Author

jgfouca commented Sep 16, 2024

@AaronDonahue, I need @brhillman to help me verify that the weaver DIFF is OK.

@PeterCaldwell
Contributor

@jgfouca - Ben is on vacation this week so don't hold your breath.

@AaronDonahue
Contributor

Sounds good, we can wait on this PR until @brhillman gets back from vacation. No worries.

@jgfouca
Member Author

jgfouca commented Sep 24, 2024

@brhillman and I think that the weaver diff for homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp is OK and just the result of roundoff/order-of-operations changes. Merging.
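
As a generic illustration of how a change in operation order alone can break a bit-for-bit baseline comparison (this is not RRTMGP code, and the function names `ordered_sum` and `within_tolerance` are hypothetical), consider summing the same three values in two different orders. Floating-point addition is not associative, so the two results can differ in the last bit while still agreeing to within a small relative tolerance:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sum the values strictly in the order given, left to right.
double ordered_sum(const std::vector<double>& values) {
  double total = 0.0;
  for (double v : values) {
    total += v;
  }
  return total;
}

// True if a and b agree to within a relative tolerance.
// A bit-for-bit comparison is the special case rel_tol == 0.
bool within_tolerance(double a, double b, double rel_tol) {
  assert(rel_tol >= 0.0);
  const double scale = std::max(std::fabs(a), std::fabs(b));
  return std::fabs(a - b) <= rel_tol * scale;
}
```

With IEEE-754 double arithmetic, `ordered_sum({0.1, 0.2, 0.3})` and `ordered_sum({0.3, 0.2, 0.1})` are not equal, yet `within_tolerance` accepts them at a relative tolerance of 1e-12. A baseline test that demands exact equality would report a DIFF for such a pair even though nothing is physically wrong.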

@jgfouca jgfouca merged commit 7bbdcb8 into master Sep 24, 2024
5 of 6 checks passed
@jgfouca jgfouca deleted the jgfouca/flip_rrtmgp_to_default_kokkos branch September 24, 2024 16:07