Flush and close output files that are full #3032

AaronDonahue · 2024-10-07T18:42:57Z

This commit causes the output manager to flush and close a file if the max number of snapshots has been reached.

Unflushed files are mainly a problem when the simulation exits abnormally before SCORPIO has flushed a file: the output file exists, but is empty. The output manager now checks whether the maximum number of snapshots allowed in the file has been reached and, if so, forces the file to be flushed and closed.

Fixes #3026

This commit addresses an issue with empty output files that should have been flushed and closed. If the simulation exits abnormally before a file has been flushed, the file will be empty. Previously, we left it up to SCORPIO to decide the optimal time to flush output. This commit forces a file that is full, i.e., one where max_snapshots has been reached, to be flushed and closed before moving on. This should ensure that all full files are written before any chance of an abnormal exit.

AaronDonahue · 2024-10-07T18:44:35Z

@bartgol I noticed you already had a is_file_full check in the filespecs for IO but it was commented out. So I revived that and used it here to force the flushing. I tested using my other branch from #3032 and it did indeed populate the file that was empty before.

Since this PR is dependent on #3031 I am labeling it as WIP until that is merged. But in the meantime I wanted to solicit any comments on this approach.

bartgol · 2024-10-07T19:19:30Z

> @bartgol I noticed you already had a is_file_full check in the filespecs for IO but it was commented out. So I revived that and used it here to force the flushing. I tested using my other branch from #3032 and it did indeed populate the file that was empty before.
>
> Since this PR is dependent on #3032 I am labeling it as WIP until that is merged. But in the meantime I wanted to solicit any comments on this approach.

I'm guessing you wanted to link a different PR? 3032 is this PR...

bartgol

I think we need to use snapshot_fits to accommodate also a different storage type.

bartgol · 2024-10-07T19:20:41Z

components/eamxx/src/share/io/scream_io_file_specs.hpp

@@ -86,7 +86,10 @@ struct IOFileSpecs {
  // If positive, flush the output file every these many snapshots
  int flush_frequency = std::numeric_limits<int>::max();

-  // bool file_is_full () const { return num_snapshots_in_file>=max_snapshots_in_file; }
+  bool file_is_full () const { 
+    return storage.num_snapshots_in_file>=storage.max_snapshots_in_file; 


This only works if the storage type is based on max number of snapshots. It will not work in case one chooses "one_month" or "one_year" (the former being quite appealing for single month data).

The fact that this only works for max-snaps based storage may very well be the reason it was commented out. I believe I switched to using snapshot_fits precisely for this reason.
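
To illustrate the distinction, here is a minimal sketch of a storage check that handles both snapshot-count and calendar-based storage. The `StorageType` enum, the `Date` struct, the `curr_idx` field, and `record_snapshot` are assumptions made for illustration; they are not necessarily the names or layout used in scream_io_file_specs.hpp.

```cpp
#include <cassert>

// Hypothetical stand-in for a timestamp; only the fields needed here.
struct Date {
  int year;
  int month;
};

// Hypothetical storage types: by snapshot count, by month, or by year.
enum class StorageType { NumSnaps, Monthly, Yearly };

struct StorageSpecs {
  StorageType type = StorageType::NumSnaps;
  int max_snapshots_in_file = 1;
  int num_snapshots_in_file = 0;
  int curr_idx = -1;  // month or year covered by the open file; -1 if unset

  // Count-based check: only meaningful when type == NumSnaps.
  bool file_is_full () const {
    return num_snapshots_in_file >= max_snapshots_in_file;
  }

  // Type-aware check: can a snapshot at time t go into the current file?
  bool snapshot_fits (const Date& t) const {
    switch (type) {
      case StorageType::Monthly:  return curr_idx == -1 || curr_idx == t.month;
      case StorageType::Yearly:   return curr_idx == -1 || curr_idx == t.year;
      case StorageType::NumSnaps: return num_snapshots_in_file < max_snapshots_in_file;
    }
    return false;  // unreachable for valid enum values
  }

  // Record that a snapshot at time t was written to the current file.
  void record_snapshot (const Date& t) {
    assert (snapshot_fits(t));
    ++num_snapshots_in_file;
    if (type == StorageType::Monthly) curr_idx = t.month;
    if (type == StorageType::Yearly)  curr_idx = t.year;
  }
};
```

With monthly storage in this sketch, the count-based check can report "full" after a single snapshot even though further snapshots from the same month still belong in the file, which is why a type-aware check is the safer thing to query.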

This is an excellent point. Thank you for pointing this out.

If the simulation writes a restart file and the rpointers are all consistent, and then the simulation fails, will a file that is still open (for a year, for example) still have the risk of containing 0s instead of flushed data in a time period that is prior to the valid restart? I.e., could we still end up with 0 data?

Perhaps this is already done, but it seems to me the only way to guarantee things is by flushing all files prior to writing the rpointer file.

@ambrad we also have a separate todo item: every time we write a rhist file, also flush the corresponding output file.

(And the other todo item is significantly more important than this fwiw)

@ambrad was this just a question to ensure we're not forgetting anything, or do you have a scenario where you think that would/should be the case (so flushing at rhist write would not be enough)?

Could we have a flush_all type call that is initiated whenever a restart is written? Wouldn't that remedy the concern of having all 0's if a fail occurs after a restart is written?

@bartgol, right, to ensure we're not forgetting anything. But I agree with Aaron that if the AD has a list of open write-mode files, why not just iterate through the list and flush them all before writing the rpointer file? You'd then skip more file-specific flushing. I don't think there can be an open write-mode file that shouldn't be flushed at a restart write.

I don't think the AD has a handle to all files directly. But each output manager has a handle to its output and rhist files, so when the rhist file is written, it can flush the output one too.

And as Naser said, in EAMxx (for some technical reasons) we always write a rhist file, even if there is no "restart data" (e.g., for INSTANT output) and even if we just wrote in the output file. So flushing the .h file when the corresponding .rhist is flushed should cover every output file.
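
A rough sketch of that idea, assuming each output manager owns exactly one output file and one rhist file. `FakeIO`, `OutputManagerSketch`, and `write_rhist` are hypothetical names used only for illustration; they are not the actual EAMxx or SCORPIO interfaces.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical I/O backend that only records which files were flushed.
struct FakeIO {
  std::vector<std::string> flushed;
  void flush_file (const std::string& name) { flushed.push_back(name); }
};

// Hypothetical, stripped-down output manager: it holds one output (.h) file
// and one restart-history (.rhist) file.
struct OutputManagerSketch {
  FakeIO*     io;
  std::string output_filename;
  std::string rhist_filename;
  bool        output_is_open;

  // Called whenever the restart-history file is written.
  void write_rhist () {
    assert (io != nullptr);
    // ... the restart-history data would be written here ...
    io->flush_file(rhist_filename);
    // Flush the matching output file too, so that everything written before
    // this restart point is on disk even if the run later exits abnormally.
    if (output_is_open) {
      io->flush_file(output_filename);
    }
  }
};
```

Because the flush is tied to the rhist write inside each manager, nothing outside the manager needs a list of open files.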

bartgol · 2024-10-07T19:27:16Z

components/eamxx/src/share/io/scream_output_manager.cpp

@@ -550,6 +550,12 @@ void OutputManager::run(const util::TimeStamp& timestamp)
      if (filespecs.file_needs_flush()) {
        flush_file (filespecs.filename);
      }
+
+      // Check if we have hit the max number of snapshots and need to close the file
+      if (filespecs.file_is_full()) {


Instead of using this method, we should use

if (filespecs.storage.snapshot_fits (m_output_control.next_write_ts))

Caveat: you need to ensure that m_output_control.compute_next_write_ts() has been called before you attempt to use next_write_ts. I believe at the point where you added these mods, this will be the case, but you may want to double check.

yes, just confirmed that compute_next_write_ts() is called before this chunk of code.

@bartgol do you mean the negation of snapshot_fits?

Ah, yes, good call. @AaronDonahue see Andrew's comment, so don't just copy paste the line I wrote.
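
For clarity, a small sketch of the negated check: after a write, the file is flushed and closed only when the next snapshot would not fit. It assumes monthly storage and uses hypothetical types (`MonthlyStorage`, `FileSketch`); the counter and flag stand in for the real flush and release calls.

```cpp
#include <cassert>

// Hypothetical monthly storage: a file holds snapshots from a single month.
struct MonthlyStorage {
  int curr_month = -1;  // -1 means no snapshot has been written yet
  bool snapshot_fits (int month) const {
    return curr_month == -1 || curr_month == month;
  }
};

struct FileSketch {
  MonthlyStorage storage;
  bool is_open = false;
  int  flush_count = 0;

  // Write one snapshot for the given month into the current file.
  void write (int month) {
    assert (storage.snapshot_fits(month));
    is_open = true;
    storage.curr_month = month;
  }

  // After writing, close the file if the next write would NOT fit in it.
  void close_if_next_does_not_fit (int next_write_month) {
    if (is_open && !storage.snapshot_fits(next_write_month)) {
      ++flush_count;               // stands in for flushing the file
      is_open = false;             // stands in for releasing and closing it
      storage = MonthlyStorage{};  // the next file starts empty
    }
  }
};
```

Without the negation, the file would be closed while the next snapshot still fits, which is the opposite of the intent.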

bartgol · 2024-10-07T19:30:27Z

One more thing: if you switch to closing the file as soon as it's full, then you can also get rid of these lines

    if (filespecs.is_open and not filespecs.storage.snapshot_fits(snapshot_start)) {
      release_file(filespecs.filename);
      filespecs.close();
    }

since we should never hit this scenario anymore.

AaronDonahue · 2024-10-07T19:33:27Z

> > @bartgol I noticed you already had a is_file_full check in the filespecs for IO but it was commented out. So I revived that and used it here to force the flushing. I tested using my other branch from #3032 and it did indeed populate the file that was empty before.
> >
> > Since this PR is dependent on #3032 I am labeling it as WIP until that is merged. But in the meantime I wanted to solicit any comments on this approach.
>
> I'm guessing you wanted to link a different PR? 3032 is this PR...

doh! Too many open windows next to each other. Thanks for the correction, I meant 3031

…r file storage types

AaronDonahue · 2024-10-07T22:21:07Z

> One more thing: if you switch to closing the file as soon as it's full, then you can also get rid of these lines
>
>     if (filespecs.is_open and not filespecs.storage.snapshot_fits(snapshot_start)) {
>       release_file(filespecs.filename);
>       filespecs.close();
>     }
>
> since we should never hit this scenario anymore.

@bartgol do I need the filespecs.close() line for my changes?

bartgol · 2024-10-08T00:00:04Z

> > One more thing: if you switch to closing the file as soon as it's full, then you can also get rid of these lines
> >
> >     if (filespecs.is_open and not filespecs.storage.snapshot_fits(snapshot_start)) {
> >       release_file(filespecs.filename);
> >       filespecs.close();
> >     }
> >
> > since we should never hit this scenario anymore.
>
> @bartgol do I need the filespecs.close() line for my changes?

Aren't you closing the file when you first find out it was full?

E3SM-Autotester · 2024-10-08T12:04:27Z

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6128
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`eabd70f`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5890
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`eabd70f`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)

Pull Request Author: AaronDonahue

E3SM-Autotester · 2024-10-08T13:19:32Z

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6128
Status: FAILED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`eabd70f`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5890
Status: FAILED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`eabd70f`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

SCREAM_PullRequest_Autotester_Weaver # 6128 FAILED (last 100 lines of console output)


CMake Error at /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/ctest_script.cmake:76 (message):
  Test had fails
===============================================================================

Testing '''49f82de844eac302dc95b7489a657e3174301205''' for test '''full_sp_debug'''
RUN: taskset -c 52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/full_sp_debug
Testing '''49f82de844eac302dc95b7489a657e3174301205''' for test '''release'''
RUN: taskset -c 104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/release
Testing '''49f82de844eac302dc95b7489a657e3174301205''' for test '''full_debug'''
RUN: taskset -c 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx/ctest-build/full_debug

Build type full_debug failed at testing time. Here'''s a list of failed tests:

4:io_monthly_np1
Build type full_sp_debug failed at testing time. Here'''s a list of failed tests:

4:io_monthly_np1
Build type release failed at testing time. Here'''s a list of failed tests:

4:io_monthly_np1
Error(s) occurred during test phase

OVERALL STATUS: FAIL

Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver

RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx

weaver failed'

[[ 1 == 0 ]]
[[ weaver == \m\a\p\p\y ]]
set +x

######################################################

FAILS DETECTED:

SCREAM STANDALONE TESTING FAILED!

Build type full_debug failed at testing time. Here's a list of failed tests:

4:io_monthly_np1

Build type full_sp_debug failed at testing time. Here's a list of failed tests:

4:io_monthly_np1
Build type release failed at testing time. Here's a list of failed tests:

4:io_monthly_np1
Error(s) occurred during test phase

OVERALL STATUS: FAIL

Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver

RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6128/scream/components/eamxx

weaver failed

######################################################

Build step 'Execute shell' marked build as failure

Performing Post build task...

Match found for : : True

Logical operation result is TRUE

Running script  : #!/bin/bash -le
cd $WORKSPACE/${BUILD_ID}/
./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins8600751239369240865.sh

POST BUILD TASK : SUCCESS

END OF POST BUILD TASK : 0

Sending e-mails to: [email protected]

Finished: FAILURE

SCREAM_PullRequest_Autotester_Mappy # 5890 FAILED (last 100 lines of console output)


Starting RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 1.777873 seconds (PEND). [COMPLETED 16 of 17]
Finished MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 770.873622 seconds (PASS)
Starting RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 0.733301 seconds (PEND). [COMPLETED 17 of 17]
Waiting for tests to finish
PASS ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20241008_064318_zhxn1f
PASS ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20241008_064318_zhxn1f
PASS ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20241008_064318_zhxn1f
PASS ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20241008_064318_zhxn1f
PASS ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5.C.20241008_064318_zhxn1f
PASS ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20241008_064318_zhxn1f
PASS ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20241008_064318_zhxn1f
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20241008_064318_zhxn1f
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20241008_064318_zhxn1f
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01.C.20241008_064318_zhxn1f
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20241008_064318_zhxn1f
PASS PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20241008_064318_zhxn1f
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20241008_064318_zhxn1f
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep.C.20241008_064318_zhxn1f
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20241008_064318_zhxn1f
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav.C.20241008_064318_zhxn1f
PASS SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 RUN
    Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20241008_064318_zhxn1f
test-scheduler took 2090.8968183994293 seconds'
+ [[ 0 != 0 ]]
+ set +x
######################################################
FAILS DETECTED:
  SCREAM STANDALONE TESTING FAILED!
Build type full_debug failed at testing time. Here's a list of failed tests:
10:io_monthly_np1
11:io_monthly_np2
12:io_monthly_np3
13:io_monthly_np4
Build type full_sp_debug failed at testing time. Here's a list of failed tests:

10:io_monthly_np1

11:io_monthly_np2

12:io_monthly_np3

13:io_monthly_np4
Build type debug_nopack_fpe failed at testing time. Here's a list of failed tests:

10:io_monthly_np1

11:io_monthly_np2

12:io_monthly_np3

13:io_monthly_np4
Build type release failed at testing time. Here's a list of failed tests:

10:io_monthly_np1

11:io_monthly_np2

12:io_monthly_np3

13:io_monthly_np4
Error(s) occurred during test phase

OVERALL STATUS: FAIL

Starting analysis on mappy with cmd: cd /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5890/scream/components/eamxx && source /projects/sems/modulefiles/utils/sems-modules-init.sh && module purge && module load sems-cmake/3.27.9 sems-git/2.42.0 sems-gcc/11.4.0 sems-openmpi-no-cuda/4.1.6 sems-netcdf-c/4.9.2 sems-netcdf-cxx/4.2 sems-netcdf-fortran/4.6.1 sems-parallel-netcdf/1.12.3 sems-openblas && export GATOR_INITIAL_MB=4000MB && export OMP_PROC_BIND=spread && true &&  ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m mappy

RUN: cd /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5890/scream/components/eamxx && source /projects/sems/modulefiles/utils/sems-modules-init.sh && module purge && module load sems-cmake/3.27.9 sems-git/2.42.0 sems-gcc/11.4.0 sems-openmpi-no-cuda/4.1.6 sems-netcdf-c/4.9.2 sems-netcdf-cxx/4.2 sems-netcdf-fortran/4.6.1 sems-parallel-netcdf/1.12.3 sems-openblas && export GATOR_INITIAL_MB=4000MB && export OMP_PROC_BIND=spread && true &&  ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m mappy

FROM: /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5890/scream/components/eamxx

mappy failed

######################################################

Build step 'Execute shell' marked build as failure

$ ssh-agent -k

unset SSH_AUTH_SOCK;

unset SSH_AGENT_PID;

echo Agent pid 3743062 killed;

[ssh-agent] Stopped.

Performing Post build task...

Match found for : : True

Logical operation result is TRUE

Running script  : #!/bin/bash -le
cd $WORKSPACE/${BUILD_ID}/
./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
We're having issues with some test-launcher job hanging forever. So let's make sure we clean all pending test-launcher jobs
squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel
[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins11948053557837947249.sh

POST BUILD TASK : SUCCESS

END OF POST BUILD TASK : 0

Sending e-mails to: [email protected]

Finished: FAILURE

AaronDonahue · 2024-10-08T16:03:45Z

> > > One more thing: if you switch to closing the file as soon as it's full, then you can also get rid of these lines
> > >
> > >     if (filespecs.is_open and not filespecs.storage.snapshot_fits(snapshot_start)) {
> > >       release_file(filespecs.filename);
> > >       filespecs.close();
> > >     }
> > >
> > > since we should never hit this scenario anymore.
> >
> > @bartgol do I need the filespecs.close() line for my changes?
>
> Aren't you closing the file when you first find out it was full?

Oh duh. Sorry, when I read this line just to delete it I thought it was "closing the FileSpecs" object which I didn't remember doing. But yes, I already have this line...

AaronDonahue · 2024-10-08T16:06:11Z

Looks like we have a fail. I'll investigate

E3SM-Autotester · 2024-10-11T15:09:24Z

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6151
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5908
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)

Pull Request Author: AaronDonahue

E3SM-Autotester · 2024-10-11T16:20:17Z

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6151
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5908
Status: FAILED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

SCREAM_PullRequest_Autotester_Weaver # 6151 PASSED (last 100 lines of console output)


        Start 143: model_restart
143/157 Test #143: model_restart .........................................................   Passed    7.09 sec
        Start 144: restarted_vs_monolithic_check_np1
144/157 Test #144: restarted_vs_monolithic_check_np1 .....................................   Passed    0.13 sec
        Start 145: homme_shoc_cld_spa_p3_rrtmgp_np1
145/157 Test #145: homme_shoc_cld_spa_p3_rrtmgp_np1 ......................................   Passed    6.17 sec
        Start 146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
146/157 Test #146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp .............................   Passed    0.12 sec
        Start 147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
147/157 Test #147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1 ............................   Passed    8.70 sec
        Start 148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
148/157 Test #148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1 .................   Passed    1.42 sec
        Start 149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
149/157 Test #149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp ...................   Passed    0.65 sec
        Start 150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1
150/157 Test #150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1 ...............................   Passed   13.02 sec
        Start 151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp
151/157 Test #151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp ......................   Passed    0.11 sec
        Start 152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1
152/157 Test #152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1 ...............................   Passed   16.52 sec
        Start 153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
153/157 Test #153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp ......................   Passed    0.13 sec
        Start 154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1
154/157 Test #154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1 ............   Passed   17.64 sec
        Start 155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp
155/157 Test #155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp ...   Passed    0.23 sec
        Start 156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1
156/157 Test #156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1 .........................   Passed   37.38 sec
        Start 157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp
157/157 Test #157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp ................   Passed    0.14 sec
100% tests passed, 0 tests failed out of 157
Label Time Summary:
baseline_cmp               = 139.27 sec*proc (23 tests)
baseline_gen               = 337.32 sec*proc (25 tests)
bfbhash                    =   0.89 sec*proc (1 test)
check                      =   0.95 sec*proc (1 test)
cld                        =  44.63 sec*proc (7 tests)
cld_fraction               =   4.25 sec*proc (1 test)
cxx baseline_cmp           =  10.10 sec*proc (2 tests)
diagnostics                =  42.67 sec*proc (23 tests)
driver                     =  95.77 sec*proc (16 tests)
dynamics                   =   8.13 sec*proc (3 tests)
fail                       =  30.03 sec*proc (5 tests)
io                         =  57.02 sec*proc (14 tests)
mam4_aci                   =  23.31 sec*proc (4 tests)
mam4_constituent_fluxes    =   7.55 sec*proc (1 test)
mam4_drydep                =   3.63 sec*proc (1 test)
mam4_optics                =   4.06 sec*proc (1 test)
mam4_srf_online_emiss      =   7.55 sec*proc (1 test)
mam4_wetscav               =  24.65 sec*proc (2 tests)
nudging                    =  11.94 sec*proc (2 tests)
p3                         = 111.84 sec*proc (12 tests)
p3_sk                      =  31.68 sec*proc (2 tests)
physics                    = 189.05 sec*proc (27 tests)
remap                      =   5.68 sec*proc (1 test)
rrtmgp                     =  43.12 sec*proc (11 tests)
shoc                       =  59.08 sec*proc (13 tests)
spa                        =  11.40 sec*proc (4 tests)
surface_coupling           =   4.28 sec*proc (1 test)
Total Test time (real) = 816.28 sec
Testing '''296cfb1368a106ccc6f0084ca29f00cb68ae5fa1''' for test '''full_sp_debug'''
RUN: taskset -c 52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/full_sp_debug
Testing '''296cfb1368a106ccc6f0084ca29f00cb68ae5fa1''' for test '''full_debug'''
RUN: taskset -c 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/full_debug
Testing '''296cfb1368a106ccc6f0084ca29f00cb68ae5fa1''' for test '''release'''
RUN: taskset -c 104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx/ctest-build/release

OVERALL STATUS: PASS

Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver

RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver

FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6151/scream/components/eamxx

Completed analysis on weaver'

[[ 0 != 0 ]]
[[ 1 == 0 ]]
[[ weaver == \m\a\p\p\y ]]
set +x

Performing Post build task...

Match found for : : True

Logical operation result is TRUE

Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/
./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins1216082531371680173.sh

POST BUILD TASK : SUCCESS

END OF POST BUILD TASK : 0

Sending e-mails to: [email protected]

Finished: SUCCESS

SCREAM_PullRequest_Autotester_Mappy # 5908 FAILED (last 100 lines of console output)


[ 63%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_cloud_rain_acc.cpp.o
[ 63%] Linking CXX static library libnudging.a
[ 63%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_compute_shoc_temperature_disp.cpp.o
[ 63%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_calc_rime_density.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_diag_obklen_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_cldliq_imm_freezing.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_rain_imm_freezing.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_pblintd_cldcheck.cpp.o
[ 64%] Built target nudging
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_droplet_self_coll.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_evaporate_rain.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_pblintd_height.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/eti/p3_prevent_liq_supersaturation.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_impose_max_total_ni.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_pblintd_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_calc_liq_relaxation_timescale.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_length_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_check_values_impl_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_ice_sed_impl_disp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_tke_disp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_update_prognostics_implicit_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_main_impl_part1_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_ice_relaxation_timescale.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_ice_nucleation.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_main_impl_part3_disp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_diag_second_shoc_moments_disp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_pblintd_init_pot.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_diag_third_shoc_moments_disp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_pblintd_surf_temp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_tke.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_tridiag_solver.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_update_host_dse.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_ice_cldliq_wet_growth.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_assumed_pdf_disp.cpp.o
[ 64%] Building CXX object src/physics/shoc/CMakeFiles/shoc_sk.dir/disp/shoc_update_host_dse_disp.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_check_values.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_incloud_mixingratios.cpp.o
[ 64%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_subgrid_variance_scaling.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_main.cpp.o
[ 65%] Building CXX object src/physics/shoc/CMakeFiles/shoc.dir/eti/shoc_update_prognostics_implicit.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_cloud_sed_impl_disp.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_main_part1.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_main_part2.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_main_part3.cpp.o
[ 65%] Building Fortran object src/physics/shoc/CMakeFiles/shoc_sk.dir/home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5908/scream/components/eam/src/physics/cam/shoc.F90.o
[ 65%] Building Fortran object src/physics/shoc/CMakeFiles/shoc.dir/home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5908/scream/components/eam/src/physics/cam/shoc.F90.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_main_impl_disp.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_main_impl_part2_disp.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3_sk.dir/disp/p3_rain_sed_impl_disp.cpp.o
[ 65%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_ice_supersat_conservation.cpp.o
[ 66%] Building Fortran object src/physics/p3/CMakeFiles/p3_sk.dir/p3_iso_c.f90.o
[ 66%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_nc_conservation.cpp.o
[ 66%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_nr_conservation.cpp.o
[ 66%] Building Fortran object src/physics/shoc/CMakeFiles/shoc.dir/shoc_iso_c.f90.o
[ 66%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_ni_conservation.cpp.o
[ 66%] Building CXX object src/physics/p3/CMakeFiles/p3.dir/eti/p3_prevent_liq_supersaturation.cpp.o
[ 66%] Building Fortran object src/physics/shoc/CMakeFiles/shoc_sk.dir/shoc_iso_c.f90.o
[ 66%] Building Fortran object src/physics/p3/CMakeFiles/p3.dir/p3_iso_c.f90.o
[ 66%] Linking CXX static library libshoc_sk.a
[ 66%] Linking CXX static library libshoc.a
[ 66%] Built target shoc_sk
[ 66%] Linking CXX static library libp3_sk.a
[ 66%] Linking CXX static library libp3.a
[ 66%] Built target shoc
[ 66%] Built target p3
[ 66%] Built target p3_sk
[ 66%] Linking CXX static library libmam.a
[ 66%] Built target mam
gmake: *** [Makefile:166: all] Error 2
Error(s) occurred during test phase

OVERALL STATUS: FAIL

Starting analysis on mappy with cmd: cd /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5908/scream/components/eamxx && source /projects/sems/modulefiles/utils/sems-modules-init.sh && module purge && module load sems-cmake/3.27.9 sems-git/2.42.0 sems-gcc/11.4.0 sems-openmpi-no-cuda/4.1.6 sems-netcdf-c/4.9.2 sems-netcdf-cxx/4.2 sems-netcdf-fortran/4.6.1 sems-parallel-netcdf/1.12.3 sems-openblas && export GATOR_INITIAL_MB=4000MB && export OMP_PROC_BIND=spread && true &&  ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m mappy

RUN: cd /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5908/scream/components/eamxx && source /projects/sems/modulefiles/utils/sems-modules-init.sh && module purge && module load sems-cmake/3.27.9 sems-git/2.42.0 sems-gcc/11.4.0 sems-openmpi-no-cuda/4.1.6 sems-netcdf-c/4.9.2 sems-netcdf-cxx/4.2 sems-netcdf-fortran/4.6.1 sems-parallel-netcdf/1.12.3 sems-openblas && export GATOR_INITIAL_MB=4000MB && export OMP_PROC_BIND=spread && true &&  ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m mappy

FROM: /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5908/scream/components/eamxx

mappy failed

######################################################

Build step 'Execute shell' marked build as failure

$ ssh-agent -k

unset SSH_AUTH_SOCK;

unset SSH_AGENT_PID;

echo Agent pid 921 killed;

[ssh-agent] Stopped.

Performing Post build task...

Match found for : : True

Logical operation result is TRUE

Running script  : #!/bin/bash -le
cd $WORKSPACE/${BUILD_ID}/
./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
We're having issues with some test-launcher job hanging forever. So let's make sure we clean all penting test-launcher jobs
squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel
[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins8936001311501249363.sh

POST BUILD TASK : SUCCESS

END OF POST BUILD TASK : 0

Sending e-mails to: [email protected]

Finished: FAILURE

E3SM-Autotester · 2024-10-15T15:10:52Z

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

E3SM-Autotester · 2024-10-15T15:13:55Z

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6157
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;AT: RETEST;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5913
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;AT: RETEST;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)

Pull Request Author: AaronDonahue

mahf708

Minor comments, mainly to push you to clarify changes/comments.

Also, would you please rebase this PR if more commits are needed? That way we could keep the already-merged changes from showing up here...

mahf708 · 2024-10-15T15:23:18Z

components/eamxx/src/share/io/scream_io_control.hpp

+
+  void set_dt (const double dt_in) {
+    EKAT_REQUIRE_MSG (dt==0 or dt==dt_in,
+        "[IOControl::set_dt] Error! Cannot reset dt once it is set.\n");


Like above.
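
As a point of reference, the "set once" behavior that the check in this diff enforces can be sketched as follows. This is only an illustration under assumptions: `IOControlSketch` is a hypothetical stand-in for the real `IOControl`, a plain `std::runtime_error` replaces `EKAT_REQUIRE_MSG`, and the final assignment `dt = dt_in` is assumed since the diff excerpt does not show it.

```cpp
#include <stdexcept>

// Hypothetical stand-in for IOControl; only the dt handling is sketched.
struct IOControlSketch {
  double dt = 0;  // 0 is treated as "not set yet"

  void set_dt (const double dt_in) {
    // Setting the same value again is harmless; changing it is an error.
    if (dt != 0 && dt != dt_in) {
      throw std::runtime_error(
          "[IOControl::set_dt] Error! Cannot reset dt once it is set.\n");
    }
    dt = dt_in;  // assumed: the excerpt above stops before this point
  }
};
```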

mahf708 · 2024-10-15T15:27:29Z

components/eamxx/src/share/io/scream_output_manager.cpp

+    // computed next_write_ts=last_write_ts (in terms of date:time, the num_steps is correct).
+    // This means that at that time we deemed that the next_write_ts definitely fit in the same
+    // file as last_write_ts (date/time are the same!), which may or may not be true for non NumSnaps
+    // storage. To fix this, we recompute next_write_ts here, and close the file if it doesn't.


minor: doesn't ... what? fit?

Overall, this seems a bit too complex? Some questions:

when did we open the "currently open file"?

why can't we have all the info we need to determine if we can close it after write? If we are deciding to flush and close full files (per title of PR), then can't we deduce that if the number of snaps in file == max number of snaps then close based on that?

That's true for storage type "NumSnaps". But with type "one_month" (say), we can't say if the file is full unless we know the time stamp of the next write.

So, at a given time step, we know (can calculate) its time stamp, right? Then, why can't we deduce if one_month or one_day is ending here? Do we not know dt?

I understand the logic can be too convoluted, but it is still doable, no?

We don't have to do it now, but trying to understand if it is doable at all (ignoring the fact that we may choose not to make the code super ugly for some corner case)

We cannot compute the next time stamp during t0 output. The driver does not have info about dt during the init sequence, which is when the OM is set up. When I originally designed the driver, I wanted to separate concerns as much as possible. In my mind, dt was a "run" time param, not an "init" time param (I didn't even know if dt could in principle change dynamically down the road).

If you want to compute next_write_ts during t0 output (which, again, happens during the init sequence), we need to pass dt to the driver init methods (from the f90 cpl interface). We can of course do that. And all in all, it may make the code simpler. It's a slightly deeper interface change though, so we could do it as a follow up PR.
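
To make the distinction in this exchange concrete, here is a rough sketch of why a snapshot-count limit can be checked right after a write while a calendar-based limit cannot. All names here (`StorageType`, `DateSketch`, `StorageSketch`, `full_after_write`) are hypothetical and simplified; only `snapshot_fits` echoes a name that appears earlier in the thread, and its real signature may differ.

```cpp
// Hypothetical, simplified storage types: by snapshot count or by calendar month.
enum class StorageType { NumSnaps, Monthly };

// Hypothetical, simplified time stamp (the real one carries date, time, and step count).
struct DateSketch {
  int year;
  int month;
};

struct StorageSketch {
  StorageType type          = StorageType::NumSnaps;
  int         num_snapshots = 0;
  int         max_snapshots = 1;
  DateSketch  first_snapshot {0, 0};  // date of the first snapshot in the file (Monthly only)

  // Can we declare the file full right after a write, with no knowledge
  // of when the next write happens? Only for the snapshot-count case.
  bool full_after_write () const {
    return type == StorageType::NumSnaps && num_snapshots >= max_snapshots;
  }

  // Does a snapshot at next_write belong in this file? For Monthly storage
  // this needs the next write's time stamp, hence dt.
  bool snapshot_fits (const DateSketch& next_write) const {
    if (type == StorageType::NumSnaps) {
      return num_snapshots < max_snapshots;
    }
    return next_write.year  == first_snapshot.year &&
           next_write.month == first_snapshot.month;
  }
};
```

In this sketch the Monthly case always answers `false` from `full_after_write`, which mirrors the point above: without dt at t0 there is no next time stamp to hand to `snapshot_fits`.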

mahf708 · 2024-10-15T15:32:38Z

components/eamxx/src/share/io/scream_output_manager.cpp

+    // In case REST_OPT=nsteps, don't count t0 output as one of those steps
+    // NOTE: for m_output_control, it doesn't matter, since it'll be reset to 0 before we return


Hmmmmm.... I see where things get weird! I wonder if shifting the indexing altogether can help?

To be clear, this small issue is unrelated to closing the file at the right time. To be honest, I think we could flat out remove the line that updates nsamples_since_last_write for checkpoint control: it is not used anyway! And since, as the comment states, for output control it doesn't matter, we may as well remove this if block altogether...

mahf708 · 2024-10-15T15:33:22Z

components/eamxx/src/share/io/scream_output_manager.hpp

@@ -118,6 +118,10 @@ class OutputManager
  void finalize();

  long long res_dep_memory_footprint () const;
+
+  // For debug and testing purposes


Mind elaborating what debug and testing we mean here?

Well, I want to be able to verify the correctness of the control/filespecs structs during unit tests. The comment was meant to say "this is not really needed at runtime".

components/eamxx/src/share/tests/eamxx_time_interpolation_tests.cpp

E3SM-Autotester · 2024-10-15T16:28:16Z

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6157
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;AT: RETEST;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5913
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;AT: RETEST;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

E3SM-Autotester · 2024-10-15T16:28:34Z

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS BEEN REVIEWED, BUT NOT ACCEPTED OR REQUIRES CHANGES!

E3SM-Autotester · 2024-10-15T16:28:41Z

All Jobs Finished; status = PASSED, target_sha=0422f5754bde808b99f738c9dca1af4379634fbe, However Inspection must be performed before merge can occur...

E3SM-Autotester · 2024-10-15T16:35:14Z

The base branch has been updated since the last successful testing.

last PASS base branch sha: 0422f57
current base branch sha : 75ef2ed
The AutoTester will discard the last PASS, and re-test the PR from scratch

E3SM-Autotester · 2024-10-15T16:38:28Z

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6158
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5914
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)

Pull Request Author: AaronDonahue

E3SM-Autotester · 2024-10-15T17:53:43Z

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6158
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5914
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

E3SM-Autotester · 2024-10-15T17:54:07Z

All Jobs Finished; status = PASSED, target_sha=75ef2edca2837f1f61601a83fba2b559b26df85f, However Inspection must be performed before merge can occur...

E3SM-Autotester · 2024-10-16T18:11:23Z

All Jobs Finished; status = PASSED, target_sha=75ef2edca2837f1f61601a83fba2b559b26df85f, However Inspection must be performed before merge can occur...

E3SM-Autotester · 2024-10-16T20:52:30Z

The base branch has been updated since the last successful testing.

last PASS base branch sha: 75ef2ed
current base branch sha : 10fd3d0
The AutoTester will discard the last PASS, and re-test the PR from scratch

E3SM-Autotester · 2024-10-16T20:53:44Z

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6171
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5923
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)

Pull Request Author: AaronDonahue

E3SM-Autotester · 2024-10-16T21:09:41Z

Status Flag 'Pull Request AutoTester' - Error: Jenkins Jobs - A user has pushed a change to the PR before testing completed. NEW EVENT 'committed', ID C_kwDOCEfuetoAKDAzM2QzMWUxMjg5OGNlN2YwZjdiYjA1NzMyZWEyMzNiYjliOGNhYzM... The Jenkins Jobs will be shutdown; Testing of this PR must occur again.

E3SM-Autotester · 2024-10-16T21:10:16Z

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6171
Status: ERROR

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5923
Status: ERROR

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`c99a7d9`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

SCREAM_PullRequest_Autotester_Weaver # 6171 ERROR (last 100 lines of console output)


PYTHON_BIN=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/bin;
export PYTHON_BIN;
PYTHON_INC=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/include;
export PYTHON_INC;
PYTHON_LIB=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib;
export PYTHON_LIB;
PYTHON_ROOT=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe;
export PYTHON_ROOT;
PYTHON_VERSION=3.10.8;
export PYTHON_VERSION;
_LMFILES_=/projects/ppc64le-pwr9-rhel8/modulefiles/lmod/utilities/linux-rhel8-ppc64le/Core/python/3.10.8.lua;
export _LMFILES_;
_ModuleTable001_=X01vZHVsZVRhYmxlXyA9IHsKTVR2ZXJzaW9uID0gMywKY19yZWJ1aWxkVGltZSA9IGZhbHNlLApjX3Nob3J0VGltZSA9IGZhbHNlLApkZXB0aFQgPSB7fSwKZmFtaWx5ID0ge30sCm1UID0gewpweXRob24gPSB7CmZuID0gIi9wcm9qZWN0cy9wcGM2NGxlLXB3cjktcmhlbDgvbW9kdWxlZmlsZXMvbG1vZC91dGlsaXRpZXMvbGludXgtcmhlbDgtcHBjNjRsZS9Db3JlL3B5dGhvbi8zLjEwLjgubHVhIiwKZnVsbE5hbWUgPSAicHl0aG9uLzMuMTAuOCIsCmxvYWRPcmRlciA9IDEsCnByb3BUID0ge30sCnN0YWNrRGVwdGggPSAwLApzdGF0dXMgPSAiYWN0aXZlIiwKdXNlck5hbWUgPSAicHl0aG9uLzMuMTAuOCIsCndWID0gIjAwMDAwMDAwMy4wMDAwMDAwMTAuMDAwMDAwMDA4Lip6;
export _ModuleTable001_;
_ModuleTable002_=ZmluYWwiLAp9LAp9LAptcGF0aEEgPSB7CiIvcHJvamVjdHMvcHBjNjRsZS1wd3I5LXJoZWw4L21vZHVsZWZpbGVzL2xtb2QvY29tcGlsZXJzIiwgIi9wcm9qZWN0cy9wcGM2NGxlLXB3cjktcmhlbDgvbW9kdWxlZmlsZXMvbG1vZC91dGlsaXRpZXMvbGludXgtcmhlbDgtcHBjNjRsZS9Db3JlIiwKfSwKc3lzdGVtQmFzZU1QQVRIID0gIi9wcm9qZWN0cy9wcGM2NGxlLXB3cjktcmhlbDgvbW9kdWxlZmlsZXMvbG1vZC9jb21waWxlcnM6L3Byb2plY3RzL3BwYzY0bGUtcHdyOS1yaGVsOC9tb2R1bGVmaWxlcy9sbW9kL3V0aWxpdGllcy9saW51eC1yaGVsOC1wcGM2NGxlL0NvcmUiLAp9Cg==;
export _ModuleTable002_;
_ModuleTable_Sz_=2;
export _ModuleTable_Sz_;'
+++ __LMOD_REF_COUNT_CMAKE_PREFIX_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe:1
+++ export __LMOD_REF_COUNT_CMAKE_PREFIX_PATH
+++ CMAKE_PREFIX_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe
+++ export CMAKE_PREFIX_PATH
+++ __LMOD_REF_COUNT_LD_LIBRARY_PATH='/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib:1;/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/lib:1'
+++ export __LMOD_REF_COUNT_LD_LIBRARY_PATH
+++ LD_LIBRARY_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib:/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/lib
+++ export LD_LIBRARY_PATH
+++ __LMOD_REF_COUNT_LIBRARY_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib:1
+++ export __LMOD_REF_COUNT_LIBRARY_PATH
+++ LIBRARY_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib
+++ export LIBRARY_PATH
+++ LOADEDMODULES=python/3.10.8
+++ export LOADEDMODULES
+++ __LMOD_REF_COUNT_MANPATH='/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/share/man:1;/opt/lsf/10.1/man:1'
+++ export __LMOD_REF_COUNT_MANPATH
+++ MANPATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/share/man:/opt/lsf/10.1/man::
+++ export MANPATH
+++ __LMOD_REF_COUNT_MODULEPATH='/projects/ppc64le-pwr9-rhel8/modulefiles/lmod/compilers:1;/projects/ppc64le-pwr9-rhel8/modulefiles/lmod/utilities/linux-rhel8-ppc64le/Core:1'
+++ export __LMOD_REF_COUNT_MODULEPATH
+++ MODULEPATH=/projects/ppc64le-pwr9-rhel8/modulefiles/lmod/compilers:/projects/ppc64le-pwr9-rhel8/modulefiles/lmod/utilities/linux-rhel8-ppc64le/Core
+++ export MODULEPATH
+++ __LMOD_REF_COUNT_PATH='/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/bin:1;/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/etc:1;/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/bin:1;/usr/local/bin:1;/usr/bin:1;/usr/local/sbin:1;/usr/sbin:1;/home/e3sm-jenkins/.local/bin:1;/home/e3sm-jenkins/bin:1'
+++ export __LMOD_REF_COUNT_PATH
+++ PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/bin:/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/etc:/opt/lsf/10.1/linux3.10-glibc2.17-ppc64le/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/e3sm-jenkins/.local/bin:/home/e3sm-jenkins/bin
+++ export PATH
+++ __LMOD_REF_COUNT_PKG_CONFIG_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib/pkgconfig:1
+++ export __LMOD_REF_COUNT_PKG_CONFIG_PATH
+++ PKG_CONFIG_PATH=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib/pkgconfig
+++ export PKG_CONFIG_PATH
+++ PYTHON_BIN=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/bin
+++ export PYTHON_BIN
+++ PYTHON_INC=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/include
+++ export PYTHON_INC
+++ PYTHON_LIB=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe/lib
+++ export PYTHON_LIB
+++ PYTHON_ROOT=/projects/ppc64le-pwr9-rhel8/utilities/python/3.10.8/gcc/8.3.1/base/qmix2fe
+++ export PYTHON_ROOT
+++ PYTHON_VERSION=3.10.8
+++ export PYTHON_VERSION
+++ _LMFILES_=/projects/ppc64le-pwr9-rhel8/modulefiles/lmod/utilities/linux-rhel8-ppc64le/Core/python/3.10.8.lua
+++ export _LMFILES_
+++ _ModuleTable001_=X01vZHVsZVRhYmxlXyA9IHsKTVR2ZXJzaW9uID0gMywKY19yZWJ1aWxkVGltZSA9IGZhbHNlLApjX3Nob3J0VGltZSA9IGZhbHNlLApkZXB0aFQgPSB7fSwKZmFtaWx5ID0ge30sCm1UID0gewpweXRob24gPSB7CmZuID0gIi9wcm9qZWN0cy9wcGM2NGxlLXB3cjktcmhlbDgvbW9kdWxlZmlsZXMvbG1vZC91dGlsaXRpZXMvbGludXgtcmhlbDgtcHBjNjRsZS9Db3JlL3B5dGhvbi8zLjEwLjgubHVhIiwKZnVsbE5hbWUgPSAicHl0aG9uLzMuMTAuOCIsCmxvYWRPcmRlciA9IDEsCnByb3BUID0ge30sCnN0YWNrRGVwdGggPSAwLApzdGF0dXMgPSAiYWN0aXZlIiwKdXNlck5hbWUgPSAicHl0aG9uLzMuMTAuOCIsCndWID0gIjAwMDAwMDAwMy4wMDAwMDAwMTAuMDAwMDAwMDA4Lip6
+++ export _ModuleTable001_
+++ _ModuleTable002_=ZmluYWwiLAp9LAp9LAptcGF0aEEgPSB7CiIvcHJvamVjdHMvcHBjNjRsZS1wd3I5LXJoZWw4L21vZHVsZWZpbGVzL2xtb2QvY29tcGlsZXJzIiwgIi9wcm9qZWN0cy9wcGM2NGxlLXB3cjktcmhlbDgvbW9kdWxlZmlsZXMvbG1vZC91dGlsaXRpZXMvbGludXgtcmhlbDgtcHBjNjRsZS9Db3JlIiwKfSwKc3lzdGVtQmFzZU1QQVRIID0gIi9wcm9qZWN0cy9wcGM2NGxlLXB3cjktcmhlbDgvbW9kdWxlZmlsZXMvbG1vZC9jb21waWxlcnM6L3Byb2plY3RzL3BwYzY0bGUtcHdyOS1yaGVsOC9tb2R1bGVmaWxlcy9sbW9kL3V0aWxpdGllcy9saW51eC1yaGVsOC1wcGM2NGxlL0NvcmUiLAp9Cg==
+++ export _ModuleTable002_
+++ _ModuleTable_Sz_=2
+++ export _ModuleTable_Sz_
++ SCREAM_MACHINE=weaver
+ [[ 0 == 1 ]]
+ [[ 0 == 1 ]]
+ [[ 0 == 1 ]]
++ whoami
+ [[ e3sm-jenkins == \e\3\s\m\-\j\e\n\k\i\n\s ]]
+ git config --local user.email [email protected]
+ git config --local user.name 'Jenkins Jenkins'
+ declare -i fails=0
+ BASELINES_DIR=AUTO
+ TAS_ARGS='--baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m $machine'
+ [[ weaver == \p\m\-\g\p\u ]]
+ set +e
+ '[' -n 3032 ']'
+ is_at_run=1
+ SA_FAILURES_DETAILS=
+ '[' 1 -eq 1 ']'
++ ./scripts/gather-all-data './scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m $machine' -l -m weaver
***Forced exclusive execution
<>
<>
Build was aborted
Aborted by Luca Bertagna
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le
cd $WORKSPACE/${BUILD_ID}/
./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins7076630350333083630.sh

./scream/components/eamxx/scripts/jenkins/jenkins_common.sh: line 7: 905779 Broken pipe             $JENKINS_SCRIPT_DIR/jenkins_common_impl.sh 2>&1

905780 Terminated              | tee JENKINS_$DATE_STAMP

SCREAM_PullRequest_Autotester_Mappy # 5923 ERROR (last 100 lines of console output)


   prescribed_wind: no
************** General run info **********************
ncols: 218

nlevs: 72

npacks: 5

league_size: 218

team_size: 1

concurrent teams: 1

P3_INIT (reading/creating look-up tables) ...

Using memory pool. Initial size: 3.90625GB ;  Grow size: 3.90625GB.

INFORM: Automatically inserting fence() after every parallel_for

[EAMxx] initialize_atm_procs ... done!

[EAMxx::init] resolution-dependent device memory footprint: 60.849512MB

[EAMxx] initialize_output_managers ...

[EAMxx::output_manager] - Writing model-output:

[EAMxx::output_manager]      FILE: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep.INSTANT.nsteps_x4.np1.2021-10-12-45000.nc

[EAMxx::scorpio_output] Writing variables to file

file name: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep.INSTANT.nsteps_x4.np1.2021-10-12-45000.nc

Done! Elapsed time: 0.000000 seconds

[EAMxx::scorpio_output] Writing variables to file

file name: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep.INSTANT.nsteps_x4.np1.2021-10-12-45000.nc

Done! Elapsed time: 0.012000 seconds

[EAMxx::output_manager] - New Output stream

Filename prefix: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep

Run t0: 2021-10-12-45000

Case t0: 2021-10-12-45000

Reference t0: 2021-10-12-45000

Is Restart File ?: NO

Is Restarted Run ?: NO

Averaging Type: INSTANT

Output Frequency: 4 nsteps

File Capacity: 1snapshots

Includes Grid Data ?: YES

[EAMxx] initialize_output_managers ... done!

Start time stepping loop...       [  0%]

Atmosphere step = 0

model start-of-step time = 2021-10-12 12:30:00
WARNING: Failed and repaired post-condition property check.


Atmosphere process name: homme


Property check name: tracers lower bound check: 0


Atmosphere process MPI Rank: 0


Message: Check failed.


check name: tracers lower bound check: 0


field id: tracers[Physics GLL] double:ncol,dim,lev(218,41,72) [1]


minimum:

value: -3.13818e-53
indices (w/ global column index): (54,13,38)
lat/lon: (-31.9482, 77.5623)



maximum:

value: 1.05706e+10
indices (w/ global column index): (47,20,70)
lat/lon: (44.3197, 12.4377)



Iteration   1 completed       [ 25%]

Atmosphere step = 1

model start-of-step time = 2021-10-12 13:00:00


WARNING: Failed and repaired post-condition property check.


Atmosphere process name: homme


Property check name: tracers lower bound check: 0


Atmosphere process MPI Rank: 0


Message: Check failed.


check name: tracers lower bound check: 0


field id: tracers[Physics GLL] double:ncol,dim,lev(218,41,72) [1]


minimum:

value: -4.29349e-61
indices (w/ global column index): (60,13,34)
lat/lon: (0, 77.5623)



maximum:

value: 1.73157e+10
indices (w/ global column index): (47,20,70)
lat/lon: (44.3197, 12.4377)



Iteration   2 completed       [ 50%]

Atmosphere step = 2

model start-of-step time = 2021-10-12 13:30:00


srun: forcing job termination

slurmstepd: error: *** STEP 197802.0 ON localhost CANCELLED AT 2024-10-16T15:10:03 ***

srun: Job step aborted: Waiting up to 32 seconds for job step to finish.

srun: error: localhost: task 0: Killed
    Start  41: output_restart_check_AVERAGE_np4

251/550 Test  #37: output_restart_check_INSTANT_np4 ......................................   Passed    0.04 sec

Start 454: p3_mam4_wetscav_np3_vs_np1

252/550 Test  #41: output_restart_check_AVERAGE_np4 ......................................   Passed    0.04 sec

Start 461: shoc_cldfrac_p3_wetscav_np3_vs_np1'

Terminated

POST BUILD TASK : SUCCESS

END OF POST BUILD TASK : 0

Finished: ABORTED

mahf708

🎉

The merge-base changed after approval.

I contributed to the PR, so I won't be a reviewer anymore

mahf708

by order of the peaky blinders

The merge-base changed after approval.

bartgol · 2024-10-16T23:10:08Z

I'm not sure what gh is doing with this weird dismissal of the review. If testing passes, we'll just merge manually.

E3SM-Autotester · 2024-10-17T02:58:42Z

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6174
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`033d31e`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5926
Status: STARTED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`033d31e`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)

Pull Request Author: AaronDonahue

E3SM-Autotester · 2024-10-17T04:02:37Z

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

Build Num: 6174
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`033d31e`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

Build Num: 5926
Status: PASSED

Jenkins Parameters

Parameter Name	Value
PR_LABELS	I/O;bugfix
PULLREQUESTNUM	3032
SCREAM_SOURCE_REPO	https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA	`033d31e`
SCREAM_TARGET_BRANCH	master
SCREAM_TARGET_REPO	https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA	`9b1b4c7`
TEST_REPO_ALIAS	SCREAM

E3SM-Autotester · 2024-10-17T04:02:56Z

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS BEEN REVIEWED, BUT NOT ACCEPTED OR REQUIRES CHANGES!

E3SM-Autotester · 2024-10-17T04:03:02Z

All Jobs Finished; status = PASSED, target_sha=41f563d7ec3e4e1727d25267796e9beac13ffb12, However Inspection must be performed before merge can occur...

mahf708

by order of the peaky blinders

The merge-base changed after approval.

AaronDonahue added 3 commits October 3, 2024 17:01

A quick change to recreate an aborted run mid simulation

463ffbc

Change the name of fail option to be self-descriptive

f1b3a4b

AaronDonahue added I/O bugfix labels Oct 7, 2024

AaronDonahue requested review from bartgol and mahf708 October 7, 2024 18:42

AaronDonahue added the AT: WIP Inform the autotester (AT) that the PR is a work in progress, and should not be tested label Oct 7, 2024

bartgol previously requested changes Oct 7, 2024


AaronDonahue added 2 commits October 7, 2024 15:17

use function snapshot_fits to check if file is full. Accommodates other file storage types

f9d8bc9

remove a check to release file that is no longer needed

eabd70f

AaronDonahue removed the AT: WIP Inform the autotester (AT) that the PR is a work in progress, and should not be tested label Oct 7, 2024

bartgol added 2 commits October 10, 2024 20:44

EAMxx: remove unused param in unit tests inputs

488c789

EAMxx: fix closure of full files for storage type!=NumSnaps

c99a7d9
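
For readers following the commits above, here is a minimal sketch of the idea they describe: after each write, ask whether the next snapshot would still fit in the file, and if not, flush and close it right away instead of waiting for the I/O library to decide. This is not the EAMxx implementation. `StorageType::Monthly`, `FileSpecs`, `Timestamp`, `close_file`, and `write_snapshot` are hypothetical stand-ins; only the name `snapshot_fits` and the `NumSnaps` storage type come from the commit messages, and their behavior here is assumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; not the actual EAMxx types.
enum class StorageType { NumSnaps, Monthly };

struct Timestamp { int year; int month; };

struct FileSpecs {
  std::string filename;
  StorageType storage       = StorageType::NumSnaps;
  int         max_snapshots = 1;      // only meaningful for NumSnaps
  int         num_snapshots = 0;      // snapshots written since the file was opened
  Timestamp   first_snap    = {0, 0}; // time of first snapshot (used by Monthly)
  bool        is_open       = false;
  int         times_closed  = 0;      // bookkeeping for this sketch

  // Would a snapshot at time t still belong in this file?
  bool snapshot_fits(const Timestamp& t) const {
    switch (storage) {
      case StorageType::NumSnaps:
        return num_snapshots < max_snapshots;
      case StorageType::Monthly:
        return num_snapshots == 0 ||
               (t.year == first_snap.year && t.month == first_snap.month);
    }
    return false;
  }
};

// Placeholder for the real flush-and-close call into the I/O layer.
inline void close_file(FileSpecs& f) {
  f.is_open = false;
  ++f.times_closed;
}

// Write one snapshot at time t; 'next' is when the following snapshot is due.
// Returns true if the file was closed because the next snapshot would not fit.
inline bool write_snapshot(FileSpecs& f, const Timestamp& t, const Timestamp& next) {
  if (!f.is_open) {
    f.is_open       = true;
    f.num_snapshots = 0;
    f.first_snap    = t;
  }
  assert(f.snapshot_fits(t)); // the snapshot being written must fit
  ++f.num_snapshots;          // (the actual data write would go here)

  // Close eagerly: a full file is on disk before anything else can go wrong.
  if (!f.snapshot_fits(next)) {
    close_file(f);
    return true;
  }
  return false;
}
```

Phrasing the check as "does the next snapshot fit" rather than "has the count hit the maximum" is what lets the same code path cover storage types that are not based on a snapshot count.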

bartgol added the AT: RETEST Force the autotester (AT) to retest the PR label Oct 15, 2024

mahf708 reviewed Oct 15, 2024


E3SM-Autotester removed the AT: RETEST Force the autotester (AT) to retest the PR label Oct 15, 2024

EAMxx: make comment clearer in IO header

033d31e

mahf708 previously approved these changes Oct 16, 2024


bartgol requested a review from mahf708 October 16, 2024 22:35

mahf708 previously approved these changes Oct 16, 2024


mahf708 previously approved these changes Oct 17, 2024


bartgol merged commit 68935b3 into master Oct 17, 2024
5 of 6 checks passed

bartgol deleted the aarondonahue/close_output_when_full branch October 17, 2024 15:06

		// In case REST_OPT=nsteps, don't count t0 output as one of those steps
		// NOTE: for m_output_control, it doesn't matter, since it'll be reset to 0 before we return


