remove python3 incompatibility #2

Open
wants to merge 43 commits into
base: master
43 commits
9d39857
Manifest PEP8 cleanup
marshallward Jan 16, 2019
f816287
Git subcommand functions removed
marshallward Jan 17, 2019
881e6b7
Merge pull request #152 from marshallward/pep8
marshallward Jan 18, 2019
259bd30
Moved runlog commit to before model run
aidanheerdegen Jan 15, 2019
a5aeef6
Added job information. Output is to an yaml file.
aidanheerdegen Jan 17, 2019
599682b
Removed superfluous git commands.
aidanheerdegen Jan 18, 2019
9d6afb4
Forgot to add `__init__.py` for new scheduler directory. Fails with
aidanheerdegen Jan 18, 2019
e6059d4
Forgot to save when fixing merge conflicts
aidanheerdegen Jan 18, 2019
db35313
Reduced indentation for single continued lines
aidanheerdegen Jan 20, 2019
130ac77
Added git commit hash as a property (run_id) to Experiment
aidanheerdegen Jan 21, 2019
6b04569
Removed subprocess alias in runlog.py
aidanheerdegen Jan 21, 2019
cecc7fe
Add job_id to job info dump from PBS scheduler
aidanheerdegen Jan 21, 2019
900d77d
Review revisions:
aidanheerdegen Jan 22, 2019
16fef83
Fixed incorrect import
aidanheerdegen Jan 22, 2019
78a33b4
Added missing import to experiment.py
aidanheerdegen Jan 22, 2019
76d03ea
Merge pull request #150 from aidanheerdegen/dumpstats
marshallward Jan 22, 2019
21185ba
Parse MITgcm config files to check they are correct namelists before
aidanheerdegen Jan 22, 2019
93a7e23
PEP8 cleanup
aidanheerdegen Jan 22, 2019
220a0eb
Merge pull request #153 from aidanheerdegen/mitgcm-148
marshallward Jan 22, 2019
5392c8f
Swap order of operations so when we hard sweep the archive directory
aidanheerdegen Jan 22, 2019
1b1ab5c
Minor PEP8 fix
aidanheerdegen Jan 22, 2019
ac5ca4b
Merge pull request #154 from aidanheerdegen/sweep-136
marshallward Jan 22, 2019
49db06f
Added control path to job yaml dump
aidanheerdegen Jan 22, 2019
2acf1c4
Merge pull request #155 from aidanheerdegen/add_ctrl
marshallward Jan 22, 2019
2b220b1
Ensure restart file is always written when not reproducing. (#156)
aidanheerdegen Jan 22, 2019
789cb7e
Add option to collate restart files as well as output files (#158)
aidanheerdegen Jan 24, 2019
8eb17b0
Forgot to prepend error path to logs when run crashes. (#159)
aidanheerdegen Jan 24, 2019
fe2a8ea
Added support for ability of cice5 to define separate input and resta…
aidanheerdegen Feb 5, 2019
661f2a6
Core dump support; PEP8 cleanup
marshallward Feb 6, 2019
db4ae4b
Typo....
marshallward Feb 6, 2019
a71a9e4
Improved Core dump support
marshallward Feb 6, 2019
bc453a4
Change MPI invocation to use full path to symlink in work
aidanheerdegen Feb 12, 2019
7ea252c
UM driver: added check restart dump created. Modified all runs to be
aidanheerdegen Feb 12, 2019
73920e2
init_path was not properly created. Fixed.
aidanheerdegen Feb 12, 2019
5a2f5f6
oasis driver: add logic to skip models without work path
aidanheerdegen Feb 12, 2019
a970968
cice: Added optional input_ice.nml for ACCESS-ESM
aidanheerdegen Feb 13, 2019
b56ea77
experiment: Updated to using check_manifest method (from old make_links)
aidanheerdegen Feb 13, 2019
266e9e1
manifest: add make_link method which is now called from add_filepath.
aidanheerdegen Feb 13, 2019
3261975
cice5: Moved common logic to cice driver
aidanheerdegen Feb 13, 2019
13fda3c
access: Made some of the cice5 specific logic also apply to cice.
aidanheerdegen Feb 13, 2019
517487b
Use normpath on filepaths in manifest keys to ensure consistency and
aidanheerdegen Feb 13, 2019
1347ddf
Enforced integer divide in int_to_date for python3 compatibility
aidanheerdegen Feb 13, 2019
fa84213
Changed division to integer for python3 compatibility
aidanheerdegen Feb 13, 2019
33 changes: 29 additions & 4 deletions docs/source/design.rst
@@ -64,7 +64,32 @@ adopted where possible.
In particular, ``help()`` should be readable and well-formatted for every
module and function.

3. Modules should not be renamed. This is bad::
3. Imports should be one per line (as in PEP8), and ideally alphabetical (as
recommended by PyLint). Additionally, we separate these into four groups,
each separated by a blank line, in this order:

a. Future statements

b. Standard library modules

c. Dependencies

d. Modules local to the project

Example import::

from __future__ import print_function

import os
import shlex
import sys

import requests
import yaml

import payu.envmod

4. Modules should not be renamed. This is bad::

import numpy as np

@@ -78,13 +103,13 @@ adopted where possible.

(Also note that this is another rule with poor conformance.)

4. Multiple equivalence checks should use tuples. This is bad::
5. Multiple equivalence checks should use tuples. This is bad::

if x == 'a' or x == 'b'
if x == 'a' or x == 'b':

This is good::

if x in ('a', 'b')
if x in ('a', 'b'):

.. _`HHGP's section on modules`:
http://docs.python-guide.org/en/latest/writing/structure/#modules
4 changes: 2 additions & 2 deletions payu/calendar.py
@@ -9,8 +9,8 @@ def int_to_date(date):
Convert an int of form yyyymmdd to a python date object.
"""

year = date / 10**4
month = date % 10**4 / 10**2
year = date // 10**4
month = date % 10**4 // 10**2
day = date % 10**2

return datetime.date(year, month, day)
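
The switch to `//` matters because in Python 3 the `/` operator always performs true division, so `date / 10**4` yields a float, which `datetime.date` rejects. A minimal, self-contained sketch of the patched function follows; the sample date is an arbitrary illustration, not a value taken from the project:

```python
import datetime


def int_to_date(date):
    """Convert an int of form yyyymmdd to a datetime.date object."""
    # Floor division keeps every component an int under Python 2 and 3.
    year = date // 10**4
    month = date % 10**4 // 10**2
    day = date % 10**2
    return datetime.date(year, month, day)


sample = 20190213  # arbitrary example value
print(int_to_date(sample))

# Under Python 3, true division gives a float here rather than an int.
print(type(sample / 10**4).__name__)
```

Because `%` and `//` share the same precedence and associate left to right, `date % 10**4 // 10**2` first isolates the `mmdd` part and then drops the day digits.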
162 changes: 120 additions & 42 deletions payu/experiment.py
@@ -10,6 +10,7 @@
from __future__ import print_function

# Standard Library
import datetime
import errno
import getpass
import os
@@ -20,9 +21,13 @@
import subprocess as sp
import sysconfig

# Extensions
import yaml

# Local
from payu import envmod
from payu.fsops import mkdir_p, make_symlink, read_config
from payu.scheduler.pbs import get_job_info, pbs_env_init, get_job_id
from payu.models import index as model_index
import payu.profilers
from payu.runlog import Runlog
@@ -43,6 +48,8 @@ class Experiment(object):
def __init__(self, lab, reproduce=False):
self.lab = lab

self.start_time = datetime.datetime.now()

# TODO: replace with dict, check versions via key-value pairs
self.modules = set()

@@ -103,6 +110,10 @@ def __init__(self, lab, reproduce=False):

self.payu_path = os.path.join(payu_bin, 'payu')

self.run_id = None

pbs_env_init()

def init_models(self):

self.model_name = self.config.get('model')
@@ -285,6 +296,14 @@ def set_expt_pathnames(self):
self.stdout_fname = self.lab.model_type + '.out'
self.stderr_fname = self.lab.model_type + '.err'

self.job_fname = 'job.yaml'
self.env_fname = 'env.yaml'

self.output_fnames = (self.stderr_fname,
self.stdout_fname,
self.job_fname,
self.env_fname)

def set_output_paths(self):

# Local archive paths
@@ -381,11 +400,11 @@ def setup(self, force_archive=False):
if len(self.models) > 1:
self.model.setup()

# Use manifest to populate work directory
self.manifest.make_links()
self.manifest.check_manifests()

# Copy manifests to work directory so they are archived on completion
self.manifest.copy_manifests(os.path.join(self.work_path,'manifests'))
manifest_path = os.path.join(self.work_path, 'manifests')
self.manifest.copy_manifests(manifest_path)

setup_script = self.userscripts.get('setup')
if setup_script:
@@ -469,7 +488,10 @@ def run(self, *user_flags):
# Update MPI library module (if not explicitly set)
# TODO: Check for MPI library mismatch across multiple binaries
if mpi_module is None:
mpi_module = envmod.lib_update(model.exec_path_local, 'libmpi.so')
mpi_module = envmod.lib_update(
model.exec_path_local,
'libmpi.so'
)

model_prog = []

@@ -510,11 +532,11 @@ def run(self, *user_flags):
if prof.runscript:
model_prog.append(prof.runscript)


model_prog.append(model.exec_prefix)

# Use the exec_name (without path) as this is now linked in work
model_prog.append(model.exec_name)
# Use the full path to symlinked exec_name in work as some
# older MPI libraries complained executable was not in PATH
model_prog.append(os.path.join(model.work_path, model.exec_name))

mpi_progs.append(' '.join(model_prog))

@@ -531,7 +553,9 @@ def run(self, *user_flags):
if self.expand_shell_vars:
cmd = os.path.expandvars(cmd)

print(cmd)
# TODO: Consider making this default
if self.config.get('coredump', False):
enable_core_dump()

# Our MVAPICH wrapper does not support working directories
if mpi_module.startswith('mvapich'):
@@ -540,8 +564,17 @@ def run(self, *user_flags):
else:
curdir = None

# Dump out environment
with open(self.env_fname, 'w') as file:
file.write(yaml.dump(dict(os.environ), default_flow_style=False))

self.runlog.create_manifest()
if self.runlog.enabled:
self.runlog.commit()

# NOTE: This may not be necessary, since env seems to be getting
# correctly updated. Need to look into this.
print(cmd)
if env:
# TODO: Replace with mpirun -x flag inputs
proc = sp.Popen(shlex.split(cmd), stdout=f_out, stderr=f_err,
@@ -555,13 +588,39 @@ def run(self, *user_flags):
if curdir:
os.chdir(curdir)

self.runlog.create_manifest()
if self.runlog.enabled:
self.runlog.commit()

f_out.close()
f_err.close()

self.finish_time = datetime.datetime.now()

info = get_job_info()

if info is None:
# Not being run under PBS, reverse engineer environment
info = {
'PAYU_PATH': os.path.dirname(self.payu_path)
}

# Add extra information to save to jobinfo
info.update(
{
'PAYU_CONTROL_DIR': self.control_path,
'PAYU_RUN_ID': self.run_id,
'PAYU_CURRENT_RUN': self.counter,
'PAYU_N_RUNS': self.n_runs,
'PAYU_JOB_STATUS': rc,
'PAYU_START_TIME': self.start_time.isoformat(),
'PAYU_FINISH_TIME': self.finish_time.isoformat(),
'PAYU_WALLTIME': "{0} s".format(
(self.finish_time - self.start_time).total_seconds()
),
}
)

# Dump job info
with open(self.job_fname, 'w') as file:
file.write(yaml.dump(info, default_flow_style=False))

# Remove any empty output files (e.g. logs)
for fname in os.listdir(self.work_path):
fpath = os.path.join(self.work_path, fname)
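
The timing entries in the job-info dictionary above come from subtracting two `datetime.datetime` values and formatting the resulting `timedelta`. A rough standalone sketch of just that part, with a hypothetical helper name (`timing_info`) and made-up timestamps, might look like:

```python
import datetime


def timing_info(start_time, finish_time):
    """Build the timing entries in the same style as the job-info dump."""
    # Subtracting datetimes gives a timedelta; total_seconds() is a float.
    elapsed = (finish_time - start_time).total_seconds()
    return {
        'PAYU_START_TIME': start_time.isoformat(),
        'PAYU_FINISH_TIME': finish_time.isoformat(),
        'PAYU_WALLTIME': "{0} s".format(elapsed),
    }


# Made-up timestamps, for illustration only.
start = datetime.datetime(2019, 1, 17, 9, 0, 0)
finish = datetime.datetime(2019, 1, 17, 9, 30, 15)

info = timing_info(start, finish)
for key in sorted(info):
    print(key, info[key])
```

Storing the timestamps as ISO 8601 strings keeps the dumped YAML readable and easy to parse back into `datetime` objects later.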
@@ -581,14 +640,19 @@ def run(self, *user_flags):
mkdir_p(error_log_dir)

# NOTE: This is PBS-specific
job_id = os.environ.get('PBS_JOBID', '')
job_id = get_job_id(short=False)

if job_id == '':
job_id = self.run_id[:6]

for fname in self.output_fnames:

for fname in (self.stdout_fname, self.stderr_fname):
src = os.path.join(self.control_path, fname)

# NOTE: This assumes standard .out/.err extensions
stem, suffix = os.path.splitext(fname)
dest = os.path.join(error_log_dir,
fname[:-4] + '.' + job_id + fname[-4:])
".".join((stem, job_id)) + suffix)

print(src, dest)

shutil.copyfile(src, dest)
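
The move from fixed-width slicing to `os.path.splitext` is what lets the new `job.yaml` and `env.yaml` files be tagged correctly, since the old `fname[:-4]` slicing assumed a three-letter extension. A small sketch comparing the two approaches; the helper names and the job ID value are hypothetical and exist only for illustration:

```python
import os


def tag_with_splitext(fname, job_id):
    """Insert job_id before the extension, whatever its length."""
    stem, suffix = os.path.splitext(fname)
    return ".".join((stem, job_id)) + suffix


def tag_with_slicing(fname, job_id):
    """Old approach: assumes a dot plus a three-letter extension."""
    return fname[:-4] + '.' + job_id + fname[-4:]


job_id = '12345'  # hypothetical job ID
for name in ('mom.out', 'mom.err', 'job.yaml'):
    print(name, tag_with_slicing(name, job_id), tag_with_splitext(name, job_id))
```

Both helpers agree for `.out` and `.err` files; they differ only when the extension is longer than three letters, as with the YAML dumps.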
@@ -611,7 +675,7 @@ def run(self, *user_flags):
self.n_runs -= 1

# Move logs to archive (or delete if empty)
for f in (self.stdout_fname, self.stderr_fname):
for f in self.output_fnames:
f_path = os.path.join(self.control_path, f)
if os.path.getsize(f_path) == 0:
os.remove(f_path)
@@ -854,29 +918,6 @@ def run_userscript(self, script_cmd):
def sweep(self, hard_sweep=False):
# TODO: Fix the IO race conditions!

if hard_sweep:
if os.path.isdir(self.archive_path):
print('Removing archive path {0}'.format(self.archive_path))
cmd = 'rm -rf {0}'.format(self.archive_path)
cmd = shlex.split(cmd)
rc = sp.call(cmd)
assert rc == 0

if os.path.islink(self.archive_sym_path):
print('Removing symlink {0}'.format(self.archive_sym_path))
os.remove(self.archive_sym_path)

if os.path.isdir(self.work_path):
print('Removing work path {0}'.format(self.work_path))
cmd = 'rm -rf {0}'.format(self.work_path)
cmd = shlex.split(cmd)
rc = sp.call(cmd)
assert rc == 0

if os.path.islink(self.work_sym_path):
print('Removing symlink {0}'.format(self.work_sym_path))
os.remove(self.work_sym_path)

# TODO: model outstreams and pbs logs need to be handled separately
default_job_name = os.path.basename(os.getcwd())
short_job_name = str(self.config.get('jobname', default_job_name))[:15]
@@ -907,7 +948,44 @@ def sweep(self, hard_sweep=False):
print('Moving log {0}'.format(f))
shutil.move(f, os.path.join(pbs_log_path, f))

# Remove stdout/err
for f in (self.stdout_fname, self.stderr_fname):
if hard_sweep:
if os.path.isdir(self.archive_path):
print('Removing archive path {0}'.format(self.archive_path))
cmd = 'rm -rf {0}'.format(self.archive_path)
cmd = shlex.split(cmd)
rc = sp.call(cmd)
assert rc == 0

if os.path.islink(self.archive_sym_path):
print('Removing symlink {0}'.format(self.archive_sym_path))
os.remove(self.archive_sym_path)

# Remove stdout/err and yaml dumps
for f in self.output_fnames:
if os.path.isfile(f):
os.remove(f)

if os.path.isdir(self.work_path):
print('Removing work path {0}'.format(self.work_path))
cmd = 'rm -rf {0}'.format(self.work_path)
cmd = shlex.split(cmd)
rc = sp.call(cmd)
assert rc == 0

if os.path.islink(self.work_sym_path):
print('Removing symlink {0}'.format(self.work_sym_path))
os.remove(self.work_sym_path)


def enable_core_dump():
# Newer Intel compilers support 'FOR_DUMP_CORE_FILE' while most support
# 'decfort_dump_flag'. Setting both for now, but there may be a more
# platform-independent way to support this.

# Enable Fortran core dump
os.environ['FOR_DUMP_CORE_FILE'] = 'TRUE'
os.environ['decfort_dump_flag'] = 'TRUE'

# Allow unlimited core dump file sizes
resource.setrlimit(resource.RLIMIT_CORE,
(resource.RLIM_INFINITY, resource.RLIM_INFINITY))
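
`resource.setrlimit` raises `ValueError` when an unprivileged process tries to raise its hard limit, so requesting `RLIM_INFINITY` for both values may fail on systems where the hard core-file limit is capped. One possible variant, shown only as a sketch and not as part of this PR, raises the soft limit to whatever the current hard limit allows. The function name `enable_core_dump_capped` is hypothetical, and the sketch assumes a Unix platform where the `resource` module is available:

```python
import os
import resource


def enable_core_dump_capped():
    """Enable Fortran core dumps without exceeding the hard limit."""
    # Same environment variables as in the diff above.
    os.environ['FOR_DUMP_CORE_FILE'] = 'TRUE'
    os.environ['decfort_dump_flag'] = 'TRUE'

    # Raise only the soft limit, up to the existing hard limit.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

    return resource.getrlimit(resource.RLIMIT_CORE)


print(enable_core_dump_capped())
```

If the hard limit is already unlimited, this behaves the same as requesting `RLIM_INFINITY` directly.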