Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Capgen in SCM: Differing source and module name #637

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

dustinswales
Copy link
Collaborator

[ 50 character, one line summary ]
Fortran does not require module names be the same as the source file in which they are defined.
This PR allows for this in Capgen

User interface changes?: No
Fixes: #636

Testing:
Expanded var_compatability_test

Copy link
Collaborator

@climbfuji climbfuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions, butI am not seeing any issues with these changes.

# DJS2024: First, try to find module_name from the metadata. Otherwise,
# use file name as module_name (default).
self.__module_name = find_module_name(self.__pobj.filename)
if (self.__module_name == ''):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this happen? We require all Fortran code to be in modules, schemes or not?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metadata is only needed for the DDT, not the module where it lives.
The behavior now is that IF you provide a module table entry in the DDT metadata file it will use that instead.

@@ -223,6 +223,51 @@ def parse_metadata_file(filename, known_ddts, run_env, skip_ddt_check=False):

########################################################################

def find_module_name(filename):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised that this isn't already done by the parser. Is that because everything is based on the assumption filename=modulename?

Copy link
Collaborator Author

@dustinswales dustinswales Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Metadata tables for new DDTs default to the source filename (e.g. here). Not the best.
This is a problem for DDTs in two instances, a) DDTs housed in differing module/file names and b) DDTs in files with multiple modules.
We don't have any of the latter in the SCM/UFS, so I haven't gone down that rabbit hole.

@@ -657,7 +657,8 @@ def __init__(self, sdfs, host_model, scheme_headers, run_env):
self.__ddt_lib = DDTLibrary('{}_api'.format(self.host_model.name),
run_env, ddts=all_ddts)
for header in [d for d in scheme_headers if d.header_type != 'ddt']:
if header.header_type != 'scheme':
# DJS2024: Schemes and modules with DDTs?
if header.header_type != 'scheme' and header.header_type != 'module':
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@climbfuji Here's another thing I needed to add. Capgen wasn't happy with modules that held DDT definitions (e.g. radlw_param.f).
I convinced myself that this is not intentional and probably due to a lack of testing Capgen with respect to DDTs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718 @peverwhee Fortran modules containing DDTs should be allowed, but I want to be sure this change doesn't have any unintentional consequences elsewhere?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dustinswales I can't think of any reason why this wouldn't be ok.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. The problem here is that you are using a host file (module_rad_ddt.meta) as a scheme file. The module table type is not (currently?) allowed in capgen. The error message could be improved (see below) but:

  • I do not see why the module header is needed or useful here
  • Defining a DDT used by both a scheme and the host makes the scheme inherently non-portable -- something the CCPP was created to avoid.

Copy link
Collaborator Author

@dustinswales dustinswales Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718

  • I do not see why the module header is needed or useful here

There are schemes with native DDTs which the host needs to know about (e.g. PUMAS, RRTMGP). How should this be handled with Capgen?

  • Defining a DDT used by both a scheme and the host makes the scheme inherently non-portable -- something the CCPP was created to avoid.

Fair point, but we need to accommodate schemes that come to the party with their own DDTs.

Also, what about the other way, that is interstitial schemes that use host-defined data containers? (e.g. GFS_phys_reset.F90)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718 The issue with SCM is that the metadata changes introduced in #646 won't work with the physics in Prebuild, which is something I want to avoid in this transition. Allowing a module header here would avoid that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the changes in 646 were optional ("An optional module_name keyword is added to the ccpp-table-properties (MetadataTable) section that allows the user to specify a Fortran module name that is independent from the name of the enclosing file."). Are you saying they are not sufficient or break existing prebuild functionality? Are you trying to run the same physics with both prebuild and capgen? Maybe we can talk about this at our next meeting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optional part is fine. It's not allowing a module header in the metadata that will break Prebuild. (There is no mechanism in Prebuild to handle the different module/file names w/o using the module header)
Yeah, up to this point I was able to use the same physics in both Prebuild/Capgen.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optional part is fine. It's not allowing a module header in the metadata that will break Prebuild. (There is no mechanism in Prebuild to handle the different module/file names w/o using the module header) Yeah, up to this point I was able to use the same physics in both Prebuild/Capgen.

Is it necessary to open this (big IMHO) hole in capgen in order to allow the transition? Once the module header is allowed in capgen for schemes, anyone could use module metadata to completely circumvent at least one of the Kalnay rules (Rule 3: All communication with the package shall be through the argument list at the entry points). Since this is also a current CCPP rule for physics, I think it should warrant more discussion before adopting it just to work around an issue in prebuild.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718 It's not necessary. I'm going to do some physics cleanup to follow the metadata section definitions you detailed in #648.

@@ -20,7 +20,7 @@ get_filename_component(CCPP_ROOT "${TEST_ROOT}" DIRECTORY)
#
#------------------------------------------------------------------------------
LIST(APPEND SCHEME_FILES "var_compatibility_files.txt")
LIST(APPEND HOST_FILES "test_host_data" "test_host_mod")
LIST(APPEND HOST_FILES "module_rad_ddt" "test_host_data" "test_host_mod")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718 @peverwhee @climbfuji Design philosophy question below.

When a physics scheme has a native DDT that the host needs to be aware of, the file containing the DDT needs to be at the top of both the scheme_files and host_files list (e.g. host_files and scheme_files in CCPP SCM). In the SCM/UFS, we also do the inverse and use host defined DDTs within Interstitial physics schemes, so there is a need for a pool of DDTs available to both the host and the schemes.

Would it make sense to have a third class of files provided to Capgen, containing the DDT typedefs, that are processed before the scheme and host files?
I'm fine with having to define the host/scheme lists with the typedefs at the top, just curious to hear your thoughts on this? (I'm also not volunteering to do this!)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That third class would be a mix of scheme and host files. Something like this exists in prebuild, see for example https://github.com/NOAA-EMC/fv3atm/blob/2c902a670e89416ef49254c827e8ba45a68ce596/ccpp/config/ccpp_prebuild_config.py#L12

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for volunteering to do this, @dustinswales ! :)

I'm barely following what y'all are talking about. Would your proposed solution mean that no DDTs could be included within scheme or host files? They'd all have to be separate? Or would this additional class of files contain some duplicates vs the host and scheme files? Is that what @climbfuji is saying?

Copy link
Collaborator

@gold2718 gold2718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this approach attacks the problem from the wrong angle. I think the solution is to add a module_name keyword to the ccpp-table-properties and remove the (unused so far) module keyword from the ccpp-arg-table sections. This solves several problems by allowing the user to specify the module name where the Fortran source can be found.

I have implemented an alternative approach which includes a slightly modified version of your test. I propose that this replace this PR.

@@ -657,7 +657,8 @@ def __init__(self, sdfs, host_model, scheme_headers, run_env):
self.__ddt_lib = DDTLibrary('{}_api'.format(self.host_model.name),
run_env, ddts=all_ddts)
for header in [d for d in scheme_headers if d.header_type != 'ddt']:
if header.header_type != 'scheme':
# DJS2024: Schemes and modules with DDTs?
if header.header_type != 'scheme' and header.header_type != 'module':
errmsg = "{} is an unknown CCPP API metadata header type, {}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
errmsg = "{} is an unknown CCPP API metadata header type, {}"
if header.header_type == 'module':
errmsg = f"{header.title} is a module metadata header type."
errmsg+=" This is not an allowed CCPP scheme header type."
else:
errmsg = f"{header.title} is an unknown CCPP API metadata header type, {header.header_type}"
# end if

@@ -657,7 +657,8 @@ def __init__(self, sdfs, host_model, scheme_headers, run_env):
self.__ddt_lib = DDTLibrary('{}_api'.format(self.host_model.name),
run_env, ddts=all_ddts)
for header in [d for d in scheme_headers if d.header_type != 'ddt']:
if header.header_type != 'scheme':
# DJS2024: Schemes and modules with DDTs?
if header.header_type != 'scheme' and header.header_type != 'module':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. The problem here is that you are using a host file (module_rad_ddt.meta) as a scheme file. The module table type is not (currently?) allowed in capgen. The error message could be improved (see below) but:

  • I do not see why the module header is needed or useful here
  • Defining a DDT used by both a scheme and the host makes the scheme inherently non-portable -- something the CCPP was created to avoid.

dustinswales added a commit to dustinswales/ccpp-framework that referenced this pull request Mar 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants