Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add the amsua_n19_qc_bc.yaml into templates #206

Merged
merged 9 commits into from
Oct 25, 2024
Merged

add the amsua_n19_qc_bc.yaml into templates #206

merged 9 commits into from
Oct 25, 2024

Conversation

xyzemc
Copy link
Contributor

@xyzemc xyzemc commented Oct 17, 2024

This PR is for commit the work have been done in issue#180

The ctest has been passed with this new yaml file:

ctest -R rrfs_mpasjedi_2024052700_Ens3Dvar
Test project /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/build/rrfs-test
Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar
1/1 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar ... Passed 122.34 sec

100% tests passed, 0 tests failed out of 1

Label Time Summary:
mpi = 122.34 secproc (1 test)
rdas-bundle = 122.34 sec
proc (1 test)
script = 122.34 sec*proc (1 test)

Total Test time (real) = 123.18 sec

@SamuelDegelia-NOAA
Copy link
Contributor

@xyzemc I added a new wiki page detailing additional changes to be made when updating the ctests for a new observation type. Please see here: https://github.com/NOAA-EMC/RDASApp/wiki/Adding-new-observation-types-for-the-RDASApp-ctests

As part of your PR, could you also add the replaceable strings to the yaml, update/run gen_yaml_ctest.sh, and regenerate the test reference data? Please let me know if you have any issues with this - I would like adding new observation types to the ctests to be easy but this might be an iterative process to make it so.

@SamuelDegelia-NOAA
Copy link
Contributor

Another idea would be to let users add their individual yamls files and then we could do a later PR to update the ctests for those new yamls. That could be easier such that only a code manager would need to run/update the gen_yaml_ctest.sh script. We could also wait until we have a few new yaml types added to update the ctests with. Any thoughts on this @guoqing-noaa or @delippi?

@delippi
Copy link
Collaborator

delippi commented Oct 17, 2024

Another idea would be to let users add their individual yamls files and then we could do a later PR to update the ctests for those new yamls. That could be easier such that only a code manager would need to run/update the gen_yaml_ctest.sh script. We could also wait until we have a few new yaml types added to update the ctests with. Any thoughts on this @guoqing-noaa or @delippi?

@SamuelDegelia-NOAA, it seems okay to me to do it this way.

Copy link
Collaborator

@delippi delippi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xyzemc, I just have a few questions:

  1. It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
  2. Just a minor detail. The file name of this yaml is amsua_n19_qc_bc.yaml. Would it make sense to remove the _qc_bc part and match the name of the obs space in the yaml?
  3. Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like ioda_msonet.nc--so use something like ioda_amsua_n19.nc. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.
  4. Is the hofx file supposed to have rass_ prefixed to it?

@SamuelDegelia-NOAA
Copy link
Contributor

@xyzemc You can hold off on what I discussed in my last message - we can just plan to update the ctests at a later time. So it should be okay to just add your yaml as is once @delippi's questions are addressed.

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

@xyzemc You can hold off on what I discussed in my last message - we can just plan to update the ctests at a later time. So it should be okay to just add your yaml as is once @delippi's questions are addressed.

@SamuelDegelia-NOAA Sounds good to me. Could you please put the data on role account first. The data location on my account is: /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/expr/mpas_2024052700/data/obs/amsua_n19_obs

@SamuelDegelia-NOAA
Copy link
Contributor

@SamuelDegelia-NOAA Sounds good to me. Could you please put the data on role account first. The data location on my account is: /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/expr/mpas_2024052700/data/obs/amsua_n19_obs

Done!

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

@xyzemc, I just have a few questions:

  1. It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
  2. Just a minor detail. The file name of this yaml is amsua_n19_qc_bc.yaml. Would it make sense to remove the _qc_bc part and match the name of the obs space in the yaml?
  3. Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like ioda_msonet.nc--so use something like ioda_amsua_n19.nc. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.
  4. Is the hofx file supposed to have rass_ prefixed to it?

@delippi Answers for each question are:

  1. I haven't compared FV3-JEDI with GSI yet for amsua and atms. But will do that in the near future.
  2. I have change the yaml file name as you suggested.
  3. obs name changed to ioda_amsua_n19.nc
  4. I think 'rass_' prefix is what I copied from GDASApp yaml file. But I can definitely remove the prefix.

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

@xyzemc You can hold off on what I discussed in my last message - we can just plan to update the ctests at a later time. So it should be okay to just add your yaml as is once @delippi's questions are addressed.

I have done everything that guided in the wiki and addressed @delippi 's question with modification.
The new ctest results with the super yaml file is:

`ctest -R rrfs_mpasjedi_2024052700_Ens3Dvar
Test project /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/build/rrfs-test
Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar
1/1 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar ... Passed 337.52 sec

100% tests passed, 0 tests failed out of 1

Label Time Summary:
mpi = 337.52 secproc (1 test)
rdas-bundle = 337.52 sec
proc (1 test)
script = 337.52 sec*proc (1 test)

Total Test time (real) = 338.51 sec
`

@delippi
Copy link
Collaborator

delippi commented Oct 18, 2024

@xyzemc, I just have a few questions:

  1. It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
  2. Just a minor detail. The file name of this yaml is amsua_n19_qc_bc.yaml. Would it make sense to remove the _qc_bc part and match the name of the obs space in the yaml?
  3. Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like ioda_msonet.nc--so use something like ioda_amsua_n19.nc. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.
  4. Is the hofx file supposed to have rass_ prefixed to it?

@delippi Answers for each question are:

  1. I haven't compared FV3-JEDI with GSI yet for amsua and atms. But will do that in the near future.
  2. I have change the yaml file name as you suggested.
  3. obs name changed to ioda_amsua_n19.nc
  4. I think 'rass_' prefix is what I copied from GDASApp yaml file. But I can definitely remove the prefix.

@xyzemc Thanks for taking those minor comments into consideration! It looks like you forgot to remove the amsua_n19_qc_bc.yaml in this PR. There are two yamls in the "Files Changed" tab.

Sorry that I also just noticed the same minor consistency issues with the atms. I should've caught those before. You don't necessarily have to make those changes in this PR. I assume you will have to make some modifications when you do FV3-JEDI vs GSI testing. You can make them there. I just wanted to let you know so you can keep that in mind.

One last thing. I noticed you have a few paths that start with data/obs_ctest. I just wonder if those should be something else? I'm just thinking about when we get to a cycling system, do we want that data stored in that particular directory name? It looks like you had it named differently recently too. I think how you had it before seems reasonable:

       obs bias:
         input file: data/obs/amsua_n19.satbias.nc4
         output file: data/bc/out_amsua_n19.satbias.nc

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

@xyzemc, I just have a few questions:

  1. It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
  2. Just a minor detail. The file name of this yaml is amsua_n19_qc_bc.yaml. Would it make sense to remove the _qc_bc part and match the name of the obs space in the yaml?
  3. Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like ioda_msonet.nc--so use something like ioda_amsua_n19.nc. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.
  4. Is the hofx file supposed to have rass_ prefixed to it?

@delippi Answers for each question are:

  1. I haven't compared FV3-JEDI with GSI yet for amsua and atms. But will do that in the near future.
  2. I have change the yaml file name as you suggested.
  3. obs name changed to ioda_amsua_n19.nc
  4. I think 'rass_' prefix is what I copied from GDASApp yaml file. But I can definitely remove the prefix.

@xyzemc Thanks for taking those minor comments into consideration! It looks like you forgot to remove the amsua_n19_qc_bc.yaml in this PR. There are two yamls in the "Files Changed" tab.

Sorry that I also just noticed the same minor consistency issues with the atms. I should've caught those before. You don't necessarily have to make those changes in this PR. I assume you will have to make some modifications when you do FV3-JEDI vs GSI testing. You can make them there. I just wanted to let you know so you can keep that in mind.

One last thing. I noticed you have a few paths that start with data/obs_ctest. I just wonder if those should be something else? I'm just thinking about when we get to a cycling system, do we want that data stored in that particular directory name? It looks like you had it named differently recently too. I think how you had it before seems reasonable:

       obs bias:
         input file: data/obs/amsua_n19.satbias.nc4
        ` output file: data/bc/out_amsua_n19.satbias.nc`

@delippi the 'obs_ctest' is copied from atms yaml file that should be setup by @SamuelDegelia-NOAA . But I agree with you
input file: data/obs/amsua_n19.satbias.nc4
is more reasonable.
For the output file: data/bc/out_amsua_n19.satbias.nc
I modified to ' output file: out_amsua_n19.satbias.nc`

removed the data/bc to keep the output in the same place as hofx

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

@delippi For the ATMS yaml file, I have noticed the difference from amsua. I plan to change that later because I am working on atms_n20 yaml file right now. When I open PR to merge atms_n20 yaml file, I will modify the atms_npp yaml as well to keep consistency.

@SamuelDegelia-NOAA
Copy link
Contributor

When creating the new ctest structure, I split up the observation files in RDAS_DATA to obs/ and obs_ctest/ directories in case some of the obs we use for the ctests were different from those we were using before. Normally this was okay because only the gen_yaml_ctest.sh script referenced the obs_ctest/ directory to replace the @OBSFILE@ string in the yamls. But because the ATMS/AMSUA data use multiple input observation files (data, bias, lapse rate), some of those paths got hard coded into the yamls. I agree that this isn't an ideal solution given that want these yamls to be flexible and to be used eventually in the workflow.

I think keeping the yaml file general and just referencing an @OBSFILE@ string is ideal. So maybe we should also similar strings for @BIASFILE@ and @LAPSEFILE@, etc. to the yamls in this PR.

@delippi
Copy link
Collaborator

delippi commented Oct 18, 2024

When creating the new ctest structure, I split up the observation files in RDAS_DATA to obs/ and obs_ctest/ directories in case some of the obs we use for the ctests were different from those we were using before. Normally this was okay because only the gen_yaml_ctest.sh script referenced the obs_ctest/ directory to replace the @OBSFILE@ string in the yamls. But because the ATMS/AMSUA data use multiple input observation files (data, bias, lapse rate), some of those paths got hard coded into the yamls. I agree that this isn't an ideal solution given that want these yamls to be flexible and to be used eventually in the workflow.

I think keeping the yaml file general and just referencing an @OBSFILE@ string is ideal. So maybe we should also similar strings for @BIASFILE@ and @LAPSEFILE@, etc. to the yamls in this PR.

Sounds good. I just have one comment. We need to keep those gen_yaml scripts as simple as possible! I think you are, but I just needed to say it! hahaha

@SamuelDegelia-NOAA
Copy link
Contributor

I'm trying! But definitely open to any feedback to simplify it more.

@SamuelDegelia-NOAA
Copy link
Contributor

Sorry to keep changing things, but could we change all of the data/obs/... paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.

input file: data/obs/amsua_n19.satbias.nc4 -> input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt -> tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4 -> input file: "@COVFILE@"

Also, could you add an obs localizations section to the yamls, similar to the file here?

Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

Sorry to keep changing things, but could we change all of the data/obs/... paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.

input file: data/obs/amsua_n19.satbias.nc4 -> input file: "@BIASFILE@" tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt -> tlapse: "@LAPSEFILE@" input file: data/obs/amsua_n19.satbias_cov.nc4 -> input file: "@COVFILE@"

Also, could you add an obs localizations section to the yamls, similar to the file here?

Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.

@SamuelDegelia-NOAA the variable &amsua_n19_tlapse in tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt will be used in the following lines. So I don't think tlapse: "@LAPSEFILE@" will work.

@delippi
Copy link
Collaborator

delippi commented Oct 18, 2024

Sorry to keep changing things, but could we change all of the data/obs/... paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.
input file: data/obs/amsua_n19.satbias.nc4 -> input file: "@BIASFILE@" tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt -> tlapse: "@LAPSEFILE@" input file: data/obs/amsua_n19.satbias_cov.nc4 -> input file: "@COVFILE@"
Also, could you add an obs localizations section to the yamls, similar to the file here?
Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.

@SamuelDegelia-NOAA the variable &amsua_n19_tlapse in tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt will be used in the following lines. So I don't think tlapse: "@LAPSEFILE@" will work.

Wouldn't it just be
tlapse: &amsua_n19_tlapse "@LAPSEFILE@" or you could remove the anchor and specify everywhere. Either should work. It just requires running gen_yaml each time. Actually, that is how I do all my tests.

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 18, 2024

Sorry to keep changing things, but could we change all of the data/obs/... paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.
input file: data/obs/amsua_n19.satbias.nc4 -> input file: "@BIASFILE@" tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt -> tlapse: "@LAPSEFILE@" input file: data/obs/amsua_n19.satbias_cov.nc4 -> input file: "@COVFILE@"
Also, could you add an obs localizations section to the yamls, similar to the file here?
Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.

@SamuelDegelia-NOAA the variable &amsua_n19_tlapse in tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt will be used in the following lines. So I don't think tlapse: "@LAPSEFILE@" will work.

Wouldn't it just be tlapse: &amsua_n19_tlapse "@LAPSEFILE@" or you could remove the anchor and specify everywhere. Either should work. It just requires running gen_yaml each time. Actually, that is how I do all my tests.

Ok, let me try tlapse: &amsua_n19_tlapse "@LAPSEFILE@"

Copy link
Contributor

@SamuelDegelia-NOAA SamuelDegelia-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xyzemc!

delippi
delippi previously approved these changes Oct 18, 2024
Copy link
Collaborator

@delippi delippi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@guoqing-noaa
Copy link
Collaborator

guoqing-noaa commented Oct 18, 2024

input file: data/obs/amsua_n19.satbias.nc4 -> input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt -> tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4 -> input file: "@COVFILE@"

@SamuelDegelia-NOAA Could you elaborate why you would like to have these changes?
I think it is because we now have obs and obs_ctest?

If so, let's remove obs, rename obs_ctest to obs and add the single ADPUPA obs to obs. It is also expected to combine that standalone single ob test into testinput (and hence testinput_expr will not be needed in the future).

I don't think we need that extra flexibility to specify biasfile, lapsefile, covfile on the fly. Sorry that I've been busy preparing for tomorrow's travel and did not join the discussion early.

@SamuelDegelia-NOAA
Copy link
Contributor

SamuelDegelia-NOAA commented Oct 21, 2024

@guoqing-noaa I would like to keep the yamls added to obtype_config/ as generic as possible. That way we can apply these yamls and associated tools for the workflow at a later date. As such, I think we should limit hardcoding file names in there.

The gen_yaml_ctest.sh can then be the tool we use to take these generic yamls and merge/convert them for use in the ctests. This choice makes it easier IMO since only that script needs to contain case-specific file names and paths.

@guoqing-noaa
Copy link
Collaborator

@guoqing-noaa I would like to keep the yamls added to obtype_config/ as generic as possible. That way we can apply these yamls and associated tools for the workflow at a later date. As such, I think we should limit hardcoding file names in there.

The gen_yaml_ctest.sh can then be the tool we use to take these generic yamls and merge/convert them for use in the ctests. This choice makes it easier IMO since only that script needs to contain case-specific file names and paths.

I have an alternate view. For those files, it would be the best to keep the "run_time" substitutions as less as possible.

Regarding the rrfs-workflow, we will use the giant XML file directly (it will be generated from RDASApp offline).

@SamuelDegelia-NOAA
Copy link
Contributor

I have an alternate view. For those files, it would be the best to keep the "run_time" substitutions as less as possible.

Regarding the rrfs-workflow, we will use the giant XML file directly (it will be generated from RDASApp offline).

Even during the workflow, we will need at least two runtime substitutions for the yaml: analysis time and observation distribution (for GETKF observer vs. solver). There might be more that I can't think of.

To me, it makes sense to also make the file names generic and replaceable for the time being until we have settled on standardized file names and input directories. But either solution is fine with me and I don't think it should hold up this PR.

@guoqing-noaa
Copy link
Collaborator

Even during the workflow, we will need at least two runtime substitutions for the yaml: analysis time and observation distribution (for GETKF observer vs. solver). There might be more that I can't think of.

Right, we will need to substitute the analysis time, the obs distribution and maybe a limited sets of others as needed in the future. But we would like to keep the substitutions as minimum as possible to keep the workflow as simple as possible.

To me, it makes sense to also make the file names generic and replaceable for the time being until we have settled on standardized file names and input directories. But either solution is fine with me and I don't think it should hold up this PR.

I did not hold up this PR. But I do think it is better to get things sorted out as early as possible.

@SamuelDegelia-NOAA
Copy link
Contributor

I did not hold up this PR. But I do think it is better to get things sorted out as early as possible.

I am fine with changing those lines back if that's what you want.

@guoqing-noaa
Copy link
Collaborator

I did not hold up this PR. But I do think it is better to get things sorted out as early as possible.

I am fine with changing those lines back if that's what you want.

That will be helpful. Thanks, @SamuelDegelia-NOAA

@xyzemc
Copy link
Contributor Author

xyzemc commented Oct 23, 2024

I have replaced the three input pathes and names from 'string' to the actual 'data/obs/filename'. Please re-check and approve it. Thanks!

@HaidaoLin-NOAA
Copy link

This is a general comment to the group and we can address/discuss this later, don't have to be in this PR. Should we put all radiance bias correction related files, under data/obs/ or put them on a separate path/folder, such as data/satbias/. This is not very important and either way would be fine, but I would prefer to have a separate folder for satbias related files and not put them all under data/obs. Any thoughts/comments? Thanks!

@hu5970 hu5970 merged commit 380ad1a into develop Oct 25, 2024
3 checks passed
@hu5970 hu5970 deleted the feature/amsua branch October 25, 2024 05:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants