MET throws segfaults with failed python embedding #2335
-
Hi all, I've found that if you use Python embedding in METplus and the Python script fails with an error instead of returning its results (for example, because one of the packages it imports does not exist in your Python environment), MET can segfault and dump large core files. Obviously the solution is to have the Python script actually succeed, but perhaps you might want to add some error handling here to avoid the segfaults?
I produced these segfaults on Jet; the log file is here: /home/role.amb-verif/AMDAR_PBLH_realtime_verif/verif/amdar_rt/HRRR_OPS/realtime/METplus/logs/metplus.log.20230829215619
The Python script is definitely failing because the Jet admins seem to have removed netcdf4 from our conda environment. We're working on getting it back. Thanks!
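For anyone hitting the same failure, one script-side mitigation is a pre-flight check at the top of the embedding script, so that a missing package produces a short, readable error instead of a traceback deep inside the script. The following is only a minimal sketch: the helper names `missing_modules` and `require_modules` are hypothetical, the module list is an assumption about what a given script imports, and it does not by itself guarantee that MET handles the failed script gracefully.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found.

    Hypothetical helper: `names` should be top-level import names
    (e.g. "netCDF4", not "netCDF4.Dataset").
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_modules(names):
    """Exit with status 1 and a clear message if any module is missing."""
    missing = missing_modules(names)
    if missing:
        sys.stderr.write(
            "ERROR: Python embedding script cannot run; missing modules: "
            + ", ".join(missing) + "\n"
        )
        sys.exit(1)
```

A script that needs NumPy and netCDF4 might call `require_modules(["numpy", "netCDF4"])` before its real imports. This only makes the cause of the failure obvious in the log; whether MET then exits cleanly still depends on the MET version in use.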
-
Hi @mollybsmith-noaa, thanks for logging this discussion.
I did some testing locally and reviewed the log file you sent (thank you!). The use case you are running was added as part of the METplus 5.1 coordinated release and requires functionality that was added in MET 11.1.0, but I notice that you are using METplus 5.0.1 and MET 11.0.1. If you have access to a 5.1 coordinated release install, can you try that? I confirmed locally that I do not receive a segfault using 5.1.
Regarding the segfault, I have forwarded this info to an engineer to see if there is a more elegant way we can handle this case. I will update this discussion with any relevant issues that come out of that discussion. We've also added a note that all METplus use cases should list what version of MET/METplus they were developed with or for, and/or what version of MET/METplus is required to run the use case.
Let me know if switching to 5.1 resolves the errors you are seeing.
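The version mismatch described above (MET 11.0.1 installed, 11.1.0 required) is the kind of thing a wrapper script could check before launching a use case. Here is a rough sketch, assuming the installed and required versions are already available as plain dotted numeric strings; the helper names are hypothetical and nothing here queries MET or METplus itself.

```python
def parse_version(text, width=3):
    """Turn a dotted numeric version such as "11.1.0" into a tuple of ints.

    Short versions are padded with zeros so "5.1" compares equal to "5.1.0".
    Assumes purely numeric components (no "-beta" style suffixes).
    """
    parts = [int(piece) for piece in text.strip().split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


# Versions taken from this thread: MET 11.0.1 installed, 11.1.0 required.
if not meets_minimum("11.0.1", "11.1.0"):
    print("Installed MET is older than the version this use case requires.")
```

Comparing tuples of integers avoids the string-comparison trap where "11.10.0" would sort before "11.9.0".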
-
Will do. Those versions were only installed on Jet as of yesterday. Out of curiosity, what new features in 5.1 are required? I know that the AMDAR PBLH use case was only added in 5.1, but I thought that it was initially developed with 5.0.1?
(Sent in reply to Dan Adriaansen's comment above, dated Thu, Aug 31, 2023, 3:45 PM.)
-
FYI, we've also been experiencing development issues with Python embedding. All of the Python embedding examples are segfaulting inside the Docker image we use for testing in GitHub Actions. I wanted to check whether Molly's experience and our development issues were related, but I was able to successfully run a basic example of Python embedding on Jet:
That command runs without error, whereas it fails in our development GHA Docker environment.
So we are already catching some bad behavior... but apparently not all of it.
-
@DanielAdriaansen @JohnHalleyGotway Running with the newest versions of MET and METplus did fix the segfault! Thank you for suggesting that. They weren't available on Jet until yesterday, but I had thought that this use case was developed with the prior versions. Thanks again for the help!