Possible non-closing of a file #251

Open
andrew-platt opened this issue Jun 22, 2023 · 11 comments


andrew-platt commented Jun 22, 2023

When using the ROSCO controller with OpenFAST in AMR-Wind, we ran into an issue where ROSCO ran out of unit numbers when checkpoint files were written at every timestep. This was noticed by @shenpai35.

Details:

  • OpenFAST 3.5.0
  • ROSCO 2.8.0
  • Model: IEA15
  • Unsure of AMR-Wind version

Observations:

  • When the upper limit in ROSCO's GetNewUnit() subroutine is 99, GetNewUnit() returns an error when trying to write checkpoint file 89.
  • When the upper limit in ROSCO's GetNewUnit() subroutine is 1024, GetNewUnit() returns an error when trying to write checkpoint file 1014.
  • If we switch models so that ROSCO is not used, we don't run into this issue. The NREL 5MW model runs with the above combination of OpenFAST, ROSCO, and AMR-Wind, and writes checkpoint files without issue throughout the entire simulation (4k+ checkpoint files from the 5MW controller and OpenFAST).
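The two failure points above are consistent with one unit number being leaked per checkpoint. A minimal sketch of that reasoning follows, using a toy allocator that scans for a free unit between 10 and an upper limit. The `UnitPool` class and `checkpoints_until_failure` helper are hypothetical illustrations, not ROSCO code, and the exact failure index depends on how checkpoints are counted and on which units are already in use.

```python
class UnitPool:
    """Toy model of a unit-number allocator (hypothetical, not ROSCO code)."""

    def __init__(self, first=10, last=99):
        self.first = first
        self.last = last
        self.open_units = set()

    def get_new_unit(self):
        # Scan for the lowest unit number that is not currently open.
        for unit in range(self.first, self.last + 1):
            if unit not in self.open_units:
                self.open_units.add(unit)
                return unit
        raise RuntimeError(
            f"unable to find an open file unit specifier "
            f"between {self.first} and {self.last}"
        )

    def close(self, unit):
        self.open_units.discard(unit)


def checkpoints_until_failure(last, leak, max_steps=5000):
    """Return the 1-based checkpoint at which allocation fails, or None."""
    pool = UnitPool(first=10, last=last)
    for step in range(1, max_steps + 1):
        try:
            unit = pool.get_new_unit()
        except RuntimeError:
            return step
        if not leak:
            pool.close(unit)  # a correct writer releases its unit
    return None


if __name__ == "__main__":
    print(checkpoints_until_failure(99, leak=True))
    print(checkpoints_until_failure(1024, leak=True))
    print(checkpoints_until_failure(99, leak=False))
```

In this toy model the two leaking runs fail 925 checkpoints apart, which matches the gap between the reported failures (1014 - 89 = 925), while the non-leaking run never fails.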

Error:

ROSCO:PitchControl:PitchSaturation:interp1d:interp1d:interp1d:interp1d:interp1d:VariableSpeedControl:WindSpeedEstimator:AeroDynTorque:interp1d:interp1d:interp1d:GetNewUnit() was unable to find an open file unit specifier between 10 and 99.

Conclusion:
Considering the above observations, I find it most likely that ROSCO is not closing some file.

  1. Maybe ROSCO isn't closing a checkpoint file (this was our first guess). I don't see how that could happen given the code here: https://github.com/NREL/ROSCO/blob/v2.8.0/ROSCO/src/ROSCO_IO.f90#L31C1-L37. The only explanation I can come up with is that it somehow returns before reaching the close on line 275. That might be possible if the OPEN statement returns a non-zero ErrStat but actually opens the file. However, that would result in nothing being written to the file, and most files appear to have something in them. This might be a red herring.
  2. Given the error above, there is another file getting opened every timestep, but not getting closed. Maybe something in Interp1D? I lean towards this possibility, but have not really looked into it (I'll take a quick look and see if anything is obvious and report if I find anything).
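Both possibilities share one mechanism: a routine opens a handle and then leaves through a path that skips the close. A minimal sketch of that pattern and a guarded alternative follows. The function names and the `payload is None` error condition are hypothetical stand-ins (for example, for a non-zero ErrStat), and in-memory buffers are used in place of real files.

```python
import io


def write_checkpoint_leaky(registry, payload):
    """Open a handle, then return early on error without closing it."""
    handle = io.StringIO()
    registry.append(handle)
    if payload is None:  # stand-in for a non-zero error status
        return False     # early return: the handle stays open
    handle.write(payload)
    handle.close()
    return True


def write_checkpoint_safe(registry, payload):
    """Same logic, but the handle is closed on every exit path."""
    handle = io.StringIO()
    registry.append(handle)
    try:
        if payload is None:
            return False
        handle.write(payload)
        return True
    finally:
        handle.close()


if __name__ == "__main__":
    leaky, safe = [], []
    write_checkpoint_leaky(leaky, None)
    write_checkpoint_safe(safe, None)
    print("leaky handle closed:", leaky[0].closed)
    print("safe handle closed:", safe[0].closed)
```

If something like the leaky path ran once per timestep, each call would hold on to one more handle, which is the behaviour the unit-number exhaustion suggests.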

Also one minor thing: the ErrMsg from line 35 of ROSCO_IO.f90 never gets returned.

@dzalkind
Collaborator

dzalkind commented Jul 6, 2023

So, I borrowed this and added it here, but from what I understand, you have tried that, and it doesn't fix the issue?

@dzalkind
Collaborator

dzalkind commented Jul 6, 2023

Would you happen to have a non-AMRWind setup that I can test? The error messaging is certainly not what it should be, and it makes me wonder if there are other errors stopping ROSCO and preventing files from being closed. Also, it looks like the debug outputs are never closed. I would set LoggingLevel to 0 as a workaround.

@andrew-platt
Author

Increasing the maximum unit number only had the effect of postponing the unit number issue.

Unfortunately I don't have a non-AMRWind setup to test with. Perhaps it would be possible to run the IEA15 with OpenFAST and ROSCO and write out a checkpoint file every timestep (I'm not convinced this would actually show the problem though).

As you mention, maybe the debug outputs are the issue here (I don't know what LoggingLevel was used). Would a new debug file be opened after a checkpoint, or only once per simulation?

Coincidentally, this issue only appeared due to a bug whereby AMRWind requested checkpoints from OpenFAST at every timestep (we found a workaround for it). So it is somewhat unlikely that this issue would be a problem for most other use cases.

@dzalkind
Collaborator

Hi @andrew-platt,

I finally got around to looking at this, and as you suspected, I haven't been able to reproduce the error using only OpenFAST and ROSCO. The unit number associated with the checkpoint file does not increase during the simulation.

Given the long error message, I wonder if ROSCO cannot kill the simulation properly, which leads to files not being closed.

If there's a way to quickly run the AMRWind set up causing this issue, I could look into it more closely.

Best, Dan

@abhineet-gupta
Collaborator

Hi @andrew-platt,
I am looking deeper into this issue. Would it be possible to share the files that replicate it?
I am surprised that you were able to get OpenFAST 3.5.0 working with AMR-Wind (Exawind/amr-wind#850). I have tried to build it and I see the same error as in that issue.

Best,
Abhineet

@andrew-platt
Author

Hi @abhineet-gupta,

Unfortunately I don't have the input files for this AMR-Wind simulation (this was done by an intern during the summer). One other issue we discovered when using AMR-Wind was that it was requesting checkpoint files at every timestep. By changing the input file for AMR-Wind to only request checkpoint files every 99999999 steps, we were able to prevent the checkpoints from getting written and not trigger this issue.

Andy

@momemo1996

Hi!
I am running SOWFA coupled with OpenFAST (BeamDyn), with ROSCO as the controller, and I get the same issue. @andrew-platt where is that input set in your simulation, and do you know if it is somewhere in the .fst file for the IEA 15MW?
\Memo

@andrew-platt
Author

andrew-platt commented Nov 9, 2023

Hi @momemo1996,

The input for setting the checkpoint I was referring to is in the AMR-Wind input file (for AMR-Wind coupled to OpenFAST). For SOWFA coupled to OpenFAST, I'm not entirely certain where the checkpoint request is. Perhaps @mchurchf can give some guidance on how to set checkpoints with SOWFA?

It's also possible that what you are seeing is a different issue. Which version of SOWFA and OpenFAST are you using?

Regards,

@momemo1996

Hi @andrew-platt
I am getting the exact same error as described in the thread. I found that the setting is in the constant/turbineArrayProperties file. I will report back here on whether your solution solves the issue, but it is still strange.
Regards,
Memo

@dzalkind
Collaborator

This issue keeps popping up when using AMR Wind and writing many checkpoint files.

The current workaround is to reduce the number of checkpoint files that are being written.

I have a hunch that the checkpoint file is not being written or read completely here, and there is no error catching in this part of the code to stop ROSCO and throw a meaningful error.
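As an illustration of the kind of error catching that is missing, here is a minimal sketch of a checkpoint reader that verifies the record is complete before using it. The record layout (a 4-byte count followed by that many 8-byte doubles) and the function names are assumptions made for this example only and do not reflect ROSCO's actual checkpoint format.

```python
import struct


def pack_checkpoint(values):
    """Serialize values as a count header followed by little-endian doubles."""
    body = struct.pack(f"<{len(values)}d", *values)
    return struct.pack("<I", len(values)) + body


def unpack_checkpoint(blob):
    """Deserialize a checkpoint, raising a clear error if it is incomplete."""
    if len(blob) < 4:
        raise ValueError("checkpoint truncated: missing header")
    (count,) = struct.unpack_from("<I", blob, 0)
    expected = 4 + 8 * count
    if len(blob) != expected:
        raise ValueError(
            f"checkpoint incomplete: expected {expected} bytes, got {len(blob)}"
        )
    return list(struct.unpack_from(f"<{count}d", blob, 4))


if __name__ == "__main__":
    blob = pack_checkpoint([1.0, 2.5])
    print(unpack_checkpoint(blob))
    try:
        unpack_checkpoint(blob[:-3])  # simulate a partially written file
    except ValueError as err:
        print("rejected:", err)
```

Failing loudly at this point would stop the run with a message that names the bad checkpoint, rather than letting later code fail somewhere unrelated.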

We still don't have an easy way to reproduce this error, so if anyone can easily share that, it would be greatly appreciated.

@momemo1996

@andrew-platt
I managed to solve this by increasing the checkpoint-writing interval in the settings to a very high value, so that no checkpoint files are written. The setting is in turbineArrayProperties in the constant folder.
Thanks
