Checkpointing from DRHDF5 is untested and can cause seg faults #1813

Open
chs-blueorigin opened this issue Dec 6, 2024 · 1 comment · May be fixed by #1837
chs-blueorigin commented Dec 6, 2024

Description

Checkpointing with a trick.DRBinary group works, whereas checkpointing with a trick.DRHDF5 group causes a seg fault. There are also currently no checkpointing tests that use the trick.DRHDF5 format. See SIM_checkpoint_data_recording for more details; it seems to use only trick.DRAscii.

This is currently occurring in a RHEL8 environment.

Steps to Reproduce

  1. Create a small simulation and let it run in real-time. Enable the GUI. Configure an input.py file that looks something like this:
group = trick.DRBinary("<YOUR_NAME>")

# track a variable
group.add_variable("<SIMOBJ>.<YOUR_VAR>")

# finish configuring logging
group.freq = trick.DR_Always
group.set_cycle(0.01)
trick.add_data_record_group(group, trick.DR_Buffer)

# sim config
trick.frame_log_on()
trick.real_time_enable()
trick.exec_set_software_frame(0.01)
trick.itimer_enable()
trick.exec_set_enable_freeze(True)
trick.exec_set_freeze_command(True)
trick.sim_control_panel_set_enabled(True)

trick.stop(30.0)
  2. Run the simulation for a few seconds (Start).
  3. Freeze the simulation (Freeze).
  4. Save a checkpoint (Dump Chkpnt).
  5. Un-freeze the simulation for a few more seconds (Start).
  6. Freeze the simulation (Freeze).
  7. Load the checkpoint (Load Chkpnt).

With trick.DRBinary, this sequence works as intended. Now, we'll do the same thing with trick.DRHDF5.

  1. Modify your input.py file such that it saves in a trick.DRHDF5 format:
group = trick.DRHDF5("<YOUR_NAME>")

# track a variable
group.add_variable("<SIMOBJ>.<YOUR_VAR>")

# finish configuring logging
group.freq = trick.DR_Always
group.set_cycle(0.01)
trick.add_data_record_group(group, trick.DR_Buffer)

# sim config
trick.frame_log_on()
trick.real_time_enable()
trick.exec_set_software_frame(0.01)
trick.itimer_enable()
trick.exec_set_enable_freeze(True)
trick.exec_set_freeze_command(True)
trick.sim_control_panel_set_enabled(True)

trick.stop(30.0)
  2. Restart the simulation and wait for the GUI to pop up.
  3. Run the simulation for a few seconds (Start).
  4. Freeze the simulation (Freeze).
  5. Save a checkpoint (Dump Chkpnt).
  6. Un-freeze the simulation and let it run for a few more seconds (Start).
  7. Freeze the simulation (Freeze).
  8. Load the checkpoint (Load Chkpnt).

Try to Start the simulation again, and check the terminal from which you launched it. The error I see when this occurs is:

Process terminated by signal SIGSEGV
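
Since the two input files above differ only in the recorder class, a small script can generate both variants so that everything else in the comparison stays identical. The following is a minimal sketch, not part of Trick: `make_input`, `differing_lines`, `INPUT_TEMPLATE`, and `RECORDERS` are hypothetical names, and the template only reuses the calls already listed in the steps. The `<YOUR_NAME>` and `<SIMOBJ>.<YOUR_VAR>` placeholders still need to be filled in for a real simulation.

```python
# Hypothetical helper (not part of Trick): builds the input.py text from the
# steps above for a chosen data-recording class.
INPUT_TEMPLATE = """\
group = trick.{recorder}("{group_name}")

# track a variable
group.add_variable("{variable}")

# finish configuring logging
group.freq = trick.DR_Always
group.set_cycle(0.01)
trick.add_data_record_group(group, trick.DR_Buffer)

# sim config
trick.frame_log_on()
trick.real_time_enable()
trick.exec_set_software_frame(0.01)
trick.itimer_enable()
trick.exec_set_enable_freeze(True)
trick.exec_set_freeze_command(True)
trick.sim_control_panel_set_enabled(True)

trick.stop(30.0)
"""

# Recorder classes named in this issue.
RECORDERS = ("DRAscii", "DRBinary", "DRHDF5")


def make_input(recorder, group_name, variable):
    """Return the input.py text for the given recorder class name."""
    if recorder not in RECORDERS:
        raise ValueError(f"unknown recorder: {recorder!r}")
    return INPUT_TEMPLATE.format(
        recorder=recorder, group_name=group_name, variable=variable
    )


def differing_lines(text_a, text_b):
    """Return (line_a, line_b) pairs for lines that differ at the same position."""
    pairs = zip(text_a.splitlines(), text_b.splitlines())
    return [(a, b) for a, b in pairs if a != b]


if __name__ == "__main__":
    binary = make_input("DRBinary", "<YOUR_NAME>", "<SIMOBJ>.<YOUR_VAR>")
    hdf5 = make_input("DRHDF5", "<YOUR_NAME>", "<SIMOBJ>.<YOUR_VAR>")
    # Show the only lines that change between the working and failing runs.
    for line_a, line_b in differing_lines(binary, hdf5):
        print(f"- {line_a}")
        print(f"+ {line_b}")
```

Writing each returned string to its own input file and repeating the GUI steps with each one keeps the recorder class as the only variable between the working and failing runs.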

@sharmeye
Contributor

Thanks for bringing this to our attention. We're working on a solution and will update once it's been tested and committed. The issue will stay open until then.
