Aggregate LaMEM output files & write them back to disk in compressed form #19

Open
boriskaus opened this issue Nov 15, 2023 · 0 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Currently, LaMEM writes a lot of output files, because each processor writes its part of the grid to disk at every requested timestep. The files are in binary format but not compressed.

That is bad for most HPC systems, whose filesystems typically handle a few large files better than many small ones. Changing this within LaMEM itself is cumbersome, as it involves copying all files to rank 0. We have looked into other file formats such as HDF5, but i) installing it alongside PETSc is (or was) tricky and ii) it still doesn't generate files that ParaView can open directly, which is a must for a good user experience.

One potential workaround is to use the WriteVTK package, which can write files to disk in compressed binary format and so saves space. In addition, LaMEM.jl already opens the many files for a given timestep and reads them into a single structure.

The idea is thus to write a routine that automatically processes all timesteps in a LaMEM simulation and:

  1. Reads a given timestep with all fields.
  2. Writes that timestep back as a single compressed file in the same (timestep) directory.
  3. Generates a new *.pvtr file in the same directory that points to the single new file rather than to the many smaller ones.
  4. Deletes the smaller *.vtr files and renames the new *.pvtr file to the name of the old *.pvtr file.
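Steps 3 and 4 are mostly bookkeeping on the VTK XML index file, so they can be sketched independently of how the merged file is produced. The snippet below is only an illustration in Python (standard library only), not the proposed LaMEM.jl implementation. It assumes the *.pvtr file is a plain-XML `PRectilinearGrid` index with one `<Piece>` entry per rank, and that the merged *.vtr file from step 2 already exists. The function names and the merged file name are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def point_pvtr_to_single_file(pvtr_path, merged_name):
    """Rewrite a .pvtr index so that it references one merged .vtr file.

    Returns the file names of the per-rank pieces the index listed before.
    """
    pvtr_path = Path(pvtr_path)
    tree = ET.parse(pvtr_path)
    grid = tree.getroot().find("PRectilinearGrid")
    if grid is None:
        raise ValueError(f"{pvtr_path} is not a PRectilinearGrid index")
    whole_extent = grid.get("WholeExtent")
    if whole_extent is None:
        raise ValueError(f"{pvtr_path} has no WholeExtent attribute")

    # Drop the per-rank <Piece> entries, remembering which files they named.
    old_pieces = grid.findall("Piece")
    old_sources = [piece.get("Source") for piece in old_pieces]
    for piece in old_pieces:
        grid.remove(piece)

    # The single merged piece covers the whole grid.
    ET.SubElement(grid, "Piece", Extent=whole_extent, Source=merged_name)

    # Write under a temporary name, then replace the old index so the
    # file name referenced by the *.pvd file stays the same (step 4).
    tmp_path = pvtr_path.with_name(pvtr_path.name + ".tmp")
    tree.write(tmp_path, xml_declaration=True, encoding="utf-8")
    tmp_path.replace(pvtr_path)
    return old_sources


def remove_old_pieces(directory, old_sources, merged_name):
    """Delete the per-rank .vtr files, but only if the merged file exists."""
    directory = Path(directory)
    if not (directory / merged_name).is_file():
        raise FileNotFoundError(f"merged file {merged_name} is missing")
    for name in old_sources:
        if name != merged_name:
            (directory / name).unlink(missing_ok=True)
```

For one timestep directory, you would call `point_pvtr_to_single_file` on the existing *.pvtr file and pass the returned list to `remove_old_pieces`. Checking that the merged file exists before deleting anything is a deliberate choice here, since the per-rank files are the only copy of the data until step 2 has succeeded.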

As a result, you should still be able to open the *.pvd file as before, but each timestep will now consist of one file rather than many smaller ones. This should load faster in ParaView and use less disk space.

Making this routine part of LaMEM.jl would let you postprocess a simulation once it has finished.
