mrcfile 1.5.3 version incompatibility #76

Open
MJoosten opened this issue Jan 20, 2025 · 3 comments

@MJoosten

We found that the latest version of mrcfile (1.5.3) breaks MRC file I/O in Parakeet: creating the memory-mapped output file fails (see the error log below). The same call to parakeet.export works fine with mrcfile 1.5.0. We use Parakeet v1.5.9.

Simulation started at: Mon Jan 20 09:54:33 CET 2025
Reading data from /home/tnw-nb4020-01/data/simulation_results/beta_galactocidase/updated_parakeet/test_op3/000000_image.h5
Writing data to /home/tnw-nb4020-01/data/simulation_results/beta_galactocidase/updated_parakeet/test_op3/000000.mrc
Traceback (most recent call last):
File "/home/tnw-nb4020-01/miniconda3/envs/roodmus_dev/bin/parakeet.export", line 8, in <module>
sys.exit(export())
^^^^^^^^
File "/home/tnw-nb4020-01/parakeet/src/parakeet/command_line/_export.py", line 415, in export
export_impl(get_parser().parse_args(args=args))
File "/home/tnw-nb4020-01/parakeet/src/parakeet/command_line/_export.py", line 326, in export_impl
writer = parakeet.io.new(
^^^^^^^^^^^^^^^^
File "/home/tnw-nb4020-01/parakeet/src/parakeet/io.py", line 1145, in new
return MrcFileWriter(filename, shape, pixel_size, dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tnw-nb4020-01/parakeet/src/parakeet/io.py", line 609, in __init__
self.handle = mrcfile.new_mmap(
^^^^^^^^^^^^^^^^^
File "/home/tnw-nb4020-01/miniconda3/envs/roodmus_dev/lib/python3.11/site-packages/mrcfile/load_functions.py", line 339, in new_mmap
mrc.set_extended_header(extended_header)
File "/home/tnw-nb4020-01/miniconda3/envs/roodmus_dev/lib/python3.11/site-packages/mrcfile/mrcmemmap.py", line 72, in set_extended_header
self._open_memmap(data_copy.dtype, data_copy.shape)
File "/home/tnw-nb4020-01/miniconda3/envs/roodmus_dev/lib/python3.11/site-packages/mrcfile/mrcmemmap.py", line 133, in _open_memmap
raise ex
File "/home/tnw-nb4020-01/miniconda3/envs/roodmus_dev/lib/python3.11/site-packages/mrcfile/mrcmemmap.py", line 123, in _open_memmap
self._data = np.memmap(self._iostream,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tnw-nb4020-01/miniconda3/envs/roodmus_dev/lib/python3.11/site-packages/numpy/core/memmap.py", line 267, in __new__
mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: mmap offset is greater than file size
Simulation ended at: Mon Jan 20 10:23:47 CET 2025
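
The last frame of the traceback is Python's own mmap module, so the error message itself can be reproduced without mrcfile or numpy. The sketch below is only an illustration of when the standard library raises this kind of ValueError: it asks for a mapping of "everything from this offset to the end of the file" (length 0) with the offset placed exactly at the end of the file. Whether this is precisely the situation mrcfile 1.5.3 runs into is an assumption; the helper names `map_from_offset` and `demo` are hypothetical.

```python
import mmap
import os
import tempfile


def map_from_offset(path, offset):
    """Try to map `path` from `offset` to the end of the file.

    Returns None on success, or the text of the ValueError raised by mmap.
    """
    with open(path, "rb") as fid:
        try:
            # A length of 0 asks mmap to map from `offset` to the end of the file.
            mm = mmap.mmap(fid.fileno(), 0, access=mmap.ACCESS_READ, offset=offset)
        except ValueError as ex:
            return str(ex)
        mm.close()
        return None


def demo():
    """Map a two-page file once from inside the file and once from its very end."""
    # Offsets passed to mmap must be multiples of the allocation granularity.
    granularity = mmap.ALLOCATIONGRANULARITY
    fd, path = tempfile.mkstemp(suffix=".bin")
    try:
        with os.fdopen(fd, "wb") as fid:
            fid.write(b"\0" * (2 * granularity))
        inside = map_from_offset(path, granularity)      # offset within the file
        at_end = map_from_offset(path, 2 * granularity)  # offset == file size
    finally:
        os.remove(path)
    return inside, at_end


if __name__ == "__main__":
    inside, at_end = demo()
    print("offset inside file:", inside)
    print("offset at end of file:", at_end)
```

The first call succeeds because there is data after the offset; the second has nothing left to map and is rejected.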

@jmp1985
Collaborator

jmp1985 commented Jan 21, 2025

Hi @MJoosten

I also see this behaviour. It seems to be related to the extended header. This script fails:

import mrcfile
import numpy as np

FEI_EXTENDED_HEADER_DTYPE = mrcfile.dtypes.get_ext_header_dtype(b"FEI1")

filename = "test.mrc"
shape = (20, 4096, 4096)
dtype = np.dtype("float32")

extended_header = np.zeros(shape=shape[0], dtype=FEI_EXTENDED_HEADER_DTYPE)
extended_header["Metadata size"] = extended_header.dtype.itemsize

handle = mrcfile.new_mmap(
    filename,
    shape=shape,
    mrc_mode=mrcfile.utils.mode_from_dtype(dtype),
    overwrite=True,
    extended_header=extended_header,
    exttyp=b"FEI1",
)

print("Writing")
for j in range(shape[0]):
    print(j)
    handle.data[j] = np.random.uniform(0, 1, size=shape[1:])

But this script works:

import mrcfile
import numpy as np

filename = "test.mrc"
shape = (20, 4096, 4096)
dtype = np.dtype("float32")

handle = mrcfile.new_mmap(
    filename,
    shape=shape,
    mrc_mode=mrcfile.utils.mode_from_dtype(dtype),
    overwrite=True,
)

print("Writing")
for j in range(shape[0]):
    print(j)
    handle.data[j] = np.random.uniform(0, 1, size=shape[1:])

@MJoosten
Author

Yes, after looking at the trace more carefully, it seems to be a problem with mrcfile rather than anything in Parakeet. In hindsight I probably should have realised that and just reported it to Colin instead. I haven't tested it, but I'm guessing it may have to do with newer versions of numpy.

@jmp1985
Collaborator

jmp1985 commented Jan 22, 2025

So Colin had a look at this and apparently it is an issue with specific combinations of numpy and mrcfile versions. See here for a more in-depth explanation (ccpem/mrcfile#65).

The upshot is that if you have numpy v2.2.2 installed then it should be fine; otherwise you will need to either use the master branch version of mrcfile or wait for Colin to publish a new release.

I've also added numpy>=2.2.2 to the list of dependencies for parakeet.

Let me know if updating numpy works for you and I'll close this issue.
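
To check which side of that version boundary an environment is on, something like the following could be run inside it. This is a minimal sketch using only the standard library: the 2.2.2 threshold is taken from the comment above, treating it as a lower bound follows the numpy>= pin, and the helper names (`parse_release`, `numpy_meets_minimum`, `installed_version`) are hypothetical.

```python
from importlib import metadata

# Threshold taken from this thread; treated here as a lower bound (assumption).
MIN_NUMPY = (2, 2, 2)


def parse_release(version):
    """Return the leading numeric components, e.g. '2.3.0rc1' -> (2, 3, 0)."""
    parts = []
    for piece in version.lstrip("v").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def numpy_meets_minimum(version, minimum=MIN_NUMPY):
    """True if the given numpy version string is at least `minimum`."""
    return parse_release(version) >= minimum


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("numpy", "mrcfile"):
        print(name, installed_version(name))
    numpy_version = installed_version("numpy")
    if numpy_version is not None and not numpy_meets_minimum(numpy_version):
        print("numpy is older than 2.2.2; consider upgrading numpy")
```

If numpy cannot be upgraded, the original report notes that mrcfile 1.5.0 worked, so pinning that version is the other route mentioned in this thread.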
