
Casting memmaps in ArrayProxies is slower than loading into memory first (optimization opportunity) #1371

Open
clane9 opened this issue Oct 7, 2024 · 6 comments

Comments


clane9 commented Oct 7, 2024

It seems that reading uncompressed NIfTI volumes takes significantly longer when memory mapping is enabled. In this example, I load four large 4D NIfTI volumes from the ABCD dataset, with and without memory mapping.

import time

import nibabel as nib
import numpy as np

print(nib.__version__)


def benchmark_nibabel_load(path: str, mmap: bool):
    tic = time.monotonic()
    img = nib.load(path, mmap=mmap)
    data = np.asarray(img.get_fdata())
    rt = time.monotonic() - tic
    print(f"mmap: {mmap}, run time: {rt:.3f}s")


paths = [
    "/ABCD/sub-NDARINV0CCVJ39W/ses-2YearFollowUpYArm1/func/sub-NDARINV0CCVJ39W_ses-2YearFollowUpYArm1_task-rest_run-04_bold.nii",
    "/ABCD/sub-NDARINV0YE7L9KU/ses-2YearFollowUpYArm1/func/sub-NDARINV0YE7L9KU_ses-2YearFollowUpYArm1_task-rest_run-04_bold.nii",
    "/ABCD/sub-NDARINV0F82C6R8/ses-2YearFollowUpYArm1/func/sub-NDARINV0F82C6R8_ses-2YearFollowUpYArm1_task-rest_run-02_bold.nii",
    "/ABCD/sub-NDARINV0V80916L/ses-baselineYear1Arm1/func/sub-NDARINV0V80916L_ses-baselineYear1Arm1_task-rest_run-04_bold.nii",
]

mmaps = [True, False, True, False]

for path, mmap in zip(paths, mmaps):
    benchmark_nibabel_load(path, mmap=mmap)

This is what I get, using nibabel v5.2.1:

mmap: True, run time: 90.764s
mmap: False, run time: 3.595s
mmap: True, run time: 37.405s
mmap: False, run time: 4.810s

Any idea why this might happen?
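For reference, the two access patterns can be sketched with numpy alone, no nibabel involved (the file, shape, and sizes below are made up for illustration): casting a memmap to float64 versus reading the raw bytes into memory first and then casting.

```python
# Numpy-only sketch of the two access patterns (hypothetical file/shape).
import os
import tempfile
import time

import numpy as np

# Write a synthetic int16 volume to disk, standing in for a .nii body.
shape = (64, 64, 32, 100)
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as f:
    path = f.name
np.arange(np.prod(shape), dtype=np.int16).reshape(shape).tofile(path)

# Pattern 1: memmap, then cast -- the cast faults pages in as it goes
# (roughly what mmap=True followed by a float64 conversion does).
tic = time.monotonic()
mm = np.memmap(path, dtype=np.int16, mode="r", shape=shape)
a = mm.astype(np.float64)
t_mmap = time.monotonic() - tic

# Pattern 2: one sequential read, then cast (roughly mmap=False).
tic = time.monotonic()
b = np.fromfile(path, dtype=np.int16).reshape(shape).astype(np.float64)
t_read = time.monotonic() - tic

print(f"mmap+cast: {t_mmap:.3f}s, read+cast: {t_read:.3f}s")
assert np.array_equal(a, b)  # both paths produce identical data
del mm
os.remove(path)
```

On a warm page cache the two timings may be close; a large gap shows up mostly on cold reads from slow storage.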


effigies commented Oct 10, 2024

.get_fdata() casts to a floating point dtype, float64 by default. You may prefer just np.asanyarray(img.dataobj), which will ensure that scaling is applied but preserve the memmap if no scaling is necessary.

If you know the dtype you want to operate in, np.int16(img.dataobj) (for example) would work. If you're targeting float32, img.get_fdata(dtype='float32') will work and cache the result for the next call.


clane9 commented Oct 15, 2024

Thanks @effigies, I think these are good strategies for interacting with the memmap data. But I'm still confused about why the simple get_fdata() is so much slower when mmap=True (the default)?

Regardless, I will just set mmap=False for my case and avoid the issue, so it's not critical to get to the bottom of it.

effigies commented:

I don't really know why the memmap is slower. I would probably profile it and see where it's spending the time.


clane9 commented Oct 16, 2024

Yeah, I did some quick profiling. With mmap=True, almost all of the time is spent here, in what looks like the casting:

[profiler screenshot: nearly all time spent in the dtype cast]

With mmap=False, the casting takes less time, and most of the time is spent doing the read.

[profiler screenshot: most time spent in the file read]

I guess when the array is a memmap, astype() does an implicit read, which for some reason is slower than the explicit read you get with mmap=False.
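In line with that hypothesis, a numpy-only sketch of the workaround (hypothetical file and shape): materialize the memmap with one explicit copy before casting, rather than letting astype() drive the reads page by page.

```python
import os
import tempfile

import numpy as np

# Synthetic int16 file standing in for the uncompressed .nii body.
shape = (32, 32, 16, 50)
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as f:
    path = f.name
np.ones(shape, dtype=np.int16).tofile(path)

mm = np.memmap(path, dtype=np.int16, mode="r", shape=shape)

direct = mm.astype(np.float64)            # cast pulls pages in itself
staged = np.array(mm).astype(np.float64)  # copy into RAM first, then cast

assert np.array_equal(direct, staged)     # same result either way
del mm
os.remove(path)
```

Note that np.asarray(mm) would only return a view of the mapped memory; np.array(mm) is used here because it forces an actual copy into RAM.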

effigies commented:

I'm not sure there's much to be done here. That said, if you're up for it, you could try looking into and benchmarking other approaches to scaling, e.g., pre-allocating an array with np.empty(shape, dtype=target_dtype) and assigning its values with np.copyto.

That will move more of the branching logic (e.g., are dtypes the same) into nibabel, and it has been nice to let numpy make those decisions and trust them to be efficient. It could be worth upstreaming the solution (if we find one) into numpy's memmap.astype() implementation.
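A rough sketch of that pre-allocate-and-copy idea (this is not nibabel's actual implementation, just the shape of the experiment, with a made-up file and array): allocate the float64 output buffer up front and let np.copyto cast while copying out of the memmap.

```python
import os
import tempfile

import numpy as np

# Synthetic int16 file standing in for the on-disk data.
shape = (16, 16, 8, 20)
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as f:
    path = f.name
np.arange(np.prod(shape), dtype=np.int16).reshape(shape).tofile(path)

mm = np.memmap(path, dtype=np.int16, mode="r", shape=shape)

out = np.empty(mm.shape, dtype=np.float64)  # plain in-memory target buffer
np.copyto(out, mm)                          # int16 -> float64 cast on copy

assert np.array_equal(out, mm.astype(np.float64))  # matches the astype path
del mm
os.remove(path)
```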


clane9 commented Oct 16, 2024

Yeah, I agree, I don't think there's much worth doing at this point, especially since mmap=False is fine in this case. I just wanted to make a note, since it did confuse us for a little while before we tracked it down. Will let you know if we end up feeling motivated and decide to look for an improvement.

@effigies effigies changed the title Slow reading of uncompressed nifti volumes when memory mapping is enabled Casting memmaps in ArrayProxies is slower than loading into memory first (optimization opportunity) Oct 16, 2024