Issue:
The current DICOM-to-NIfTI conversion process runs sequentially, which makes it very slow for large datasets. This bottleneck significantly hampers inference workflows on large-scale data.
Proposed Solution:
Implement parallel processing for DICOM-to-NIfTI conversion to fully utilize all available CPU cores. The recommended options are:
Dask: Leverage its parallel computing capabilities.
Multiprocessing: Use Python's multiprocessing library for fine-grained control.
Challenges with Dask Multiprocessing:
Attempting to use Dask's processes scheduler resulted in the following error:
TypeError: cannot pickle '_thread.RLock' object
This error occurs because Dask's multiprocessing scheduler must pickle every object it sends to worker processes, and some components (e.g., output_manager) cannot be pickled because they hold locks (_thread.RLock). We could skip the output_manager, but it is needed for the console updates, and Dask's own progress bar is less pleasant to look at.
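The failure can be reproduced without Dask. The sketch below is a minimal illustration, not the project's code: ConsoleReporter is a hypothetical stand-in for output_manager, assumed here to guard its console writes with a threading.RLock.

```python
import pickle
import threading


class ConsoleReporter:
    """Hypothetical stand-in for output_manager (assumed to hold an RLock)."""

    def __init__(self):
        self._lock = threading.RLock()
        self.messages = []

    def log(self, message):
        # The lock keeps concurrent threads from interleaving their writes.
        with self._lock:
            self.messages.append(message)


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


# Pickling fails on the RLock attribute, which is the same step a
# process-based scheduler performs before sending work to a worker.
pickle_error = try_pickle(ConsoleReporter())
print(pickle_error)
```

Any object that reaches a worker process, whether as an argument or captured in a closure, goes through this pickling step, so a single lock-holding attribute is enough to trigger the error.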
Next Steps:
Either stick with Dask's default threads scheduler for I/O-bound tasks, avoiding serialization issues.
Or carefully refactor the code to use Python's multiprocessing for CPU-bound workloads, ensuring all shared objects are serializable.
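The second option could look roughly like the sketch below, which uses concurrent.futures from the standard library. It rests on assumptions: convert_series, convert_all, and the on_progress callback are hypothetical names, and the body of convert_series is a placeholder for the real conversion call. Workers receive and return only strings, so nothing unpicklable crosses the process boundary, and progress reporting stays in the parent process, where a lock-holding object such as output_manager can be used safely.

```python
import os
from concurrent.futures import ProcessPoolExecutor, as_completed


def convert_series(series_dir, output_dir):
    """Hypothetical worker: convert one DICOM series, return the NIfTI path.

    Placeholder body: the real conversion call would go here. Only strings
    go in and out, so arguments and results pickle cleanly.
    """
    name = os.path.basename(os.path.normpath(series_dir))
    return os.path.join(output_dir, name + ".nii.gz")


def convert_all(series_dirs, output_dir, on_progress=None, executor=None):
    """Convert every series in parallel and map each input dir to its output.

    on_progress(done, total) is called in the parent process only, so it may
    use objects that cannot be pickled. A process pool sized to the CPU count
    is created unless another executor is passed in.
    """
    owns_pool = executor is None
    pool = executor if executor is not None else ProcessPoolExecutor(
        max_workers=os.cpu_count()
    )
    try:
        futures = {
            pool.submit(convert_series, d, output_dir): d for d in series_dirs
        }
        results = {}
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            if on_progress is not None:
                on_progress(done, len(futures))
        return results
    finally:
        if owns_pool:
            pool.shutdown()
```

When the default process pool is used, the call to convert_all should sit under an `if __name__ == "__main__":` guard so that platforms which spawn worker processes do not re-run it on import. Passing a ThreadPoolExecutor as executor gives the thread-based behaviour of the first option with the same progress reporting.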