Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement convex_hull_image and convex_hull_object #828

Open
wants to merge 4 commits into
base: branch-25.04
Choose a base branch
from

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Feb 7, 2025

This MR implements a hybrid CPU + GPU implementation of convex_hull_image and convex_hull_object. These functions can be used standalone, but convex_hull_image is also used during computation of some regionprops:

  • area_convex
  • image_convex
  • feret_diameter_max
  • solidity

The implementation here still uses QHull via SciPy on the CPU, but all preprocessing computations as well as the post-processing of which image pixels fall within the convex hull polygon are done on the GPU. Fortunately, QHull was not the bottleneck, as seen in the benchmark results below.

The image below shows an example of a non-convex labeled region (yellow) with the convex region computed by convex_hull_image shown in red

convex_hull_image

Benchmark results

The last two columns labeled "accel" are the ratio of (cpu_duration / gpu_duration). "32" indicates the GPU case was run with an extra kwarg that allows 32-bit floating point when computing the final convex image mask from the convex hull coordinates. All durations are in seconds.

All except the smallest sizes tested show performance benefit relative to scikit-image. The relative performance benefit can be very large, especially for large images and in cases where the image is highly concave and if offset_coords=False as this reduces the run time of the QHull algorithm on the CPU.

shape im_type +offset coords dur. (CPU) dur. (GPU64) dur. (GPU32) accel. 64 accel. 32
(64, 64) skeleton False 0.00028 0.00040 0.00041 0.70 0.70
(64, 64) skeleton True 0.00042 0.00054 0.00054 0.78 0.78
(64, 64) convex False 0.00034 0.00046 0.00046 0.74 0.74
(64, 64) convex True 0.00052 0.00063 0.00063 0.82 0.82
(128, 128) skeleton False 0.00068 0.00083 0.00083 0.83 0.82
(128, 128) skeleton True 0.00097 0.00110 0.00111 0.88 0.87
(128, 128) convex False 0.00084 0.00096 0.00096 0.87 0.87
(128, 128) convex True 0.00115 0.00129 0.00130 0.89 0.88
(256, 256) skeleton False 0.00205 0.00074 0.00061 2.77 3.34
(256, 256) skeleton True 0.00272 0.00108 0.00097 2.51 2.79
(256, 256) convex False 0.00243 0.00082 0.00070 2.96 3.49
(256, 256) convex True 0.00308 0.00152 0.00141 2.03 2.19
(512, 512) skeleton False 0.00729 0.00084 0.00072 8.71 10.17
(512, 512) skeleton True 0.00908 0.00147 0.00135 6.18 6.71
(512, 512) convex False 0.00860 0.00101 0.00090 8.50 9.57
(512, 512) convex True 0.00978 0.00228 0.00221 4.30 4.43
(1024, 1024) skeleton False 0.02791 0.00110 0.00095 25.44 29.46
(1024, 1024) skeleton True 0.03306 0.00225 0.00212 14.67 15.58
(1024, 1024) convex False 0.03253 0.00143 0.00130 22.83 24.95
(1024, 1024) convex True 0.03481 0.00377 0.00364 9.23 9.55
(2048, 2048) skeleton False 0.10966 0.00165 0.00142 66.27 77.29
(2048, 2048) skeleton True 0.12874 0.00394 0.00367 32.71 35.04
(2048, 2048) convex False 0.12958 0.00218 0.00201 59.41 64.42
(2048, 2048) convex True 0.13364 0.00670 0.00656 19.95 20.36
(4096, 4096) skeleton False 0.44152 0.00328 0.00264 134.51 167.26
(4096, 4096) skeleton True 0.50531 0.00815 0.00727 62.00 69.47
(4096, 4096) convex False 0.51315 0.00411 0.00375 124.96 136.70
(4096, 4096) convex True 0.52371 0.01325 0.01293 39.54 40.51
(16, 16, 16) skeleton False 0.00038 0.00052 0.00051 0.72 0.74
(16, 16, 16) skeleton True 0.00074 0.00085 0.00084 0.88 0.88
(16, 16, 16) convex False 0.00081 0.00093 0.00093 0.88 0.87
(16, 16, 16) convex True 0.00099 0.00109 0.00111 0.91 0.89
(24, 24, 24) skeleton False 0.00073 0.00067 0.00057 1.08 1.28
(24, 24, 24) skeleton True 0.00148 0.00088 0.00077 1.68 1.92
(24, 24, 24) convex False 0.00190 0.00108 0.00096 1.76 1.98
(24, 24, 24) convex True 0.00211 0.00282 0.00282 0.75 0.75
(32, 32, 32) skeleton False 0.00168 0.00067 0.00057 2.50 2.95
(32, 32, 32) skeleton True 0.00364 0.00094 0.00081 3.85 4.51
(32, 32, 32) convex False 0.00475 0.00133 0.00117 3.56 4.04
(32, 32, 32) convex True 0.00469 0.00439 0.00431 1.07 1.09
(64, 64, 64) skeleton False 0.01603 0.00076 0.00060 21.09 26.57
(64, 64, 64) skeleton True 0.03546 0.00117 0.00100 30.19 35.57
(64, 64, 64) convex False 0.05167 0.00302 0.00284 17.11 18.17
(64, 64, 64) convex True 0.04797 0.01839 0.01826 2.61 2.63
(128, 128, 128) skeleton False 0.20372 0.00103 0.00076 198.54 267.39
(128, 128, 128) skeleton True 0.50452 0.00181 0.00147 279.08 342.96
(128, 128, 128) convex False 0.55845 0.00976 0.00954 57.24 58.52
(128, 128, 128) convex True 0.56635 0.08109 0.07915 6.98 7.16
(256, 256, 256) skeleton False 1.57213 0.00275 0.00154 571.76 1019.91
(256, 256, 256) skeleton True 3.50514 0.00478 0.00305 733.77 1149.93
(256, 256, 256) convex False 4.08911 0.03873 0.03705 105.58 110.38
(256, 256, 256) convex True 3.96437 0.36873 0.36824 10.75 10.77
benchmarking script
import cupy as cp
import numpy as np
from cupyx.profiler import benchmark
from skimage.morphology import convex_hull_image as convex_hull_image_cpu

from cucim.skimage.morphology import convex_hull_image


print("| shape | im_type | +offset coords | Dur. (CPU) | Dur. (GPU64) | Dur. (GPU32) | accel. 64 | accel. 32 |")
print("|-------|---------|----------------|------------|--------------|--------------|-----------|-----------|")

max_duration = 2
for shape in [(64, 64), (128, 128), (256, 256), (512, 512), (1024, 1024), (2048, 2048), (4096, 4096), (16, 16, 16), (24, 24, 24), (32, 32, 32), (64, 64, 64), (128, 128, 128), (256, 256, 256)]:
    for im_type in ["skeleton", "convex"]:
        for offset_coordinates in [False, True]:
            image = cp.zeros(shape, dtype=bool)
            ndim = len(shape)
            # generate a "plus sign" skeleton image (very non-convex)
            if ndim == 2:
                image[:, shape[1] // 2] = True
                image[shape[0] // 2, :] = True
            else:
                image[:, shape[1] // 2, shape[2] // 2] = True
                image[shape[0] // 2, shape[1] // 2, :] = True
                image[shape[0] // 2, :, shape[2] // 2] = True

            # use im_type to test image that is already convex (vs. highly non-convex if im_type="skeleton")
            if im_type == "convex":
                image = convex_hull_image(image)

            image_cpu = cp.asnumpy(image)

            kwargs=dict(offset_coordinates=offset_coordinates)
            perf_cpu = benchmark(convex_hull_image_cpu, (image_cpu,), kwargs=kwargs, n_warmup=1, max_duration=max_duration)
            duration_cpu = float(perf_cpu.gpu_times.mean())

            perf_gpu = benchmark(convex_hull_image, (image,), kwargs=kwargs, n_warmup=10, max_duration=max_duration)
            duration_gpu = float(perf_gpu.gpu_times.mean())

            kwargs=dict(offset_coordinates=offset_coordinates, float64_computation=False, omit_empty_coords_check=True)
            perf_gpu2 = benchmark(convex_hull_image, (image,), kwargs=kwargs, n_warmup=10, max_duration=max_duration)
            duration_gpu2 = float(perf_gpu2.gpu_times.mean())

            accel_f64 = duration_cpu / duration_gpu
            accel_f32 = duration_cpu / duration_gpu2
            print(f"| {shape} | {im_type} | {offset_coordinates} | {duration_cpu:0.5f} | {duration_gpu:0.5f} | {duration_gpu2:0.5f} | {accel_f64:0.2f} | {accel_f32:0.2f} |")

@grlee77 grlee77 added improvement Improves an existing functionality non-breaking Introduces a non-breaking change performance Performance improvement labels Feb 7, 2025
@grlee77 grlee77 added this to the v25.04.00 milestone Feb 7, 2025
@grlee77 grlee77 requested a review from a team as a code owner February 7, 2025 19:36
* observed that a 2D planar object within a 3D volume can cause QHull to fail
* should also be more efficient to process such 'thin' objects in a reduced dimension
* original shape is restored for the returned convex hull image
Copy link
Contributor

@gigony gigony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @grlee77 for the convex hall implementations!
It looks good to me in general. I left some feedback and suggestions in the comments.

"Returning empty image",
UserWarning,
)
return np.zeros(image.shape, dtype=bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we want to return CuPy array here?

Suggested change
return np.zeros(image.shape, dtype=bool)
return cp.zeros(image.shape, dtype=bool)

raise ValueError("`connectivity` must be between 1 and image.ndim.")

labeled_im = label(image, connectivity=connectivity, background=0)
convex_obj = cp.zeros(image.shape, dtype=bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this variable is always overwritten or not used so can be removed.

Suggested change
convex_obj = cp.zeros(image.shape, dtype=bool)

chimage3d = convex_hull_image(
image3d, offset_coordinates=True, cpu_fallback_threshold=0
)
chimage3d.sum() > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
chimage3d.sum() > 0
assert chimage3d.sum() > 0

The parameters listed under "Extra Parameters" above are present only
in cuCIM and not in scikit-image.
"""
if connectivity not in tuple(range(1, image.ndim + 1)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no validation that the input is 2D or 3D. While the error will be caught by the connectivity check, it might be clearer to add an explicit dimensionality check:

Suggested change
if connectivity not in tuple(range(1, image.ndim + 1)):
if image.ndim not in (2, 3):
raise ValueError("Input image must be 2D or 3D")
if connectivity not in tuple(range(1, image.ndim + 1)):

.. [1] https://blogs.mathworks.com/steve/2011/10/04/binary-image-convex-hull-algorithm-notes/

"""
if cpu_fallback_threshold is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth adding validation that the threshold is non-negative when provided:

Suggested change
if cpu_fallback_threshold is None:
if cpu_fallback_threshold is not None and cpu_fallback_threshold < 0:
raise ValueError("cpu_fallback_threshold must be non-negative")
if cpu_fallback_threshold is None:

to numerical floating point errors, a tolerance of 0 can result in
some points erroneously being classified as being outside the hull.
include_borders: bool, optional
If ``False``, vertices/edges are excluded from the final hull mask.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since include_borders=False case raises the NotImplementedError, it might be good to document this in the docstring.

Comment on lines +33 to +37
def _offsets_diamond(ndim):
offsets = np.zeros((2 * ndim, ndim))
for vertex, (axis, offset) in enumerate(product(range(ndim), (-0.5, 0.5))):
offsets[vertex, axis] = offset
return cp.asarray(offsets)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if np.zeros() is used here because calculating offsets in system memory and then converting them to GPU memory is more efficient than creating or manipulating a CuPy array.

Comment on lines +232 to +233
# float32 will be used if coord_dtype is <= 16-bit
# otherwise, use float64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is slightly misleading as promote_types behavior depends on the specific dtype, not just its size. It would be better to explicitly document the exact promotion rules or use more explicit logic.

.. [1] https://blogs.mathworks.com/steve/2011/10/04/binary-image-convex-hull-algorithm-notes/

"""
if cpu_fallback_threshold is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if cpu_fallback_threshold is None:
if tolerance <= 0:
raise ValueError("tolerance must be positive")
if cpu_fallback_threshold is None:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change performance Performance improvement
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

2 participants