Image Processing

Overview

The pipeline chosen for processing the captured frames is simple yet effective, and consists of three stages:

Color Filtering:

Suppresses the green and blue channels to isolate the red component.

Thresholding:

Segments the object of interest.

Morphological Filtering:

Applies erosion followed by dilation to remove residual noise.

Despite its simplicity, this approach achieves strong segmentation of the target object.

Screenshot of the final result

Color Filtering

Color filtering is a technique used to selectively enhance or suppress specific color channels in an image. In this pipeline, the RGB color space is employed to emphasize the red channel and reduce background noise, thereby increasing contrast between the object of interest (the pendulum) and the background.

Since the pendulum is represented by a Pokéball (a sphere with red and white hemispheres), isolating the red channel effectively enhances the object and diminishes background interference.

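# OpenCV stores channels in BGR order: index 0 = blue, 1 = green, 2 = red.
# Each channel is scaled by its filter gain and clipped back to the 0-255 range.
filtered_im[:, :, 0] = np.clip(filtered_im[:, :, 0] * blue_filter, 0, 255)
filtered_im[:, :, 1] = np.clip(filtered_im[:, :, 1] * green_filter, 0, 255)
filtered_im[:, :, 2] = np.clip(filtered_im[:, :, 2] * red_filter, 0, 255)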
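
As a self-contained illustration of the channel scaling above, here is a minimal NumPy sketch. The function name `apply_color_filter`, the default gains (0 for blue and green, 1 for red), and the two-pixel sample image are assumptions made for this example; the gains actually used in the project are not stated here.

```python
import numpy as np

def apply_color_filter(image_bgr, blue_filter=0.0, green_filter=0.0, red_filter=1.0):
    """Scale each BGR channel by its gain and clip back into the uint8 range.

    The default gains are illustrative assumptions, not the project's settings.
    """
    gains = np.array([blue_filter, green_filter, red_filter], dtype=np.float32)
    # Broadcasting multiplies every pixel's (B, G, R) triple by the three gains.
    scaled = image_bgr.astype(np.float32) * gains
    return np.clip(scaled, 0, 255).astype(np.uint8)

# Hypothetical 1x2 BGR image: one reddish pixel and one bluish pixel.
sample = np.array([[[10, 20, 200], [200, 20, 10]]], dtype=np.uint8)
filtered = apply_color_filter(sample)
print(filtered)
```

Working in floating point before clipping avoids uint8 wrap-around when a gain above 1 pushes a value past 255.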

Threshold Filtering

The next step in our pipeline is thresholding.

This is one of the simplest segmentation techniques. It separates the object from the background based on intensity: pixels above a chosen threshold become white, and all others become black.

# Convert to grayscale and apply threshold
gray_im = cv2.cvtColor(filtered_im, cv2.COLOR_BGR2GRAY)

_, thresholded_im = cv2.threshold(gray_im, threshold_value_setting, max_binary_value, threshold_type_value)

OpenCV provides several thresholding methods (e.g., Binary, Binary Inverted, Truncate). We chose Binary Thresholding (cv2.THRESH_BINARY), where pixels in gray_im above threshold_value_setting are set to max_binary_value (255 here) and all others to 0. Although this is effective, some grainy residual noise usually remains after thresholding, which we remove with erosion and dilation.
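
To make the two steps concrete, the sketch below reproduces them in plain NumPy rather than calling OpenCV. The helper names `bgr_to_gray` and `binary_threshold`, the sample pixels, and the threshold of 50 are assumptions for illustration only. The grayscale weights are the standard luma coefficients (0.299 R + 0.587 G + 0.114 B) that `cv2.COLOR_BGR2GRAY` is documented to use; rounding details may differ slightly from OpenCV's fixed-point arithmetic.

```python
import numpy as np

def bgr_to_gray(image_bgr):
    """Weighted sum of the B, G, R channels using the standard luma weights."""
    weights = np.array([0.114, 0.587, 0.299], dtype=np.float32)  # B, G, R order
    gray = image_bgr.astype(np.float32) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def binary_threshold(gray, threshold_value, max_value=255):
    """Binary thresholding: strictly greater than the threshold -> max_value, else 0."""
    return np.where(gray > threshold_value, max_value, 0).astype(np.uint8)

# Hypothetical red-only pixels, as they might look after the color filter.
sample = np.array([[[0, 0, 200], [0, 0, 10]]], dtype=np.uint8)
gray = bgr_to_gray(sample)
mask = binary_threshold(gray, threshold_value=50)  # 50 is an illustrative value
print(gray, mask)
```

One consequence worth keeping in mind: if the blue and green channels have been zeroed, a fully saturated red pixel maps to a gray level of only about 76, so the threshold has to sit below that value to keep any pixels.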

Erosion Followed by Dilation (Opening)

Erosion and dilation are complementary morphological operations that operate on binary images. Erosion removes pixels on object boundaries, while dilation adds pixels to these boundaries. When combined (erosion followed by dilation), they form an opening operation, which is useful for removing noise smaller than the structuring element.
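
The effect is easiest to see on a tiny binary image. The following is a pure NumPy sketch, not the OpenCV implementation: the `erode` and `dilate` helpers, the 3 × 3 square structuring element, and the test image are all assumptions for illustration. It also assumes an odd-sized, symmetric structuring element and treats pixels outside the image as background, which differs from OpenCV's default border handling.

```python
import numpy as np

def erode(binary, element):
    """Keep a pixel only if every pixel under the structuring element is set."""
    kh, kw = element.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(binary, ((ph, ph), (pw, pw)), constant_values=0)
    out = np.ones_like(binary)
    for dy in range(kh):
        for dx in range(kw):
            if element[dy, dx]:
                out &= padded[dy:dy + binary.shape[0], dx:dx + binary.shape[1]]
    return out

def dilate(binary, element):
    """Set a pixel if any pixel under the structuring element is set."""
    kh, kw = element.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(binary, ((ph, ph), (pw, pw)), constant_values=0)
    out = np.zeros_like(binary)
    for dy in range(kh):
        for dx in range(kw):
            if element[dy, dx]:
                out |= padded[dy:dy + binary.shape[0], dx:dx + binary.shape[1]]
    return out

# A 3x3 object plus a single noise pixel in the top-right corner.
clean = np.zeros((7, 7), dtype=np.uint8)
clean[2:5, 2:5] = 1
noisy = clean.copy()
noisy[0, 6] = 1

element = np.ones((3, 3), dtype=np.uint8)
eroded = erode(noisy, element)     # the noise pixel vanishes, the object shrinks to its center
opened = dilate(eroded, element)   # the object grows back to its original 3x3 extent
print(opened)
```

The noise pixel is smaller than the structuring element, so erosion deletes it and dilation has nothing to restore, while the 3 × 3 object survives the round trip.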

The opening operation is defined as:

$$ A \circ B = (A \ominus B) \oplus B $$

where A is the input image, B is the structuring element, ⊖ denotes erosion, and ⊕ denotes dilation. This operation removes small noise artifacts while largely preserving the shape of the main object.

The code below performs opening:

# Apply morphological transformations
element = cv2.getStructuringElement(morph_shape_val, (2 * kernel_val + 1, 2 * kernel_val + 1), (kernel_val, kernel_val))

eroded_im = cv2.erode(thresholded_im, element)
dilated_im = cv2.dilate(eroded_im, element)

The morph_shape_val argument of cv2.getStructuringElement can be cv2.MORPH_RECT, cv2.MORPH_CROSS, or cv2.MORPH_ELLIPSE. After testing the options, cv2.MORPH_ELLIPSE was selected because it gave the best results.

Structuring element obtained with cv2.MORPH_ELLIPSE and kernel_val = 3 (a 7 × 7 kernel):

[[0 0 0 1 0 0 0]
 [0 1 1 1 1 1 0]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1]
 [0 1 1 1 1 1 0]
 [0 0 0 1 0 0 0]]
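
For intuition about where this shape comes from, the mask above can be reproduced by filling, on each row, the pixels that lie within the rounded half-width of an ellipse inscribed in the kernel. The sketch below is an assumption-laden illustration: `elliptical_element` is a hypothetical helper, it is not claimed to be OpenCV's implementation, and other OpenCV versions may produce a slightly different mask.

```python
import math
import numpy as np

def elliptical_element(kernel_val):
    """Build a (2k+1) x (2k+1) elliptical mask row by row (illustrative sketch)."""
    size = 2 * kernel_val + 1
    r = c = kernel_val  # center row and column, also the ellipse radii
    element = np.zeros((size, size), dtype=np.uint8)
    for i in range(size):
        dy = i - r
        # Half-width of the ellipse on this row, rounded to the nearest pixel.
        dx = round(c * math.sqrt((r * r - dy * dy) / (r * r))) if r > 0 else 0
        element[i, max(c - dx, 0):min(c + dx + 1, size)] = 1
    return element

print(elliptical_element(3))
```

With kernel_val = 3 the rounded half-widths per row are 0, 2, 3, 3, 3, 2, 0, which gives the single pixel at the top and bottom and the full-width rows in the middle.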
