TilesPatcher
The `TilesPatcher` node is responsible for collecting and merging the results from the tiled frames processed by the neural network. It performs the following steps:
- extracts bounding boxes (decodes the neural network output)
- scales and maps the output bounding boxes to global image coordinates based on `tile_index`
- performs Non-Maximum Suppression (NMS) (read more in the paper or this Medium post) to eliminate duplicate detections, then outputs the final set of bounding boxes
This method extracts bounding boxes and other necessary data from the neural network output. Bounding boxes are extracted as follows:
- Confidence Filtering: Bounding boxes with confidence scores below `self.conf_thresh` are removed.
- Box Format: The bounding boxes are converted from center-based format $(x, y, w, h)$ to corner-based format $(x_1, y_1, x_2, y_2)$ using the utility method `xywh2xyxy(x)`.
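The two extraction steps above can be sketched in plain Python. This is only an illustration, not the node's actual code: `xywh_to_xyxy` and `filter_and_convert` are hypothetical stand-ins, and the input layout (one `(x, y, w, h, conf)` tuple per detection) is an assumption.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def xywh_to_xyxy(box: Box) -> Box:
    """Convert a center-based (x, y, w, h) box to corner-based (x1, y1, x2, y2)."""
    x, y, w, h = box
    half_w, half_h = w / 2.0, h / 2.0
    return (x - half_w, y - half_h, x + half_w, y + half_h)


def filter_and_convert(
    detections: List[Tuple[float, float, float, float, float]],
    conf_thresh: float,
) -> List[Tuple[Box, float]]:
    """Drop detections scoring below conf_thresh, then convert the rest to corners.

    Assumed input layout (hypothetical): (x, y, w, h, conf) per detection.
    """
    kept = []
    for x, y, w, h, conf in detections:
        if conf < conf_thresh:
            continue  # confidence filtering: below-threshold boxes are removed
        kept.append((xywh_to_xyxy((x, y, w, h)), conf))
    return kept


if __name__ == "__main__":
    raw = [(50, 50, 20, 10, 0.9), (10, 10, 4, 4, 0.2)]
    print(filter_and_convert(raw, conf_thresh=0.5))
```

A detection whose score equals the threshold is kept here, since only scores strictly below it are dropped.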
This method adjusts the bounding boxes from the tile's local coordinates to the global image coordinates.
- Retrieve Tile Information: The method gets the tile’s position and dimensions using `_get_tile_info`.
- Adjust Coordinates: Bounding boxes are scaled and translated based on the tile's top-left corner and the scale factor used to resize the tile for the neural network.
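The adjusted bounding box coordinates are the tile-local coordinates with the network resize undone via the scale factor and then shifted by the tile's top-left corner. The same transform is applied to both corners of each box.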
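As a rough sketch of this mapping, the function below assumes the tile was resized uniformly by a single factor `scale` (network size = tile size × `scale`) with no padding or letterboxing. The function name and parameters are hypothetical, and the real node may account for additional offsets.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def tile_to_global(box: Box, tile_x: float, tile_y: float, scale: float) -> Box:
    """Map a corner-format box from network-input pixels to global image pixels.

    Assumptions (not taken from the node's code):
      * (tile_x, tile_y) is the tile's top-left corner in the global image
      * the tile was resized uniformly by `scale` before inference, no padding
    """
    x1, y1, x2, y2 = box
    return (
        x1 / scale + tile_x,  # undo the resize, then shift by the tile origin
        y1 / scale + tile_y,
        x2 / scale + tile_x,
        y2 / scale + tile_y,
    )


if __name__ == "__main__":
    # A 640x640 tile at (640, 0) that was downscaled to 320x320 (scale = 0.5).
    print(tile_to_global((100, 50, 200, 150), tile_x=640, tile_y=0, scale=0.5))
```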
This node also synchronizes the incoming tiles. Because frames are sent sequentially (not in parallel), we can rely on a few heuristics when waiting for tiles:
- Tiles are sent to the network sequentially, and the network outputs them sequentially as well.
- We know how many tiles to expect for each frame.
Hence, we only send the bounding boxes once either:
- all the expected tiles have been received, or
- the timestamp of the newly received tile differs from the timestamp of the tiles in the buffer
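The two flush conditions can be illustrated with a small buffer class. This is a minimal sketch under stated assumptions: `TileBuffer` and its methods are hypothetical names, timestamps are any comparable values, and each tile contributes a list of boxes.

```python
from typing import Any, List


class TileBuffer:
    """Collects per-tile boxes and releases them one frame at a time (sketch)."""

    def __init__(self, expected_tiles: int) -> None:
        self.expected_tiles = expected_tiles
        self.timestamp: Any = None
        self.boxes: List[Any] = []
        self.received = 0

    def _flush(self) -> List[Any]:
        """Return the buffered boxes and reset the buffer."""
        out = self.boxes
        self.boxes = []
        self.received = 0
        return out

    def add(self, timestamp: Any, tile_boxes: List[Any]) -> List[List[Any]]:
        """Add one tile's boxes; return a list of frames that are ready to send."""
        ready = []
        # Condition 2: a new timestamp means the buffered frame will get no more tiles.
        if self.received and timestamp != self.timestamp:
            ready.append(self._flush())
        self.timestamp = timestamp
        self.boxes.extend(tile_boxes)
        self.received += 1
        # Condition 1: all expected tiles for this frame have arrived.
        if self.received == self.expected_tiles:
            ready.append(self._flush())
        return ready


if __name__ == "__main__":
    buf = TileBuffer(expected_tiles=2)
    for ts, boxes in [(1, ["a"]), (1, ["b"]), (2, ["c"]), (3, ["d"])]:
        print(ts, buf.add(ts, boxes))
```

In this sketch an incomplete frame is still sent when a tile with a newer timestamp arrives, so a dropped tile delays the output by at most one tile.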
Once all tiles for a frame are processed, the method sends the final set of bounding boxes after performing Non-Maximum Suppression (NMS).
- Merging Bounding Boxes: The bounding boxes from all tiles are merged into one list.
- Non-Maximum Suppression: NMS is applied to remove overlapping boxes with high IoU (above $\text{iou\_thresh}$), retaining only the box with the highest confidence score.
For more information about NMS, read here.
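The merge-then-suppress step can be sketched as a greedy, class-agnostic NMS in plain Python. The helper names and the `(box, conf)` layout are assumptions for illustration; the node itself may use a vectorized or per-class variant.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]
Detection = Tuple[Box, float]  # (corner-format box, confidence); assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections: List[Detection], iou_thresh: float) -> List[Detection]:
    """Greedy NMS: visit boxes by descending confidence and keep a box only if
    its IoU with every already-kept box does not exceed iou_thresh."""
    kept: List[Detection] = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_thresh for kept_box, _ in kept):
            kept.append((box, conf))
    return kept


if __name__ == "__main__":
    # Boxes from two tiles, already mapped to global coordinates.
    tile_a = [((0, 0, 10, 10), 0.9)]
    tile_b = [((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
    merged = tile_a + tile_b  # merging: concatenate the per-tile lists
    print(nms(merged, iou_thresh=0.5))
```

The first two boxes overlap with an IoU of about 0.68, so only the higher-scoring one survives at a threshold of 0.5.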
The final bounding boxes are then sent as output.