Object Detection
Context: As of this year, the URC rules specify two new waypoints during the Autonomy mission that require the rover to detect and navigate toward two objects placed on the ground.
The two objects will have GNSS coordinates provided within their vicinity (<10 m). Autonomous detection of the objects will be required. The first object will be an orange rubber mallet. The second will be a standard 1 L wide-mouthed plastic water bottle of unspecified color/markings (approximately 21.5 cm tall by 9 cm in diameter).
Currently, the perception system only supports detection of ARTags, so we must experiment with and implement a detection system for these objects. One approach is to use a learning-based object detection or instance segmentation model such as YOLO to extract the mallet and water bottle from the ZED camera feed.
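As a starting point, here is a minimal inference sketch using the ultralytics YOLO API. The weights file `mallet_bottle.pt` and the class names are hypothetical stand-ins for a model fine-tuned on our two objects:

```python
# Minimal sketch: run a YOLO model on a single BGR frame (e.g. from the ZED)
# and extract per-detection class, confidence, and pixel center.
import cv2
from ultralytics import YOLO

model = YOLO("mallet_bottle.pt")  # hypothetical weights fine-tuned on our classes

frame = cv2.imread("test_frame.png")  # stand-in for a ZED image
results = model(frame)

for result in results:
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(
            model.names[int(box.cls)],     # e.g. "mallet" or "water_bottle"
            float(box.conf),               # detection confidence
            (x1 + x2) / 2, (y1 + y2) / 2,  # pixel center -> image_x, image_y
        )
```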
**Interface** (Subject to change)
Node: `detect_objects`

Subscribes: `sensor_msgs/Image`

Publishes: `Object.msg`:

    string object_type
    float32 detection_confidence
    float32 image_x
    float32 image_y
Rough steps (a sketch of the resulting node follows this list):
- Run a pre-trained detection/segmentation model (such as YOLO) to detect the objects
- Create a subscriber for the Image topic
- Write a function that takes an Image and passes it into the model to detect the objects
- Create a publisher for the Object topic
- Write a function that publishes the detected Objects from the Image message to the Object topic
- Revisit the model and fine-tune it on our own images if necessary
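A minimal sketch tying these steps together, assuming ROS 1 (rospy), cv_bridge, and the ultralytics YOLO API; the `mrover` package name, the topic names, and the weights file are assumptions based on the interface above:

```python
#!/usr/bin/env python3
"""Minimal detect_objects sketch (ROS 1 / rospy assumed; adapt for ROS 2)."""
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from ultralytics import YOLO

from mrover.msg import Object  # assumes Object.msg is built in our mrover package

bridge = CvBridge()
model = YOLO("mallet_bottle.pt")  # hypothetical weights fine-tuned on our classes


def image_callback(msg: Image) -> None:
    # Convert the ROS Image into an OpenCV BGR array and run the model on it.
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    for result in model(frame):
        for box in result.boxes:
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            obj = Object()
            obj.object_type = model.names[int(box.cls)]  # e.g. "mallet"
            obj.detection_confidence = float(box.conf)
            obj.image_x = (x1 + x2) / 2  # bounding-box center in pixel space
            obj.image_y = (y1 + y2) / 2
            pub.publish(obj)


if __name__ == "__main__":
    rospy.init_node("detect_objects")
    pub = rospy.Publisher("detected_objects", Object, queue_size=1)  # topic name assumed
    rospy.Subscriber("camera/left/image", Image, image_callback, queue_size=1)
    rospy.spin()
```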
We also need a node to put the detected object(s) into the TF tree so the rover can navigate toward them. This involves translating a detection from (image_x, image_y) pixel space to an SE3 pose relative to the rover. One option is to read the depth at (image_x, image_y) and navigate toward that point, much like we do with ARTags. Another is to project the object onto the ground plane, using the ground-plane estimate and the rover's relative location to determine the pose of the object of interest.
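For the depth-based option, a pinhole back-projection recovers the camera-frame point. A minimal sketch, assuming the ZED depth image is registered to the left camera and that fx, fy, cx, cy come from the camera's CameraInfo:

```python
import numpy as np

def pixel_to_camera_point(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) into the camera frame using the pinhole model:
    x = (u - cx) * z / fx, y = (v - cy) * z / fy, with z read from the depth image."""
    z = depth[int(v), int(u)]  # depth image indexed by row (v), column (u)
    if not np.isfinite(z):
        return None  # ZED depth can be NaN/inf on textureless or occluded regions
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
```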
Node: `get_object_pose`

Subscribes:
- the `Object` topic published by `detect_objects`
- either the ZED depth image or the ground-plane estimate

Publishes: SE3 pose to the TF tree
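Finally, a sketch of publishing that camera-frame point into the TF tree, assuming ROS 1 with tf2_ros; the frame names are placeholders:

```python
import rospy
import tf2_ros
from geometry_msgs.msg import TransformStamped

def publish_object_tf(broadcaster, point_cam, object_type):
    """Insert the detected object into the TF tree as a child of the camera frame.
    Orientation is left as identity since only position matters for navigation."""
    t = TransformStamped()
    t.header.stamp = rospy.Time.now()
    t.header.frame_id = "zed_left_camera_frame"  # assumed camera frame name
    t.child_frame_id = object_type               # e.g. "mallet" or "water_bottle"
    t.transform.translation.x = point_cam[0]
    t.transform.translation.y = point_cam[1]
    t.transform.translation.z = point_cam[2]
    t.transform.rotation.w = 1.0  # identity quaternion
    broadcaster.sendTransform(t)

# usage: create broadcaster = tf2_ros.TransformBroadcaster() once in the
# get_object_pose node, then call publish_object_tf for each detection
```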