Is there a way to save a Dataset natively, without polygon conversions? #373

thomasf1 · 2023-09-21T17:00:24Z

Search before asking

I have searched the Supervision issues and found no similar feature requests.

Question

Is there a way to save a Dataset natively, without polygon conversions? It would be great if the dataset could be saved with its masks intact, since all save formats (coco, yolo, voc) do some processing. Ideally there would be a way to save a dataset natively (maybe as a ZIP file).

My use case is experimenting with the polygon conversion that happens when saving as yolo after a run, to get it just right, and writing some custom code for it, so it would be helpful to have the "raw" state with the masks saved somewhere.

Additional

I might just have overlooked some obvious way of saving the dataset... Otherwise it is a feature request I guess.

github-actions · 2023-09-21T17:01:01Z

Hello there, thank you for opening an Issue ! 🙏🏻 The team was notified and they will get back to you asap.

SkalskiP · 2023-09-26T11:04:26Z

Hi, @thomasf1 👋🏻 Sounds interesting. Have you perhaps already thought about the structure of such a dataset? Would each mask be stored as a separate black-and-white photo? How would you store information about the class to which an object with a given mask belongs?

thomasf1 · 2023-09-27T08:14:16Z

Well, there is already a standard/solution for coco that's somehow not being used by roboflow: RLE (_frString)
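For readers unfamiliar with the format, here is a minimal pure-Python sketch of COCO-style uncompressed RLE. It assumes the usual COCO convention: pixels are read in column-major order and the run lengths alternate, starting with a run of zeros. The helper names `rle_encode` and `rle_decode` are illustrative only and are not part of pycocotools or supervision.

```python
def rle_encode(mask):
    """Encode a binary mask (list of rows of 0/1) as COCO-style uncompressed RLE."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    counts = []
    previous = 0  # by convention the first run always counts zeros
    run = 0
    for x in range(width):  # column-major: walk down each column in turn
        for y in range(height):
            value = 1 if mask[y][x] else 0
            if value != previous:
                counts.append(run)
                run = 0
                previous = value
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def rle_decode(rle):
    """Rebuild the binary mask from the run lengths."""
    height, width = rle["size"]
    flat = []
    value = 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    # flat is column-major, so pixel (y, x) sits at index x * height + y
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]


# A ring-shaped mask: the hole in the middle survives the round trip.
donut = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
encoded = rle_encode(donut)
restored = rle_decode(encoded)
```

Because the encoding is lossless, holes and disconnected regions need no special handling, which is the property the polygon formats lack.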

Also, not supporting masks causes other issues: When converting to Polygons, cutouts in masks seem often to be converted as a separate polygon.

Example: A person with his arms on his hips. The triangular shape inside the arms isn't part of the person, but is surrounded by it. The way supervision translates that into polygons is as follows:

The outer shape of the person as a person polygon (reasonable, expected behaviour)
The inner cutout of background as another person polygon (blatantly wrong and counterproductive)

(not sure if that qualifies as a BUG or is somehow intended behaviour?)

thomasf1 · 2023-09-28T12:43:00Z

@SkalskiP There is some code here that should get you far: https://github.com/waspinator/pycococreator/blob/master/pycococreatortools/pycococreatortools.py

Also, in binary_mask_to_polygon, they seem to add some padding to avoid a problem that supervision (and the roboflow app smart polygon) has with polygons that should reach the image edges: most polygons leave a gap of 1 px or more from the edge of the image when the masked subject extends beyond it.

thomasf1 · 2023-09-28T17:12:16Z

In an ideal world, both roboflow.com and supervision would use both masks and polygons and decide intelligently which one to use.

For instance segmentation:
Masks for:

Objects with cutouts
Objects that are overlapped by something else, cutting the mask into separate areas that are not connected (which in polygons results in several polygons, losing the information that they are part of one object)
Maybe very small Objects

Polygons for everything else (maybe configurable)
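The rule above could be sketched roughly as follows. This is only an illustration under stated assumptions: the mask is a single-instance binary grid, foreground parts are counted with 8-connectivity, and a "cutout" is taken to be a 4-connected background region that never touches the image border. The names `count_regions` and `choose_encoding` are hypothetical, and the "very small objects" case is left out.

```python
from collections import deque


def count_regions(grid, target, connectivity):
    """Return (total regions, regions not touching the border) for cells equal to target."""
    height = len(grid)
    width = len(grid[0]) if height else 0
    if connectivity == 8:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    else:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    seen = [[False] * width for _ in range(height)]
    total = 0
    enclosed = 0
    for start_y in range(height):
        for start_x in range(width):
            if grid[start_y][start_x] != target or seen[start_y][start_x]:
                continue
            total += 1
            touches_border = False
            queue = deque([(start_y, start_x)])
            seen[start_y][start_x] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                if y in (0, height - 1) or x in (0, width - 1):
                    touches_border = True
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and grid[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_border:
                enclosed += 1
    return total, enclosed


def choose_encoding(mask):
    """Pick "rle" for masks with cutouts or several parts, "polygon" otherwise."""
    parts, _ = count_regions(mask, 1, 8)
    _, holes = count_regions(mask, 0, 4)
    return "rle" if parts > 1 or holes > 0 else "polygon"


donut = [[0, 0, 0, 0, 0], [0, 1, 1, 1, 0], [0, 1, 0, 1, 0], [0, 1, 1, 1, 0], [0, 0, 0, 0, 0]]
solid = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
split = [[1, 0, 1]]
```

With these inputs, `choose_encoding(donut)` and `choose_encoding(split)` select RLE, while `choose_encoding(solid)` keeps the polygon.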

thomasf1 · 2023-10-04T21:02:49Z

@SkalskiP What do you think?

SkalskiP · 2023-10-05T10:54:44Z

@thomasf1 agree! Expanding COCO annotations format support is the easiest way to unlock that capability.

thomasf1 · 2023-10-19T17:41:51Z

Would you need any help with it?

MihaiDavid05 · 2023-11-04T10:15:36Z

Well, there is already a standard/solution for coco that's somehow not being used by roboflow: RLE (_frString)

Also, not supporting masks causes other issues: When converting to Polygons, cutouts in masks seem often to be converted as a separate polygon.

Example: A person with his arms on his hips. The triangular shape inside the arms isn't part of the person, but is surrounded by it. The way supervision translates that into polygons is as follows:

The outer shape of the person as a person polygon (reasonable, expected behaviour)

The inner cutout of background as another person polygon (blatantly wrong and counterproductive)

(not sure if that qualifies as a BUG or is somehow intended behaviour?)

In an ideal world, both roboflow.com and supervision would use both masks and polygons and decide intelligently which one to use.

For instance segmentation: Masks for:

Objects with cutouts

Objects that are overlapped by something else, cutting the mask into separate areas that are not connected (which in polygons results in several polygons, losing the information that they are part of one object)

Maybe very small Objects

Polygons for everything else (maybe configurable)

@thomasf1 I opened an issue specifically for this hole preservation matter. I also built an exporter that deals with almost everything that you mentioned above. You can find it here.

thomasf1 · 2023-11-04T11:09:01Z

@MihaiDavid05 Great :)

One thing I could not quite work out from the ReadMe:

From where to what does it export? I assume from masks in image/PNG format to COCO masks (RLE annotations), right?

So, the way it works is:
Mask image (one per class? Does it support multiple classes? Or instance segmentation?) -> COCO masks (RLE) if the object has holes or multiple regions, otherwise polygons

MihaiDavid05 · 2023-11-04T15:56:41Z

@thomasf1 Yes, that's true, from Image/PNG format to Coco Masks (rle or polygons) annotations. It supports instance segmentation, therefore multiple classes and multiple instances of the same class in an image. I will update the readme!

thomasf1 · 2023-11-07T09:25:34Z

@MihaiDavid05 Out of curiosity, what tool did you use to generate the masks? We're currently using hasty.ai, which only allows PNG masks for "Semantic Segmentation (png)". I guess using the image mask format for instance segmentation would be quite difficult...

MihaiDavid05 · 2023-11-07T09:38:36Z

@thomasf1 Hi, I did not understand your question. Which generated masks are you talking about?

thomasf1 · 2023-11-07T17:36:19Z

@MihaiDavid05 Sorry, I meant which tool do you use to annotate your data and in turn export that as mask images...

MihaiDavid05 · 2023-11-07T17:57:39Z

@thomasf1, oh, I see. Currently, I'm only using already annotated base datasets, so I'm not using any tool to annotate raw images :) I might look into that!

ryouchinsa · 2023-11-21T17:54:08Z

We updated a script to convert the RLE mask with holes to the YOLO segmentation format.
#574 (comment)

SkalskiP · 2023-11-22T10:31:09Z

Hi, @ryouchinsa! 👋🏻 Does the YOLO format support masks with holes?

ryouchinsa · 2023-11-22T12:56:52Z

Hi, @SkalskiP,

Using the script general_json2yolo.py, you can convert the RLE mask with holes to the YOLO segmentation format.

The RLE mask is converted to a parent polygon and a child polygon using cv2.findContours().
The parent polygon points are sorted in clockwise order.
The child polygon points are sorted in counterclockwise order.
The closest pair of points, one in the parent polygon and one in the child polygon, is found.
Those 2 points are connected with 2 narrow lines,
so that the polygon with a hole can be saved in the YOLO segmentation format.

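import cv2
import numpy as np
from pycocotools import mask

def is_clockwise(contour):
    # Sum the signed edge terms around the closed contour; the sign gives the orientation.
    value = 0
    num = len(contour)
    for i, point in enumerate(contour):
        p1 = contour[i]
        if i < num - 1:
            p2 = contour[i + 1]
        else:
            p2 = contour[0]
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0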

def get_merge_point_idx(contour1, contour2):
    idx1 = 0
    idx2 = 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = pow(p2[0][0] - p1[0][0], 2) + pow(p2[0][1] - p1[0][1], 2)
            if distance_min < 0:
                distance_min = distance
                idx1 = i
                idx2 = j
            elif distance < distance_min:
                distance_min = distance
                idx1 = i
                idx2 = j
    return idx1, idx2

def merge_contours(contour1, contour2, idx1, idx2):
    contour = []
    for i in list(range(0, idx1 + 1)):
        contour.append(contour1[i])
    for i in list(range(idx2, len(contour2))):
        contour.append(contour2[i])
    for i in list(range(0, idx2 + 1)):
        contour.append(contour2[i])
    for i in list(range(idx1, len(contour1))):
        contour.append(contour1[i])
    contour = np.array(contour)
    return contour

def merge_with_parent(contour_parent, contour):
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)

def mask2polygon(image):
    # RETR_CCOMP returns a two-level hierarchy: outer boundaries and the holes inside them.
    contours, hierarchies = cv2.findContours(image, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_KCOS)
    contours_approx = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contour_approx = cv2.approxPolyDP(contour, epsilon, True)
        contours_approx.append(contour_approx)

    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    contours_parent_tmp = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        contours_parent_tmp.append(contour)

    polygons = []
    for contour in contours_parent_tmp:
        polygon = contour.flatten().tolist()
        polygons.append(polygon)
    return polygons 

def rle2polygon(segmentation):
    if isinstance(segmentation["counts"], list):
        segmentation = mask.frPyObjects(segmentation, *segmentation["size"])
    m = mask.decode(segmentation) 
    m[m > 0] = 255
    polygons = mask2polygon(m)
    return polygons

The RLE mask.

The converted YOLO segmentation format.

To run the script, put the COCO JSON file coco_train.json into datasets/coco/annotations.
Run the script: python general_json2yolo.py
The converted YOLO txt files are saved in new_dir/labels/coco_train.

Edit use_segments and use_keypoints in the script.

if __name__ == '__main__':
    source = 'COCO'

    if source == 'COCO':
        convert_coco_json('../datasets/coco/annotations',  # directory with *.json
                          use_segments=True,
                          use_keypoints=False,
                          cls91to80=False)

To convert the COCO bbox format to YOLO bbox format:

use_segments=False,
use_keypoints=False,

To convert the COCO segmentation format to YOLO segmentation format:

use_segments=True,
use_keypoints=False,

To convert the COCO keypoints format to YOLO keypoints format:

use_segments=False,
use_keypoints=True,
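As a rough illustration of what the bbox and segmentation conversions produce, the sketch below assumes the commonly documented conventions: COCO boxes are absolute [x, y, width, height] measured from the top-left corner, YOLO boxes are normalized center x, center y, width, height, and a YOLO segmentation label is the class index followed by normalized x y pairs. The function names are hypothetical and are not taken from general_json2yolo.py; the class index is passed in by the caller.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Convert an absolute COCO [x, y, w, h] box to normalized YOLO [cx, cy, w, h]."""
    x, y, w, h = bbox
    return [
        (x + w / 2) / image_width,
        (y + h / 2) / image_height,
        w / image_width,
        h / image_height,
    ]


def polygon_to_yolo_line(class_id, polygon, image_width, image_height):
    """Format one flat COCO polygon [x1, y1, x2, y2, ...] as a YOLO segmentation label line."""
    coords = []
    for i in range(0, len(polygon), 2):
        coords.append(polygon[i] / image_width)
        coords.append(polygon[i + 1] / image_height)
    return " ".join([str(class_id)] + [f"{c:.6f}" for c in coords])


box = coco_bbox_to_yolo([10, 20, 20, 40], image_width=100, image_height=200)
line = polygon_to_yolo_line(0, [0, 0, 50, 0, 50, 100], image_width=100, image_height=200)
```

Since a label line holds a single point sequence, a mask with a hole has to be flattened into one closed outline, which is what the parent and child merge described earlier does.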

This script originates from Ultralytics JSON2YOLO repository.
We hope this script helps your business.

ryouchinsa · 2023-11-24T04:26:46Z

Thanks for reviewing our script.
Using a small dataset, we checked whether YOLO can train on polygon masks with holes.

Donut images and YOLO segmentation text files to confirm that YOLO can train polygon masks with holes.

thomasf1 · 2023-11-26T10:47:58Z

Thanks @ryouchinsa. Having a look currently and it seems to work great :)

SkalskiP · 2023-11-27T10:52:59Z

Hi, @ryouchinsa 👋🏻 Thanks a lot for that! 🙏🏻 Making that doable with Supervision is definitely on our roadmap. We simply do not have enough capacity to take care of it now.

If any of you would like to help us out and contribute, I would be really grateful.

thomasf1 · 2023-11-29T04:47:32Z

@SkalskiP @ryouchinsa I've tested the code and incorporated it into supervision. Working on a PR.

thomasf1 · 2023-11-29T05:05:11Z

Added the PR (excuse the sample image): #630

ryouchinsa · 2023-11-29T08:20:54Z

Hi @SkalskiP, I am sorry for the late reply. I was working on the PR for ultralytics/JSON2YOLO.
ultralytics/JSON2YOLO#61
I have now started working on your supervision code to implement the COCO RLE to YOLO feature.

Hi @thomasf1, thanks for implementing my code in supervision. I will read it and check it with my dataset.

ryouchinsa · 2023-11-29T12:13:58Z

I am trying to implement the features that are available in JSON2YOLO.
The script has merge_multi_segment(), which merges multiple polygons into one.

Does supervision support multiple polygons per annotation in the COCO format?
It looks like it does not: there is an error in the function coco_annotations_to_detections() in the script supervision/dataset/formats/coco.py.

polygons = [
    np.reshape(
        np.asarray(image_annotation["segmentation"], dtype=np.int32), (-1, 2)
    )
    for image_annotation in image_annotations
]

COCO file with multiple polygons.

"annotations": [
    {
        "area": 594425,
        "bbox": [328, 834, 780, 2250],
        "category_id": 1,
        "id": 1,
        "image_id": 1,
        "iscrowd": 0,
        "segmentation": [
            [495, 987, 497, 984, 501, 983, 500, 978, 498, 962, 503, 937, 503, 926, 532, 877, 569, 849, 620, 834, 701, 838, 767, 860, 790, 931, 803, 963, 802, 972, 846, 970, 896, 969, 896, 977, 875, 982, 847, 984, 793, 987, 791, 1001, 783, 1009, 785, 1022, 791, 1024, 787, 1027, 795, 1041, 804, 1059, 811, 1072, 810, 1081, 800, 1089, 788, 1092, 783, 1098, 784, 1115, 780, 1120, 774, 1123, 778, 1126, 778, 1136, 775, 1140, 767, 1140, 763, 1146, 767, 1164, 754, 1181, 759, 1212, 751, 1264, 815, 1283, 839, 1303, 865, 1362, 880, 1442, 902, 1525, 930, 1602, 953, 1640, 996, 1699, 1021, 1773, 1039, 1863, 1060, 1920, 1073, 1963, 1089, 1982, 1102, 2013, 1107, 2037, 1107, 2043, 1099, 2046, 1097, 2094, 1089, 2123, 1074, 2137, 1066, 2153, 1033, 2172, 1024, 2166, 1024, 2166, 1023, 2129, 1019, 2093, 1004, 2057, 996, 2016, 1000, 1979, 903, 1814, 860, 1727, 820, 1647, 772, 1547, 695, 1637, 625, 1736, 556, 1854, 495, 1986, 459, 2110, 446, 1998, 449, 1913, 401, 1819, 362, 1720, 342, 1575, 328, 1440, 335, 1382, 348, 1330, 366, 1294, 422, 1248, 437, 1222, 450, 1190, 466, 1147, 482, 1107, 495, 1076, 506, 1019, 497, 1016],
            [878, 2293, 868, 2335, 855, 2372, 843, 2413, 838, 2445, 820, 2497, 806, 2556, 805, 2589, 809, 2622, 810, 2663, 807, 2704, 793, 2785, 772, 2866, 742, 2956, 725, 3000, 724, 3013, 740, 3024, 757, 3029, 778, 3033, 795, 3033, 812, 3032, 812, 3046, 803, 3052, 791, 3063, 771, 3069, 745, 3070, 733, 3074, 719, 3077, 702, 3075, 680, 3083, 664, 3082, 631, 3072, 601, 3061, 558, 3058, 553, 3039, 558, 3023, 566, 3001, 568, 2983, 566, 2960, 572, 2912, 571, 2859, 567, 2781, 572, 2698, 576, 2643, 583, 2613, 604, 2568, 628, 2527, 637, 2500, 636, 2468, 629, 2445, 621, 2423, 673, 2409, 726, 2388, 807, 2344, 878, 2293]
        ]
    }],

The script which converts COCO to YOLO using supervision.

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path= r"/Users/ryo/rcam/test_annotations/test/_test_min_polygon",
    annotations_path=r"/Users/ryo/rcam/test_annotations/test/_test_min_polygon/coco_train.json",
    force_masks=True
).as_yolo(
    images_directory_path=r"/Users/ryo/rcam/test_annotations/test/_test_min_polygon/move1",
    annotations_directory_path=r"/Users/ryo/rcam/test_annotations/test/_test_min_polygon/move2",
    data_yaml_path=r"/Users/ryo/rcam/test_annotations/test/_test_min_polygon/data.yaml"
)
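A possible explanation for the error, offered as an assumption rather than a confirmed diagnosis: in COCO a "segmentation" value is a list of flat polygons, so with two polygons of different lengths the nested list is ragged and cannot be turned into one rectangular int32 array, and with equal lengths a single reshape to (-1, 2) would fuse them into one polygon. The pure-Python sketch below handles each polygon separately; `split_coco_polygons` is a hypothetical helper, not part of supervision.

```python
def split_coco_polygons(segmentation):
    """Turn a COCO segmentation (list of flat [x1, y1, x2, y2, ...] lists)
    into one list of (x, y) points per polygon."""
    polygons = []
    for flat in segmentation:
        if len(flat) % 2 != 0:
            raise ValueError("polygon must contain an even number of coordinates")
        points = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
        if len(points) >= 3:  # fewer than 3 points cannot enclose an area
            polygons.append(points)
    return polygons


# Two disjoint parts of one object, with different point counts.
segmentation = [
    [0, 0, 4, 0, 4, 4],
    [10, 10, 12, 10, 12, 12, 10, 12],
]
parts = split_coco_polygons(segmentation)
```

Each entry in `parts` can then be rasterized on its own and combined into a single mask for the object.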

SkalskiP · 2024-04-12T15:40:39Z

We just opened a new issue #1114 where we proposed adding RLE support to Supervision datasets. To keep it clean and prevent duplications, I'm closing this issue. Also, if any of you would be willing to help us out with implementation, let us know! 🙏🏻

SkalskiP · 2024-04-12T15:48:41Z

@ryouchinsa we do not support disjointed masks for now, but we are actually working on the PR that may add this #1086

thomasf1 added the question Further information is requested label Sep 21, 2023

SkalskiP added enhancement New feature or request api:datasets Dataset API and removed question Further information is requested labels Sep 22, 2023

SkalskiP self-assigned this Sep 22, 2023

SkalskiP added this to Supervision Board Sep 22, 2023

SkalskiP mentioned this issue Nov 15, 2023

From binary masks with holes to COCO JSON format #574

Closed

2 tasks

LinasKo mentioned this issue Apr 12, 2024

[DetectionDataset] - extend from_coco and as_coco with support for masks in RLE format #1114

Closed

SkalskiP closed this as completed Apr 12, 2024

github-project-automation bot moved this to Current Release: Done in Supervision Board Apr 12, 2024

SkalskiP mentioned this issue Apr 12, 2024

Fix detections_to_coco_annotations function for empty polygons. #1086

Draft

2 tasks
