Commit
add bbox check to check dataset script (#54)
* add bbox check to check dataset script

* fix doc
pj-ms authored Apr 11, 2023
1 parent 0195e0c commit 6a99d8c
Showing 5 changed files with 77 additions and 37 deletions.
40 changes: 25 additions & 15 deletions COCO_DATA_FORMAT.md
@@ -6,7 +6,7 @@ In coco, we use `file_name` and `zip_file` to construct the file_path in `ImageD

Here is one example of the train.json, val.json, or test.json in the `DatasetInfo` above. Note that the `"id"` for `images`, `annotations` and `categories` should be consecutive integers, **starting from 1**. Note that our lib might work with ids starting from 0, but many tools like [CVAT](https://github.com/openvinotoolkit/cvat/issues/2085) and the official [COCOAPI](https://github.com/cocodataset/cocoapi/issues/507) will fail.
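A minimal sketch of such an id check (a hypothetical helper, assuming the annotation file is a local JSON file; not part of this library) is shown here, before the example below:

```python
import json

def check_consecutive_ids(coco_json_path):
    # Hypothetical helper: verify that ids in each section run 1..N with no gaps.
    with open(coco_json_path, encoding='utf-8') as f:
        coco = json.load(f)
    for section in ('images', 'annotations', 'categories'):
        ids = sorted(item['id'] for item in coco.get(section, []))
        if ids != list(range(1, len(ids) + 1)):
            raise ValueError(f'"{section}" ids are not consecutive integers starting from 1')
```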

``` {json}
```json
{
"images": [{"id": 1, "width": 224.0, "height": 224.0, "file_name": "train_images/siberian-kitten.jpg", "zip_file": "train_images.zip"},
{"id": 2, "width": 224.0, "height": 224.0, "file_name": "train_images/kitten 3.jpg", "zip_file": "train_images.zip"}],
@@ -22,7 +22,7 @@ Here is one example of the train.json, val.json, or test.json in the `DatasetInf

## Object detection

``` {json}
```json
{
"images": [{"id": 1, "width": 224.0, "height": 224.0, "file_name": "train_images/siberian-kitten.jpg", "zip_file": "train_images.zip"},
{"id": 2, "width": 224.0, "height": 224.0, "file_name": "train_images/kitten 3.jpg", "zip_file": "train_images.zip"}],
@@ -46,7 +46,7 @@ Note that

Here is one example of the json file for the image caption task.

``` {json}
```json
{
"images": [{"id": 1, "file_name": "train_images/honda.jpg", "zip_file": "train_images.zip"},
{"id": 2, "file_name": "train_images/kitchen.jpg", "zip_file": "train_images.zip"}],
@@ -62,7 +62,7 @@ Here is one example of the json file for image caption task.

Here is one example of the json file for the image text matching task. `match: 1` indicates that the image and text match.

``` {json}
```json
{
"images": [{"id": 1, "file_name": "train_images/honda.jpg", "zip_file": "train_images.zip"},
{"id": 2, "file_name": "train_images/kitchen.jpg", "zip_file": "train_images.zip"}],
@@ -84,7 +84,7 @@ Here is one example of the json file for image matting task. The "label" in the

Specifically, **only** image files are supported for the label files. The ground truth image should be a one-channel image (i.e. `PIL.Image` mode "L", instead of "RGB") with the same width and height as the image file. Refer to the images in [tests/image_matting_test_data.zip](tests/image_matting_test_data.zip) as an example.
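A quick way to verify this requirement is sketched below (a hypothetical check, assuming Pillow is installed and both files are readable locally; not part of this library):

```python
from PIL import Image

def check_matting_label(image_path, label_path):
    # Hypothetical helper: the label must be single-channel ("L") and match the image size.
    with Image.open(image_path) as img, Image.open(label_path) as label:
        if label.mode != 'L':
            raise ValueError(f'label {label_path} has mode {label.mode}, expected "L"')
        if label.size != img.size:
            raise ValueError(f'label size {label.size} does not match image size {img.size}')
```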

``` {json}
```json
{
"images": [{"id": 1, "file_name": "train_images/image/test_1.jpg", "zip_file": "train_images.zip"},
{"id": 2, "file_name": "train_images/image/test_2.jpg", "zip_file": "train_images.zip"}],
@@ -99,7 +99,7 @@ Specifically, **only** image files are supported for the label files. The ground

Here is one example of the json file for the image regression task, where the "target" in the "annotations" field is a real-valued number (e.g. a score, an age, etc.). Note that each image should only have one regression target (i.e. there should be exactly one annotation for each image).

``` {json}
```json
{
"images": [{"id": 1, "width": 224.0, "height": 224.0, "file_name": "train_images/image_1.jpg", "zip_file": "train_images.zip"},
{"id": 2, "width": 224.0, "height": 224.0, "file_name": "train_images/image_2.jpg", "zip_file": "train_images.zip"}],
@@ -111,28 +111,38 @@ Here is one example of the json file for the image regression task, where the "t
```

## Image retrieval
This is an example of a JSON file for the image retrieval task. This format is similar to the image_caption and image_text_matching dataset formats, as it contains text associated with images. However, there are some important differences:

1. The file may contain an optional "categories" section, which defines both a category name and an optional super category. This allows for a hierarchical structure in the data, which cannot be achieved with only the query field.
2. Unlike the other two dataset formats, the annotations in this file contain a "query" field rather than a "text" or "caption" field. Each image is associated with a query as well as a category_id.
This task represents data of images retrieved by text queries.

The category_id can provide additional information related to the nature of the image. For example, an image can belong to a group of images (called a supercategory, e.g. "race") and within that group fall into a subgroup (e.g. "white" or "black"). Overall, this format allows for more complex and nuanced associations between images and text than other formats, due to the hierarchical structure provided by the category section.
```json
{
"images": [
{"id": 1, "zip_file": "test1.zip", "file_name": "test/0/image_1.jpg"},
{"id": 2, "zip_file": "test2.zip", "file_name": "test/1/image_2.jpg"}
],
"annotations": [
{"image_id": 1, "id": 1, "query": "Men eating a banana."},
{"image_id": 2, "id": 2, "query": "An apple on the desk."}
]
}
```

The retrieved images might come with additional classification data in the annotation field, mixed with the query annotations. This might change in the future, as the same result can be achieved with the multitask dataset concept: one task solely for image retrieval and another solely for classification.


``` {json}
```json
{
"images": [
{"id": 1, "zip_file": "test1.zip", "file_name": "test/0/image_1.jpg"},
{"id": 2, "zip_file": "test2.zip", "file_name": "test/1/image_2.jpg"}
],
"categories": [
{"id": 1, "name": "white", "supercategory": "race"},
{"id": 2, "name": "black", "supercategory": "race"}
{"id": 1, "name": "banana", "supercategory": "fruit"},
{"id": 2, "name": "apple", "supercategory": "fruit"}
],
"annotations": [
{"image_id": 1, "id": 1, "category_id": 1, "query": "european men giving a speech"},
{"image_id": 2, "id": 2, "category_id": 2, "query": "african-american men giving a speech"}
{"image_id": 1, "id": 1, "category_id": 1, "query": "Men eating a banana."},
{"image_id": 2, "id": 2, "category_id": 2, "query": "An apple on the desk."}
]
}
```
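To illustrate how the hierarchical categories relate to the queries, here is a minimal sketch (a hypothetical snippet, not part of the library API; the file name `train.json` is assumed) that resolves each annotation's `category_id` against the example above:

```python
import json

# Hypothetical walkthrough of the retrieval example above.
with open('train.json', encoding='utf-8') as f:  # assumed file name
    coco = json.load(f)

categories_by_id = {c['id']: c for c in coco.get('categories', [])}
for ann in coco['annotations']:
    cate = categories_by_id.get(ann.get('category_id'))
    print(ann['image_id'], ann['query'],
          cate['name'] if cate else None,
          cate.get('supercategory') if cate else None)
    # e.g. 1 Men eating a banana. banana fruit
```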
2 changes: 1 addition & 1 deletion README.md
@@ -16,7 +16,7 @@ Currently, seven `basic` types of data are supported:
- `image_text_matching`: each image is associated with a collection of texts describing the image, and whether each text description matches the image or not.
- `image_matting`: each image has a pixel-wise annotation, where each pixel is labeled as 'foreground' or 'background'.
- `image_regression`: each image is labeled with a real-valued numeric regression target.
- `image_retrieval`: each image is labeled with a number of text queries describing the image. optionally an image is associated with one label.
- `image_retrieval`: each image is labeled with a number of text queries describing the image. Optionally, an image is associated with one label.

`multitask` type is a composition type, where one set of images has multiple sets of annotations available for different tasks, where each task can be of any basic type.

2 changes: 1 addition & 1 deletion setup.py
@@ -1,7 +1,7 @@
import setuptools
from os import path

VERSION = '0.2.26'
VERSION = '0.2.27'

# Get the long description from the README file
here = path.abspath(path.dirname(__file__))
46 changes: 37 additions & 9 deletions vision_datasets/commands/check_dataset.py
@@ -35,7 +35,7 @@ def quick_check_images(dataset: ManifestDataset):
show_img(dataset[idx])


def check_images(dataset: ManifestDataset, err_msg_file: pathlib.Path):
def check_images(dataset: ManifestDataset):
show_dataset_stats(dataset)
file_not_found_list = []
for i in tqdm(range(len(dataset)), 'Checking image access..'):
@@ -45,18 +45,42 @@ def check_images(dataset: ManifestDataset, err_msg_file: pathlib.Path):
file_not_found_list.append(str(e))

if file_not_found_list:
logger.info(f'Errors => {err_msg_file.as_posix()}')
err_msg_file.write_text('\n'.join(file_not_found_list), encoding='utf-8')
return ['Files not accessible: ' + (', '.join(file_not_found_list))]

return []


def _is_integer(bbox):
return all([isinstance(x, int) or (isinstance(x, float) and x.is_integer()) for x in bbox])


def check_box(bbox, img_w, img_h):
if len(bbox) != 4 or not _is_integer(bbox):
return False

l, t, r, b = bbox
return l >= 0 and t >= 0 and l < r and t < b and r <= img_w and b <= img_h
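# Illustrative only, not part of this commit: check_box expects an absolute
# [left, top, right, bottom] box in pixels, so for a 224x224 image:
#   check_box([0, 0, 224, 224], 224, 224)   # True  - full-image box
#   check_box([10, 20, 5, 30], 224, 224)    # False - left is not < right
#   check_box([0.5, 0, 10, 10], 224, 224)   # False - non-integer coordinate
#   check_box([0, 0, 10, 300], 224, 224)    # False - bottom exceeds image height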


def classification_detection_check(dataset: ManifestDataset):
errors = []
n_imgs_by_class = {x: 0 for x in range(len(dataset.labels))}
for sample in dataset.dataset_manifest.images:
for sample_idx, sample in enumerate(dataset.dataset_manifest.images):
labels = sample.labels
c_ids = set([label[0] if dataset.dataset_info.type == DatasetTypes.OD else label for label in labels])
for c_id in c_ids:
n_imgs_by_class[c_id] += 1

if dataset.dataset_info.type == DatasetTypes.OD:
w, h = sample.width, sample.height
if not w or not h or w < 0 or h < 0:
errors.append(f'Image {sample_idx} has invalid width or height: {w}, {h}')
continue

for box_id, box in enumerate(labels):
if not check_box(box[1:], w, h):
errors.append(f'Image {sample_idx}, box {box_id} is invalid: {box}\n')

c_id_with_max_images = max(n_imgs_by_class, key=n_imgs_by_class.get)
c_id_with_min_images = min(n_imgs_by_class, key=n_imgs_by_class.get)
mean_images = sum(n_imgs_by_class.values()) / len(n_imgs_by_class)
@@ -79,6 +103,8 @@ def classification_detection_check(dataset: ManifestDataset):
plt.show()
logger.info(str(stats))

return errors


def main():
parser = argparse.ArgumentParser('Check if a dataset is valid for pkg to consume.')
@@ -103,16 +129,18 @@ def main():
logger.info(f'{prefix} Check dataset with usage: {usage}.')

# if args.local_dir is none, then this check will directly try to access data from azure blob. Images must be present in uncompressed folder on azure blob.
dataset = dataset_hub.create_manifest_dataset(container_sas=args.blob_container, local_dir=args.local_dir, name=dataset_info.name, version=args.version, usage=usage)
dataset = dataset_hub.create_manifest_dataset(container_sas=args.blob_container, local_dir=args.local_dir, name=dataset_info.name, version=args.version, usage=usage, coordinates='absolute')
if dataset:
err_msg_file = pathlib.Path(f'{args.name}_{usage}_errors.txt')
errors = []
if args.data_type in [DatasetTypes.IC_MULTICLASS, DatasetTypes.IC_MULTILABEL, DatasetTypes.OD]:
errors.extend(classification_detection_check(dataset))

if args.quick_check:
quick_check_images(dataset)
else:
check_images(dataset, err_msg_file)

if args.data_type in [DatasetTypes.IC_MULTICLASS, DatasetTypes.IC_MULTILABEL, DatasetTypes.OD]:
classification_detection_check(dataset)
errors.extend(check_images(dataset))
err_msg_file.write_text('\n'.join(errors), encoding='utf-8')
else:
logger.info(f'{prefix} No split for {usage} available.')

24 changes: 13 additions & 11 deletions vision_datasets/common/data_manifest.py
@@ -97,7 +97,7 @@ def __init__(self, id, img_path, width, height, labels, label_file_paths=None, l
height (int): image height
labels (list or dict):
classification: [c_id] for multiclass, [c_id1, c_id2, ...] for multilabel;
detection: [[c_id, left, top, right, bottom], ...];
detection: [[c_id, left, top, right, bottom], ...] (absolute coordinates);
image_caption: [caption1, caption2, ...];
image_text_matching: [(text1, match (0 or 1), text2, match (0 or 1), ...)];
multitask: dict[task, labels];
@@ -121,7 +121,7 @@ def __init__(self, id, img_path, width, height, labels, label_file_paths=None, l
def labels(self):
if self._labels:
return self._labels
elif self.label_file_paths:
elif self.label_file_paths: # lazy load only for image matting
file_reader = FileReader()
self._labels = []
for label_file_path in self.label_file_paths:
@@ -150,7 +150,7 @@ def __init__(self, images: List[ImageDataManifest], labelmap, data_type):
data_type (str or dict) : data type, or data type by task name
"""
assert data_type != DatasetTypes.MULTITASK, 'For multitask, data_type should be a dict mapping task name to concrete data type.'
assert data_type and data_type != DatasetTypes.MULTITASK, 'For multitask, data_type should be a dict mapping task name to concrete data type.'

if isinstance(labelmap, dict):
assert isinstance(data_type, dict), 'labelmap being a dict indicating this is a multitask dataset, however the data_type is not a dict.'
@@ -167,6 +167,7 @@ def create_dataset_manifest(dataset_info, usage: str, container_sas_or_root_dir:

if dataset_info.data_format == Formats.IRIS:
return IrisManifestAdaptor.create_dataset_manifest(dataset_info, usage, container_sas_or_root_dir)

if dataset_info.data_format == Formats.COCO:
container_sas_or_root_dir = _construct_full_url_or_path_generator(container_sas_or_root_dir, dataset_info.root_folder)('')
if dataset_info.type == DatasetTypes.MULTITASK:
@@ -828,15 +829,16 @@ def create_dataset_manifest(coco_file_path_or_url: Union[str, dict, pathlib.Path

file_reader.close()

def get_file_path(info_dict: dict, file_name):
def append_zip_prefix_if_needed(info_dict: dict, file_name):
zip_prefix = info_dict.get('zip_file', '')
if zip_prefix:
zip_prefix += '@'

return get_full_sas_or_path(zip_prefix + file_name)

images_by_id = {img['id']: ImageDataManifest(img['id'], get_file_path(img, img['file_name']), img.get('width'), img.get('height'), [], {}) for img in coco_manifest['images']}
images_by_id = {img['id']: ImageDataManifest(img['id'], append_zip_prefix_if_needed(img, img['file_name']), img.get('width'), img.get('height'), [], {}) for img in coco_manifest['images']}
process_labels_without_categories = None

if data_type == DatasetTypes.IMCAP:
def process_labels_without_categories(image):
image.labels.append(annotation['caption'])
@@ -846,7 +848,7 @@ def process_labels_without_categories(image):
elif data_type == DatasetTypes.IMAGE_MATTING:
def process_labels_without_categories(image):
image.label_file_paths = image.label_file_paths or []
image.label_file_paths.append(get_file_path(annotation, annotation['label']))
image.label_file_paths.append(append_zip_prefix_if_needed(annotation, annotation['label']))
elif data_type == DatasetTypes.IMAGE_REGRESSION:
def process_labels_without_categories(image):
assert len(image.labels) == 0, f"There should be exactly one label per image for image_regression datasets, but image with id {annotation['image_id']} has more than one"
@@ -861,12 +863,12 @@ def process_labels_without_categories(image):
images = [x for x in images_by_id.values()]
return DatasetManifest(images, None, data_type)

supercategory_field_in_categories = False
if len(coco_manifest['categories']) > 0 and 'supercategory' in coco_manifest['categories'][0]:
supercategory_field_in_categories = True
supercategory_field_in_categories = len(coco_manifest['categories']) > 0 and 'supercategory' in coco_manifest['categories'][0]
if supercategory_field_in_categories:
cate_id_name = [(cate['id'], cate['name'], cate['supercategory']) for cate in coco_manifest['categories']]
else:
cate_id_name = [(cate['id'], cate['name']) for cate in coco_manifest['categories']]

cate_id_name.sort(key=lambda x: x[0])
label_id_to_pos = {x[0]: i for i, x in enumerate(cate_id_name)}
if supercategory_field_in_categories:
@@ -882,8 +884,8 @@ def process_labels_without_categories(image):
img = images_by_id[annotation['image_id']]
if 'bbox' in annotation:
bbox = annotation['bbox']
if bbox_format == BBoxFormat.LTWH:
bbox = [bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]]
bbox = bbox if bbox_format == BBoxFormat.LTRB else [bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]]
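# Illustrative only, not part of this commit: an LTWH box [10, 20, 30, 40]
# (x, y, width, height) becomes the LTRB box [10, 20, 40, 60] (left, top, right, bottom).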

label = [c_id] + bbox
img.labels_extra_info['iscrowd'] = img.labels_extra_info.get('iscrowd', [])
img.labels_extra_info['iscrowd'].append(annotation.get('iscrowd', 0))
