Clever labeling

Clever labeling (CL) helps you create labels for object detection tasks.

CL was tested together with Yolo_mark (https://github.com/AlexeyAB/Yolo_mark) as the labeling tool.

How it works

You start labeling your data with Yolo_mark.
CL trains YOLOv5 on the data you have labeled so far.
If mAP@.5:.95 is more than 0.8 after training (you can set your own value), CL creates bboxes for the images that you have not labeled yet.
If mAP@.5:.95 is less than 0.8, CL waits for 20 minutes (you can set your own value) and tries again with the data that you have labeled by then.
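
The cycle described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the code from this repository: labeling_loop and its train, pseudo_label, max_attempts and sleep arguments are hypothetical names, and the defaults (0.8 and 20 minutes) are the values mentioned above.

```python
import time
from typing import Callable, Optional


def labeling_loop(
    train: Callable[[], float],
    pseudo_label: Callable[[], None],
    min_map: float = 0.8,
    wait_seconds: float = 20 * 60,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retrain until mAP@.5:.95 exceeds min_map, then pseudo-label once.

    train and pseudo_label are hypothetical callbacks standing in for the
    real training and labeling steps. Returns True if pseudo labeling ran,
    False if max_attempts was exhausted first.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        map_value = train()  # train on everything labeled so far
        if map_value > min_map:
            pseudo_label()  # create bboxes for images not labeled yet
            return True
        sleep(wait_seconds)  # wait while more images are labeled by hand
    return False
```

Passing sleep as an argument keeps the sketch easy to try out without actually waiting 20 minutes between attempts.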

Install

git clone https://github.com/denred0/clever_labeling.git
cd clever_labeling
pip install -r requirements.txt

Data Preparation

Create a folder "data". Inside the "data" folder, create a folder named after your project, for example "animals_detection".
Create a file "classes.txt" with the list of classes inside your project folder.
Example of the structure of the "classes.txt" file:
dog
cat
pig
You can also see an example of "classes.txt" in data/sample_project.
Create a folder "dataset" inside your project folder and copy the images for labeling into the "dataset" folder.
As a result, your directory structure should look like this:
clever_labeling
├── data
│ ├── animals_detection
│ │ ├── classes.txt
│ │ ├── dataset
│ │ │ ├── image1.jpg
│ │ │ ├── image2.jpg
│ │ │ ├── image3.jpg
Copy "labeling_config.yaml" from data/sample_project to your project folder. You can configure training and pseudo labeling for your project in this "labeling_config.yaml".
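
If you prefer to script these preparation steps, a minimal sketch could look like the following. The create_project helper is hypothetical and not part of this repository; it only reproduces the folder layout described above and copies the sample config if it is present.

```python
import shutil
from pathlib import Path
from typing import Iterable


def create_project(
    root: Path,
    project: str,
    classes: Iterable[str],
    images: Iterable[Path] = (),
) -> Path:
    """Create data/<project> with classes.txt and a dataset folder.

    Hypothetical helper: root is the clever_labeling checkout.
    """
    project_dir = root / "data" / project
    dataset_dir = project_dir / "dataset"
    dataset_dir.mkdir(parents=True, exist_ok=True)

    # One class name per line, in the order that defines the class indices.
    (project_dir / "classes.txt").write_text(
        "\n".join(classes) + "\n", encoding="utf-8"
    )

    # Copy the images that should be labeled into the dataset folder.
    for image in images:
        shutil.copy(image, dataset_dir / Path(image).name)

    # Reuse the sample config as a starting point when it is available.
    sample_config = root / "data" / "sample_project" / "labeling_config.yaml"
    if sample_config.exists():
        shutil.copy(sample_config, project_dir / sample_config.name)
    return project_dir
```

For example, create_project(Path("."), "animals_detection", ["dog", "cat", "pig"]) run from the repository root would produce the layout shown above, without any images.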

Prepare dataset

Run script prepare_dataset.py

python src/prepare_dataset.py %project_folder_name% 
python src/prepare_dataset.py animals_detection

It creates a folder "labeling" with a subfolder for every class.
You label every class separately. I found that this is more precise and convenient. The subfolder of each class contains only that one class for labeling, with index 0. Once you have labeled all classes, you can merge all txt files together, and every class gets its own index according to "classes.txt". The merging process is described in the Merging results part of this tutorial.
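
To illustrate what the index change means, here is a small sketch. It assumes the usual YOLO txt format produced by Yolo_mark (one line per box: class index, x_center, y_center, width, height). The merge_image_labels function is a hypothetical illustration and not the implementation of merge_labels.py.

```python
from typing import Dict, List


def merge_image_labels(
    per_class_lines: Dict[str, List[str]],
    classes: List[str],
) -> List[str]:
    """Merge per-class label lines of one image into a single label list.

    per_class_lines maps a class name to the lines of its txt file, where
    every box has index 0. The result uses the index from classes.txt.
    """
    merged = []
    for class_index, class_name in enumerate(classes):
        for line in per_class_lines.get(class_name, []):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            # Replace the per-class index 0 with the global class index.
            merged.append(" ".join([str(class_index)] + parts[1:]))
    return merged
```

With classes dog, cat, pig, a box labeled "0 0.5 0.5 0.2 0.3" in the cat subfolder becomes "1 0.5 0.5 0.2 0.3" after merging.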

prepare_dataset.py has an additional parameter --upd_txts. It creates txt files for every class and fills them with values from data/animals_detection/dataset. Be careful: it overwrites the txt files of the classes if they already exist.

Training

To start training run:

python src/train.py %project_folder_name% %class_name%
python src/train.py animals_detection dog

This script creates the folder "animals_detection/labeling/dog/training" and saves all training results there.

Additional parameters:
--resume_weights - path to weights to resume training

You can change params in labeling_config.yaml between training attempts.

Pseudo labeling

To start pseudo labeling, run:

python src/labeling.py %project_folder_name% %class_name%
python src/labeling.py animals_detection dog

Additional parameters:
--count_of_images_to_markup - number of images to label if mAP is greater than the minimum mAP.
--th - minimum threshold for keeping a predicted label.
--nms - non-maximum suppression (NMS) setting for detection.
--exp - use the weights of a specific experiment.

It labels count_of_images_to_markup images that you have not labeled so far.

If you are not satisfied with the labeling results, you can run src/train.py again. In that case src/train.py uses all the data that you have labeled so far.

Merging results

Once you have labeled all classes separately, you can merge the results.
Run the script merge_labels.py:

python src/merge_labels.py %project_folder_name% 
python src/merge_labels.py animals_detection

This script creates a folder animals_detection/merge with resulting markup.

Additionally, you can check your dataset for collisions. See sample_project/merge_config.yaml to configure the merging process:
classes_to_merge - classes that you want to merge. You can set [], which means that all classes are merged.
obligatory_classes - checks that every image has labels for the specified classes.
iou - checks that bboxes do not overlap by more than this value.
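
As a rough illustration of these checks, the sketch below computes the IoU of two boxes in YOLO format (class index, x_center, y_center, width, height) and reports which problems one image has. The names iou, find_collisions and the returned strings are hypothetical. How merge_labels.py actually performs the checks, for example whether boxes of different classes are compared with each other, is not shown here and may differ.

```python
from itertools import combinations
from typing import Iterable, List, Sequence, Tuple

# (class index, x_center, y_center, width, height), YOLO format
Box = Tuple[int, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes given in YOLO format."""
    ax1, ay1 = a[1] - a[3] / 2, a[2] - a[4] / 2
    ax2, ay2 = a[1] + a[3] / 2, a[2] + a[4] / 2
    bx1, by1 = b[1] - b[3] / 2, b[2] - b[4] / 2
    bx2, by2 = b[1] + b[3] / 2, b[2] + b[4] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[3] * a[4] + b[3] * b[4] - inter
    return inter / union if union > 0 else 0.0


def find_collisions(
    boxes: Sequence[Box],
    obligatory_classes: Iterable[int],
    max_iou: float,
) -> List[str]:
    """Return the list of problems found in the labels of one image."""
    if not boxes:
        return ["empty"]  # image without any bbox
    problems = []
    # Assumption: every pair of boxes is compared, regardless of class.
    if any(iou(a, b) > max_iou for a, b in combinations(boxes, 2)):
        problems.append("high_iou")
    present = {box[0] for box in boxes}
    if not set(obligatory_classes) <= present:
        problems.append("missing_obligatory")
    return problems
```

An image whose labels produce an empty list passes all three checks.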

src/merge_labels.py creates a folder for every collision case: 1_high_iou, 2_without_obligatory_classes, 3_empty_images (images without any bbox).
You can fix the labels in each of these folders one after another and then run the script src/update_and_merge.py %project_name% %upd%
The upd parameter is the type of update. Possible values: iou, obl (for obligatory_classes), emp (for empty images).

For example, if you have fixed the labels inside the folder 1_high_iou, update the labels and the resulting dataset by running the script:

python src/update_and_merge.py animals_detection iou

Now you can fix the labels in the 2_without_obligatory_classes folder, and so on.
