TABLE OF CONTENTS
- Create Dataset
  - 1.1 Collect Images
  - 1.2 Create Labels
  - 1.3 Create Config File - Select Model
- Train
- Inference
First, clone this repository and install the dependencies from requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7:

```bash
git clone https://github.com/duongthao1218/yolov5.git
cd yolov5
pip3 install -r requirements.txt  # install
```
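Optionally, you can sanity-check the install (this check is not part of the repo, just a quick way to confirm the PyTorch version and whether a GPU is visible):

```bash
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```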
Capture frames from video:
- `--path_vid`: path to the input video
- `--path_imgs`: path to the output image folder

```bash
python3 capture_vid.py --path_vid ./videos/*.mp4 --path_imgs ./dataset/cars
```
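For reference, frame extraction can be done with OpenCV. The sketch below shows the general idea only; the repo's actual capture_vid.py may differ, and the save interval and paths here are assumptions:

```python
# Illustrative sketch of frame capture; not the repo's capture_vid.py.
import cv2
import os

def capture_frames(path_vid, path_imgs, every_n=10):
    """Save every n-th frame of a video as a JPEG in path_imgs."""
    os.makedirs(path_imgs, exist_ok=True)
    cap = cv2.VideoCapture(path_vid)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break  # end of video
        if idx % every_n == 0:
            cv2.imwrite(os.path.join(path_imgs, f"{saved}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()

capture_frames("./videos/cars.mp4", "./dataset/cars/images")  # assumed paths
```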
Once you have collected images, you will need to annotate the objects of interest to create a ground truth for your model to learn from.
Use the labelImg tool.
- Install from PyPI (or build from source: link):

```bash
pip3 install labelImg
```

- Open the labelImg tool:

```bash
labelImg
```
- Step 1: Open the image folder (example: `datasets/images/`).
- Step 2: Choose the annotation folder (example: `datasets/labels/`).

Note: images and labels should be organized in folders like below:

```
├── datasets
│   └── cars
│       ├── images
│       │   └── *.jpg
│       └── labels
│           └── *.txt
```

- Step 3: Choose YOLO as the label file format.
- Step 4: Draw a bounding box and annotate its label.
- Step 5: Save the annotation.
- Step 6: Continue with the next image.
Note: some shortcut keys:

| Shortcut | Description |
|---|---|
| Ctrl + S | Save annotation |
| Ctrl + D | Copy the current label and rect box |
| Ctrl + Shift + D | Delete the current image |
| W | Create a rect box |
| D | Next image |
| A | Previous image |
| Delete | Delete the selected rect box |
| Ctrl + + | Zoom in |
| Ctrl + - | Zoom out |
Annotation files (`*.txt`) and the label file (`classes.txt`) are saved in the same folder as the images.

The `*.txt` file specifications are:
- One row per object.
- Each row is in `class x_center y_center width height` format.
- Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide x_center and width by the image width, and y_center and height by the image height (a conversion sketch follows this list).
- Class numbers are zero-indexed (start from 0).
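For instance, a minimal conversion helper might look like this (a hypothetical helper, not part of the repo):

```python
def to_yolo(x_center, y_center, w, h, img_w, img_h):
    """Convert a pixel-space box (center x/y, width, height) to normalized YOLO xywh."""
    return x_center / img_w, y_center / img_h, w / img_w, h / img_h

# A 100x50 px box centered at (320, 240) in a 640x480 image:
print(to_yolo(320, 240, 100, 50, 640, 480))  # (0.5, 0.5, 0.15625, 0.1041...)
```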
Example: a label file for an image containing 3 cars (class 0):
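The three rows below are an illustrative sketch (made-up coordinates), one row per car:

```
0 0.481 0.634 0.390 0.238
0 0.741 0.524 0.214 0.158
0 0.254 0.402 0.144 0.173
```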
Set up the files and directory structure: to train a YOLOv5 model, we need to add a `*.yaml` file that describes the parameters of our dataset.

- Step 1: Split your dataset (for example, keep 80% of the data in the training set and 20% in the validation set; a sketch of the splitting logic is shown after the directory tree below):

```bash
python3 split_dataset.py --path ./datasets/cars
```

- `--path`: path to the folder containing the images

The directory and file structure will look like below:

```
├── datasets
│   ├── cars
│   │   ├── images
│   │   │   └── *.jpg
│   │   ├── labels
│   │   │   └── *.txt
│   │   ├── train.txt
│   │   ├── val.txt
│   │   └── test.txt
│   └── other datasets
```
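For reference, an 80/20 split can be implemented roughly as below. This is a sketch of the idea only; the repo's split_dataset.py may behave differently, and the ratio, seed, and path format written into the `.txt` files are assumptions:

```python
# Illustrative sketch of a train/val split; not the repo's split_dataset.py.
import glob
import os
import random

def split_dataset(path, train_ratio=0.8, seed=0):
    """Write train.txt / val.txt listing the image paths of each split."""
    images = sorted(glob.glob(os.path.join(path, "images", "*.jpg")))
    random.Random(seed).shuffle(images)  # deterministic shuffle
    n_train = int(len(images) * train_ratio)
    for name, subset in (("train.txt", images[:n_train]),
                         ("val.txt", images[n_train:])):
        with open(os.path.join(path, name), "w") as f:
            f.write("\n".join(subset))

split_dataset("./datasets/cars")
```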
- Step 2: Set up the config file by editing `./data/custom.yaml`.

The `*.yaml` file fields are:
- `path` is the path to the training and validation data.
- `nc` is the number of classes.
- `names` is the list of class names.

Example `./data/custom.yaml`:
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ./datasets/cars  # dataset root dir
train: train.txt  # train images
val: val.txt  # val images
test: test.txt  # (optional)

nc: 1  # number of classes

# Classes
names:
  0: car
```
Select a pretrained model to start training from. See the Pretrained Checkpoints table below for a full comparison of all models.

Pretrained Checkpoints:

| Model | size (pixels) | mAP<sup>val</sup> 50-95 | mAP<sup>val</sup> 50 | Speed CPU b1 (ms) | Speed V100 b1 (ms) | Speed V100 b32 (ms) | params (M) | FLOPs @640 (B) |
|---|---|---|---|---|---|---|---|---|
| YOLOv5n | 640 | 28.0 | 45.7 | 45 | 6.3 | 0.6 | 1.9 | 4.5 |
| YOLOv5s | 640 | 37.4 | 56.8 | 98 | 6.4 | 0.9 | 7.2 | 16.5 |
| YOLOv5m | 640 | 45.4 | 64.1 | 224 | 8.2 | 1.7 | 21.2 | 49.0 |
| YOLOv5l | 640 | 49.0 | 67.3 | 430 | 10.1 | 2.7 | 46.5 | 109.1 |
| YOLOv5x | 640 | 50.7 | 68.9 | 766 | 12.1 | 4.8 | 86.7 | 205.7 |
| YOLOv5n6 | 1280 | 36.0 | 54.4 | 153 | 8.1 | 2.1 | 3.2 | 4.6 |
| YOLOv5s6 | 1280 | 44.8 | 63.7 | 385 | 8.2 | 3.6 | 12.6 | 16.8 |
| YOLOv5m6 | 1280 | 51.3 | 69.3 | 887 | 11.1 | 6.8 | 35.7 | 50.0 |
| YOLOv5l6 | 1280 | 53.7 | 71.3 | 1784 | 15.8 | 10.5 | 76.8 | 111.4 |
| YOLOv5x6 + [TTA][tta] | 1280<br>1536 | 55.0<br>55.8 | 72.7<br>72.7 | 3136<br>- | 26.2<br>- | 19.4<br>- | 140.7<br>- | 209.8<br>- |
- Step 1: Click the model name and download it.
- Step 2: Move the downloaded model to the `weights` directory.
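Alternatively, checkpoints can be fetched directly from the official Ultralytics release page, for example (the v7.0 release tag here is an assumption; adjust it to the release you need):

```bash
mkdir -p weights
wget -P weights https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt
```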
```bash
python3 train.py --img 640 --batch 4 --epochs 100 --data data/custom.yaml --weights weights/yolov5s.pt
```
Description of the arguments:
- `--img`: input image size
- `--batch`: batch size
- `--epochs`: number of training epochs
- `--data`: path to the dataset YAML file
- `--cfg`: model configuration
- `--weights`: path to the weights (pretrained model)
Note: train on a GPU by adding the argument `--device 0`.
All training results are saved to `runs/train/` with incrementing run directories, i.e. `runs/train/exp`, `runs/train/exp2`, etc.
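YOLOv5's train.py also writes TensorBoard event files into the run directory, so training progress can be monitored with:

```bash
tensorboard --logdir runs/train
```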
```bash
python3 detect.py --source datasets/cars/images/0.jpg --weights runs/train/exp/weights/best.pt --data data/custom.yaml --conf-thres 0.45
```
Description of the arguments:
- `--source`: input source (see the options below)
- `--weights`: path to the trained model
- `--data`: path to the dataset YAML file
- `--conf-thres`: confidence threshold
`--source` accepts a variety of inputs:

```bash
--source 0                               # webcam
         img.jpg                         # image
         vid.mp4                         # video
         screen                          # screenshot
         path/                           # directory
         list.txt                        # list of images
         list.streams                    # list of streams
         'path/*.jpg'                    # glob
         'https://youtu.be/Zgi9g1ksQHc'  # YouTube
         'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
```
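Besides detect.py, a trained checkpoint can also be loaded in Python via torch.hub for programmatic inference. A minimal sketch using the upstream ultralytics/yolov5 hub entry point (the image path below is a placeholder):

```python
import torch

# Load the custom-trained checkpoint through the YOLOv5 hub entry point.
model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="runs/train/exp/weights/best.pt")
model.conf = 0.45  # confidence threshold, matching --conf-thres above

results = model("datasets/cars/images/0.jpg")  # run inference on one image
results.print()                   # summary of detections
print(results.pandas().xyxy[0])   # boxes as a pandas DataFrame
```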
All inference results are saved to `runs/detect`.