Forked from here.
- libtorch - 1.5.0 GPU enabled version.
- OpenCV - 3.4.7
- wxWidgets - 3.0.0. Simply install using apt.
- CMake
- CUDA 10.1 with cuDNN 7.6.5.32 compiles successfully. My driver was not compatible with the latest CUDA, so I upgraded it to 435.21, after which the build ran.
mkdir build
cd build
CMAKE_PREFIX_PATH=/home/$USER/cpplibs/libtorch-1.5.0-gpu cmake ../
make -j8
- After building, there will be a `build/bin/GUI` and a `build/bin/processing` file. The GUI is used to visualize the results and processing is used to run a video.
- A sample test video can be downloaded from TownCentreXVID.avi.
- In the root directory of the project, create a `weights` folder and put `yolov3.weights` in it. The weights can be downloaded from Darknet. You should download YOLOv3-416, as this is what this repository uses.
- Run `./build/bin/processing /path/to/TownCentreXVID.avi scale_factor`. A scale factor of 1 means original sized images are used; 4 means the image is scaled down by 4. I need to check how the scaling is done in the code.
- The results will be written to a `results` folder.
- Run `./build/bin/GUI` and simply select the `results` folder to visualize the results. It's a pretty neat tool.
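
As a rough illustration of what the scale factor might mean in practice, here is a minimal sketch that divides both frame dimensions by the factor. This is an assumption, not the project's confirmed behavior (the scaling code has not been checked), and `FrameSize` and `scaled_size` are hypothetical names introduced only for this example.

```cpp
#include <cassert>

// Frame dimensions in pixels.
struct FrameSize {
    int width;
    int height;
};

// Hypothetical helper: divide both dimensions by the scale factor.
// Assumption: plain integer division; the project's real scaling
// code may round or resize differently.
inline FrameSize scaled_size(FrameSize original, int scale_factor) {
    assert(scale_factor >= 1);  // 1 keeps the original size
    return FrameSize{original.width / scale_factor,
                     original.height / scale_factor};
}
```

Under this assumption, a 1920x1080 frame (a size chosen here for illustration) processed with a scale factor of 4 would become 480x270, so each frame has 16 times fewer pixels to run detection on.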
Builds were run on a GTX 1060 with 6GB of RAM.
- Scale: 1. FPS: 2.6.
- Scale: 4. FPS: 17.0.
- A build using CUDA 8.0 and CUDNN 5.0 probably will not work against PyTorch 1.5.0. The error message said that you needed at least CUDA 9.0.
There are four modules in the project:
- Detection: YOLOv3
- Tracking: SORT and DeepSORT
- Processing: Run detection and tracking, then display and save the results (a compressed video, a few snapshots for each target)
- GUI: Display the results
A Libtorch implementation of the YOLO v3 object detection algorithm, written with modern C++.
The code is based on the walktree.
The config file in .\models can be found at Darknet.
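
As background on why the input size matters, YOLOv3 predicts boxes on three grids (strides 32, 16 and 8) with three anchors per grid cell. The sketch below counts the candidate boxes produced for a square input before confidence filtering and non-maximum suppression; `candidate_boxes` is a hypothetical helper written for this illustration, not a function from this repository.

```cpp
#include <cassert>

// YOLOv3 predicts at three scales with these downsampling strides.
const int kStrides[] = {32, 16, 8};
// Three anchor boxes are predicted per grid cell at each scale.
const int kAnchorsPerCell = 3;

// Number of candidate boxes for a square input of the given side length.
inline int candidate_boxes(int input_size) {
    assert(input_size > 0 && input_size % 32 == 0);  // must divide by the largest stride
    int total = 0;
    for (int stride : kStrides) {
        const int grid = input_size / stride;  // e.g. 13, 26, 52 for a 416 input
        total += grid * grid * kAnchorsPerCell;
    }
    return total;
}
```

For the YOLOv3-416 configuration this gives 10647 candidate boxes per frame, which is why the filtering steps after the network matter for speed.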
I also merged SORT to do tracking.
A similar project in Python is here; it is also rewritten from the most starred version and SORT.
Recently I reimplemented DeepSORT, which employs another CNN for re-identification. It seems to give better results but also slows the program a bit. A PyTorch version is also available at ZQPei, thanks!
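
To illustrate what the re-identification CNN adds, DeepSORT compares the appearance feature of a detection against the features stored for each track using cosine distance, keeping the smallest distance per track. The following is a minimal standalone sketch of that comparison; the function names and the use of `std::vector<float>` for features are assumptions, not this project's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine distance 1 - cos(a, b): 0 for the same direction, 2 for opposite.
inline float cosine_distance(const std::vector<float>& a,
                             const std::vector<float>& b) {
    assert(a.size() == b.size() && !a.empty());
    float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    const float denom = std::sqrt(norm_a) * std::sqrt(norm_b);
    // A zero vector carries no appearance information; treat it as unrelated.
    return denom > 0.0f ? 1.0f - dot / denom : 1.0f;
}

// Smallest distance between a detection feature and a track's stored features.
inline float min_cosine_distance(const std::vector<std::vector<float>>& gallery,
                                 const std::vector<float>& detection) {
    assert(!gallery.empty());
    float best = 2.0f;  // upper bound of cosine distance
    for (const auto& feature : gallery) {
        best = std::min(best, cosine_distance(feature, detection));
    }
    return best;
}
```

Running the extra network on every detection to obtain these features is the main reason DeepSORT is slower than plain SORT.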
Currently, on a GTX 1060 6G, it consumes about 1 GB of RAM and runs at 37 FPS.
The video I test with is TownCentreXVID.avi.
With wxWidgets, I developed the GUI module for visualization of results.
Previously I used Dear ImGui. However, I do not think it suits my purpose.
This project uses pre-trained network weights from others
This project requires LibTorch, OpenCV, wxWidgets and CMake to build.
LibTorch can be easily integrated with CMake, but there are a lot of strange things...
On Ubuntu 16.04, I use `apt install` to install the others. Everything is fine.
On Windows 10 + Visual Studio 2017, I use the latest stable version of the others from their official websites.
Here is some intermediate output from the detection and tracking modules: