This project trains the DeepLab V3 model on the Baidu Apollo road-marking dataset. It addresses the class imbalance in the training data by setting per-class loss weights, and uses NumPy and OpenCV to convert the Apollo annotations, which cannot be used for training directly, into usable label images. The project also provides a script that merges the visualization results into a video file, the already packaged TFRecord data, and the loss weights used to counter the class imbalance.
The following GIF shows the network's prediction results.
- tensorflow >= 1.6
pip install tensorflow-gpu
- pillow, numpy, jupyter, matplotlib
pip install pillow numpy jupyter matplotlib
- OpenCV 3 for Python 2.7
pip install opencv-python
Run the following command from the research folder (it needs to be run again in every new terminal):
# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
Then run model_test.py to verify that the setup works:
python deeplab/model_test.py
The training data (about 150 GB) is hosted separately; you can obtain it by following the readme in the '/datasets/apollo' folder.
The label images downloaded directly from Apollo cannot be used as training data, even though Baidu states that they were created following the Cityscapes format. The label images use three RGB channels to distinguish categories instead of grayscale values, so I provide the script 'color2TrainIdLabelImgs.py' to convert the RGB images into grayscale images that use the train IDs defined in 'laneMarkDetection.py'.
# from /datasets/apollo/lane_segmentation/
python color2TrainIdLabelImgs.py
The default number of threads in 'color2TrainIdLabelImgs.py' is 10; you can adjust it in the script according to your CPU. After the script finishes, the annotation images used for training are generated under the 'labelimg' folder.
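For reference, the conversion logic is roughly the following. This is a minimal single-image sketch, not the full multithreaded script; the color-to-trainId entries shown here are placeholders, and the real mapping lives in 'laneMarkDetection.py'.

```python
# Minimal sketch of the RGB-label -> trainId conversion.
# The two color entries below are placeholders, not the actual Apollo table.
import numpy as np
import cv2

COLOR_TO_TRAIN_ID = {
    (0, 0, 0): 255,      # placeholder: void / ignore
    (70, 130, 180): 1,   # placeholder: one lane-marking class
}

def color_to_train_id(label_path, out_path):
    bgr = cv2.imread(label_path, cv2.IMREAD_COLOR)
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
    train_id = np.full(rgb.shape[:2], 255, dtype=np.uint8)  # default: ignore
    for color, tid in COLOR_TO_TRAIN_ID.items():
        mask = np.all(rgb == np.array(color, dtype=np.uint8), axis=-1)
        train_id[mask] = tid
    cv2.imwrite(out_path, train_id)  # single-channel grayscale label
```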
The script 'build_apollo_data.py' is adapted from 'build_cityscapes.py'. It reads the images from the '/colorimg' and '/labelimg' folders and packages them into TFRecord files. Note that only label images that have already been converted by 'color2TrainIdLabelImgs.py' can be packaged.
# from /datasets/apollo/
python build_apollo_data.py
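The packing itself follows the usual DeepLab pattern of writing each image/label pair as a tf.train.Example. The sketch below is only illustrative of what 'build_apollo_data.py' does; the file paths are hypothetical and the feature keys follow the DeepLab convention. The height and width match the Apollo image size (2710 x 3384).

```python
# Simplified sketch: pack one image/label pair into a TFRecord file.
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

with tf.python_io.TFRecordWriter('train-00000-of-00001.tfrecord') as writer:
    image_data = tf.gfile.GFile('colorimg/example.jpg', 'rb').read()       # hypothetical path
    label_data = tf.gfile.GFile('labelimg/example_bin.png', 'rb').read()   # hypothetical path
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_data),
        'image/format': _bytes_feature(b'jpeg'),
        'image/height': _int64_feature(2710),
        'image/width': _int64_feature(3384),
        'image/segmentation/class/encoded': _bytes_feature(label_data),
        'image/segmentation/class/format': _bytes_feature(b'png'),
    }))
    writer.write(example.SerializeToString())
```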
The Cityscapes dataset does not contain road markings, but it was collected in urban scenes similar to this project's, so I used the pre-trained model provided for it to get better results.
Download link: download.tensorflow.org/models/deeplabv3_cityscapes_train_2018_02_06.tar.gz
If you want to see results quickly, you can use the pre-trained model I provide to speed up convergence.
Please download it by following the instructions in the readme in the '/datasets/apollo/exp' folder.
The loss calculation is defined in the '/utils/train_utils.py' script. You can edit the weights according to your needs and assign higher weights to more important classes.
If you want to derive exact numbers, you can use the weighting scheme from ENet, although that scheme only considers the distribution of the data and not how difficult each class is to learn.
# In /utils/train_utils.py: per-class loss weights used to counter the class
# imbalance; the ignore label gets weight 0 so it does not contribute to the loss.
scaled_labels = tf.reshape(scaled_labels, shape=[-1])
loss_weight0 = 1.5
loss_weight1 = 2.3
loss_weight2 = 2.5
....
loss_weight32 = 4
loss_weight33 = 12
loss_weight34 = 4
loss_weight35 = 4
loss_weight_ignore = 0
# Build a per-pixel weight mask by matching each label to its class weight.
not_ignore_mask = tf.to_float(tf.equal(scaled_labels, 0)) * loss_weight0 + \
                  tf.to_float(tf.equal(scaled_labels, 1)) * loss_weight1 + \
                  ....
                  tf.to_float(tf.equal(scaled_labels, 35)) * loss_weight35 + \
                  tf.to_float(tf.equal(scaled_labels, ignore_label)) * loss_weight_ignore
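For the ENet-style alternative mentioned above, a minimal sketch that derives the weights from the class pixel distribution (w_c = 1 / ln(c + p_c), with c typically 1.02) could look like this. It counts pixels over the converted grayscale labels; the class count and folder path are assumptions.

```python
# Sketch of ENet-style class weighting: w_c = 1 / ln(c + p_c),
# where p_c is the fraction of pixels belonging to class c.
import glob
import numpy as np
import cv2

NUM_CLASSES = 36      # assumed: trainIds 0..35 as in the weights above
C = 1.02              # ENet's propensity constant

counts = np.zeros(NUM_CLASSES, dtype=np.int64)
for path in glob.glob('labelimg/*.png'):
    label = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    ids, freq = np.unique(label, return_counts=True)
    for i, f in zip(ids, freq):
        if i < NUM_CLASSES:          # skip the ignore label (255)
            counts[i] += f

p = counts / counts.sum()
weights = 1.0 / np.log(C + p)
print(weights)
```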
- If you want to train from scratch, the learning rate needs to be set higher so that the network parameters can adjust quickly in the early stages of training. Since the pre-trained model we use comes from a scenario similar to this project's, a base learning rate of '.005' is recommended.
- The 'model_variant' parameter can also be set to many other backbones.
- The 'atrous_rates' parameter sets the atrous convolution rates. If you have more GPU memory, you can set them to '8/16/32' with 'output_stride=8', which gives the network a larger receptive field.
- The 'train_crop_size' parameter needs to be a multiple of 4 plus one (see the quick check after this list). For better results it should be above 325.
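A quick check of the crop-size rule (my own helper, not part of the repo):

```python
# train_crop_size should be a multiple of 4 plus one, e.g. 341 = 4 * 85 + 1.
def valid_crop_size(size, stride=4):
    return (size - 1) % stride == 0

print(valid_crop_size(341))  # True
print(valid_crop_size(340))  # False
```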
CUDA_VISIBLE_DEVICES=0 \
python deeplab/train.py \
--logtostderr \
--num_clones=1 \
--task=0 \
--learning_policy=poly \
--base_learning_rate=.005 \
--learning_rate_decay_factor=0.1 \
--learning_rate_decay_step=2000 \
--training_number_of_steps=200000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=341 \
--train_crop_size=341 \
--train_batch_size=2 \
--dataset="apollo" \
--tf_initial_checkpoint='/home/zgx010/TensorflowModels/models/research/deeplab/backbone/deeplabv3_cityscapes_train/model.ckpt' \
--train_logdir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/train_200000' \
--dataset_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/tfrecord'
You can use TensorBoard to monitor the training and compare runs.
# from /datasets/apollo/exp/train_on_train_set/
tensorboard --logdir=./datasets/apollo/exp/train_on_train_set
- If you downloaded my pre-trained model, you need to change the 'tf_initial_checkpoint' parameter to the path of the downloaded model.
- Set 'base_learning_rate' to '.001' and 'training_number_of_steps' to '10000'.
CUDA_VISIBLE_DEVICES=0 \
python deeplab/train.py \
--logtostderr \
--num_clones=1 \
--task=0 \
--learning_policy=poly \
--base_learning_rate=.001 \
--learning_rate_decay_factor=0.1 \
--learning_rate_decay_step=2000 \
--training_number_of_steps=10000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=341 \
--train_crop_size=341 \
--train_batch_size=2 \
--dataset="apollo" \
--tf_initial_checkpoint='/home/zgx010/TensorflowModels/models/research/deeplab/backbone/deeplabv3_cityscapes_train/model.ckpt' \
--train_logdir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/train_200000' \
--dataset_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/tfrecord'
CUDA_VISIBLE_DEVICES=1 \
python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=2710 \
--vis_crop_size=3384 \
--dataset="apollo" \
--colormap_type="apollo" \
--checkpoint_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/train_200000' \
--vis_logdir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/vis_train_200000' \
--dataset_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/tfrecord'
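As mentioned in the introduction, the visualization results can be merged into a video. The core of that idea is simply feeding the prediction images written by vis.py into an OpenCV VideoWriter, roughly as below. This is a sketch, not the exact script in the repo; the 'segmentation_results' folder name follows DeepLab's vis output layout, and the paths and frame rate are assumptions you should adjust to your vis_logdir.

```python
# Sketch: merge the prediction images produced by vis.py into a video file.
import glob
import cv2

frames = sorted(glob.glob('vis_train_200000/segmentation_results/*_prediction.png'))
first = cv2.imread(frames[0])
height, width = first.shape[:2]

fourcc = cv2.VideoWriter_fourcc(*'mp4v')
writer = cv2.VideoWriter('prediction.mp4', fourcc, 10.0, (width, height))
for path in frames:
    writer.write(cv2.imread(path))
writer.release()
```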
CUDA_VISIBLE_DEVICES=0 \
python deeplab/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=2710 \
--eval_crop_size=3384 \
--dataset="apollo" \
--checkpoint_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/train_110000' \
--eval_logdir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/eval_110000' \
--dataset_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/tfrecord'
python deeplab/export_model.py \
--logtostderr \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--crop_size= \
--crop_size= \
--checkpoint_path='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/train_110000' \
--export_dir='/home/zgx010/TensorflowModels/models/research/deeplab/datasets/apollo/exp/train_on_train_set/export_train_110000'
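Once exported, the frozen graph can be loaded for inference roughly like this. This is a minimal sketch: the tensor names follow DeepLab's export_model.py convention, and the frozen-graph path and test image are assumptions.

```python
# Sketch: run the exported frozen graph on a single image.
# 'ImageTensor:0' / 'SemanticPredictions:0' are DeepLab's standard export tensors.
import numpy as np
import tensorflow as tf
from PIL import Image

graph_def = tf.GraphDef()
with tf.gfile.GFile('export_train_110000/frozen_inference_graph.pb', 'rb') as f:  # assumed path
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    image = np.asarray(Image.open('test.jpg'))            # example input image
    seg_map = sess.run('SemanticPredictions:0',
                       feed_dict={'ImageTensor:0': image[np.newaxis, ...]})
    print(seg_map.shape)  # (1, height, width) array of trainIds
```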