Query regarding Performance of ssd_mobilenet in Xavier #73

Niran89 · 2019-07-24T10:28:09Z

Dear naisy,

Thank you for your work. I am currently working on porting the ssd mobilenet object detection algorithm to the Xavier platform.

The current configuration used in Xavier,

Jetpack - 4.1
Tensorflow - 1.12.0 with gpu support
TensorRT - 5.0.3 with cuda 10

I have set up your project on Xavier and can run it successfully. I made the observations below on performance, using a 1280x720 video stream as input. Please find them attached.

Note : During all these experiments, the Xavier is set to Max-N mode.

Can you help me get clarity on the following queries?

Is my observation on the performance correct? Is this the expected performance on Xavier? I have referred to the "Current Max Performance of ssd_mobilenet_v1_coco_2018_01_28" table given in the README.md.
From the table I understand that on Xavier in Max-N mode with visualization, for a 1280x720 input, the FPS should be 48. But I got only 35 fps (nms_v2, ssd_mobilenet_v1_coco_2018_01_28).
I am not sure why there is a drop of more than 10 fps compared to the expected value. I haven't made any changes to the code.
As per my understanding, a TRT model is supposed to perform better than the normal TF model. But from my observations above, I see a drop in fps when using TRT (comparison: trt_v1 vs nms_v2).
Overall, nms_v2 performs better than nms_v1 and trt_v1. What is the major difference between nms_v1, nms_v2 and trt_v1?

Appreciate your help.

Regards,
Niran

naisy · 2019-07-25T03:25:22Z

Hi @Niran89,

About FPS
Please try jetson_clocks.sh.
This is probably the reason for the roughly 10 FPS shortfall.

MAX-N mode configures the maximum number of cores and the maximum clock (Hz) of the CPU and GPU.
This setting persists across reboots.

jetson_clocks.sh sets the CPU and GPU to the maximum clock.
This setting does not persist across reboots.
The default kernel boots in low clock mode each time.
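
As a minimal sketch of the two steps above, the following Python helper only assembles the commands and prints them unless `execute=True` is passed. `nvpmodel -m 0` selects Max-N on Xavier; the default path to `jetson_clocks.sh` is an assumption (adjust it for your JetPack release), and the names `max_performance_commands` and `run_commands` are hypothetical.

```python
import os
import subprocess


def max_performance_commands(jetson_clocks_path="~/jetson_clocks.sh"):
    """Return the commands that select Max-N and then pin the clocks.

    jetson_clocks_path is an assumption; point it at the script's
    location on your JetPack release.
    """
    script = os.path.expanduser(jetson_clocks_path)
    return [
        ["sudo", "nvpmodel", "-m", "0"],  # Max-N power mode (persists across reboots)
        ["sudo", script],                 # maximum clocks (re-run after every boot)
    ]


def run_commands(commands, execute=False):
    """Print each command; run it only when execute=True."""
    for cmd in commands:
        print(" ".join(cmd))
        if execute:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    run_commands(max_performance_commands())  # dry run: prints the commands only
```

Because the clock setting is lost on reboot, the second command has to be repeated after every boot before measuring FPS.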

About TF-TRT model
Probably because my code is old.
The old TF-TRT was slow due to overhead. Recent TF-TRT versions seem to be fast.
I would like to look into it when I have time.

About nms_v1, nms_v2, trt_v1
ssd_mobilenet_v1_coco_2017_11_17 (nms_v1) and ssd_mobilenet_v1_coco_2018_01_28 (nms_v2) differ in the non-maximum suppression part. This is the part targeted by the split model.
Similarly, trt_v1 targets a different set of nodes.

The split model for nms_v1 targets Postprocessor/convert_scores and Postprocessor/ExpandDims_1.
For nms_v2 it targets Postprocessor/Slice, Postprocessor/ExpandDims_1 and Postprocessor/stack_1.
For trt_v1 it targets Postprocessor/Slice and Postprocessor/ExpandDims_1.

These models share the same ssd mobilenet v1 part, but the non-maximum suppression part differs.
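
The split points listed above can be written as a small lookup table. This is only an illustration of the mapping; `SPLIT_NODES` and `split_nodes_for` are hypothetical names, not identifiers from the project.

```python
# Split points per model_type, copied from the list above.
# SPLIT_NODES and split_nodes_for are hypothetical names for illustration.
SPLIT_NODES = {
    "nms_v1": ["Postprocessor/convert_scores", "Postprocessor/ExpandDims_1"],
    "nms_v2": ["Postprocessor/Slice", "Postprocessor/ExpandDims_1", "Postprocessor/stack_1"],
    "trt_v1": ["Postprocessor/Slice", "Postprocessor/ExpandDims_1"],
}


def split_nodes_for(model_type):
    """Return the graph nodes at which the model is split for this model_type."""
    try:
        return SPLIT_NODES[model_type]
    except KeyError:
        raise ValueError("unknown model_type: %r" % model_type)


print(split_nodes_for("nms_v2"))
```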

About split model:
https://github.com/naisy/realtime_object_detection/blob/master/About_Split-Model.md

Niran89 · 2019-09-03T10:14:23Z

Hi Naisy,

Thank you for the reply and sorry for the delayed response from my side.

FYI, I have already set the clocks on Xavier using "sudo ./jetson_clocks.sh". But even after setting the clocks and running Xavier in Max-N mode, I still see a 10 fps drop when running the "ssd_mobilenet_v1_coco_2018_01_28" model with the car image (544x288 resolution) you used.

From your GitHub page, I understand that you achieved almost 52 fps on Xavier with the car image, but I could attain only 40 fps. Can you let me know what else I am missing? Below is the configuration I used before running the model:

In config.yml:
model_type: 'nms_v2'
model_path: 'models/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb'
force_gpu_compatible: False/True (Tried with both)
visualize: True
width: 544
height: 288
split_model: True
split_shape: 1917

Other configurations are still the same as yours. With these settings, I ran
python run_image.py, after which I observed 39 fps on Xavier.
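
For context on `split_shape: 1917` in the configuration above: assuming the usual 300x300 ssd_mobilenet_v1 anchor layout (six feature maps of size 19, 10, 5, 3, 2 and 1, with 3 anchors per cell on the first map and 6 on the others), 1917 is the total number of anchor boxes. That layout is an assumption based on the standard TensorFlow Object Detection API configuration for this model, not something stated in this thread.

```python
# Feature-map sizes and anchors per cell for the standard 300x300
# ssd_mobilenet_v1 layout (an assumption, see the note above).
FEATURE_MAP_SIZES = [19, 10, 5, 3, 2, 1]
ANCHORS_PER_CELL = [3, 6, 6, 6, 6, 6]


def total_anchors(sizes, anchors_per_cell):
    """Sum size * size * anchors over all feature maps."""
    return sum(s * s * a for s, a in zip(sizes, anchors_per_cell))


print(total_anchors(FEATURE_MAP_SIZES, ANCHORS_PER_CELL))  # 1917
```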

My current Setup in Xavier,

Jetpack - 4.1
Python - 2.7
Opencv - 3.4.2
Tensorflow - 1.12.0

Appreciate your help.

Thanks & Regards,
Niran

naisy · 2019-09-03T10:49:13Z

Hi @Niran89,

run_image.py and run_video.py are slower than run_stream.py.
The reason is that they read frames differently.
#52 (comment)

Can you try with usb webcam?
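
One common way a stream reader can outpace inline frame reading is to capture frames on a background thread, so the detection loop never waits on I/O. Whether run_stream.py does exactly this is an assumption; the linked comment has the project's own explanation. In the sketch below, `ThreadedReader` and `read_fn` are hypothetical names, and `read_fn` stands in for any blocking capture call.

```python
import threading
import time


class ThreadedReader:
    """Keep only the most recent frame from a blocking read function.

    read_fn is any zero-argument callable returning a frame (for a
    webcam it could wrap a capture call); it is a stand-in, not the
    project's own reader.
    """

    def __init__(self, read_fn):
        self._read_fn = read_fn
        self._lock = threading.Lock()
        self._frame = None
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._loop)
        self._thread.daemon = True
        self._thread.start()
        return self

    def _loop(self):
        while self._running:
            frame = self._read_fn()  # blocks here, off the main thread
            with self._lock:
                self._frame = frame

    def latest(self):
        """Return the newest frame, or None if nothing has been read yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join()


if __name__ == "__main__":
    counter = {"n": 0}

    def fake_read():
        time.sleep(0.01)  # pretend each capture takes 10 ms
        counter["n"] += 1
        return counter["n"]

    reader = ThreadedReader(fake_read).start()
    time.sleep(0.1)
    print("latest frame id: %s" % reader.latest())
    reader.stop()
```

With this pattern the main loop calls `latest()` and starts inference immediately, at the cost of occasionally processing the same frame twice or skipping frames.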

Niran89 · 2019-09-04T02:32:39Z

Hi @naisy ,

Thank you once again...

As suggested, I ran the algorithm with a stream from a USB camera, but I still see a 10 fps drop. The configuration I used is:

Hardware : Xavier
code : run_stream.py
width : 1280
height : 720
visualize : True
force_gpu_compatible : False
Max mode : Max-N mode and clock is also set

From your performance table, the FPS for the above configuration is supposed to be 48 on Xavier.
But I got only around 39 fps. Is there anything I am still missing?

Note : I tried with multiple objects in the scene. For any number of objects and even with no objects in the scene, the fps is still 39.
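
To put the gap in per-frame terms, the FPS figures quoted above can be converted to milliseconds per frame. This is plain arithmetic on the two numbers from this thread; `frame_time_ms` is a hypothetical helper name.

```python
def frame_time_ms(fps):
    """Average time per frame in milliseconds for a given FPS."""
    return 1000.0 / fps


expected_fps = 48.0  # from the README performance table
measured_fps = 39.0  # observed with run_stream.py

gap_ms = frame_time_ms(measured_fps) - frame_time_ms(expected_fps)
print("expected: %.1f ms/frame" % frame_time_ms(expected_fps))
print("measured: %.1f ms/frame" % frame_time_ms(measured_fps))
print("extra time per frame: %.1f ms" % gap_ms)
```

That works out to roughly 4.8 ms of extra time per frame. Since the FPS does not change with the number of objects, this overhead appears to be constant per frame.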

Appreciate your help...

Thanks & Regards,
Niran

naisy · 2019-09-08T13:20:42Z

Hi @Niran89,

Can you show tegrastats?

Niran89 · 2019-09-11T02:32:45Z

Hi @naisy ,

Please find attached the tegrastats output recorded while running the object detection code on Xavier. The configuration is the same as above. The recorded FPS is visible in the same image.

Thanks & Regards,
Niran

naisy · 2019-09-15T07:43:23Z

Hi @Niran89,

Looking at the tegrastats results, it seems that both the GPU and the CPU are doing enough work.
But 36.4fps is too slow.

If you are running over the network, it will be slower. For example,
ssh -C -Y ubuntu@xavier_ip_address
Other than that, I don't know why it is so slow, sorry.

Niran89 · 2019-09-16T02:16:09Z

Hi @naisy ,

Thank you for the reply.

I am sure the code is not being run over the network. I am running it directly, as described in your README file.

Anyway, thanks for your support so far. Kindly let me know if you find any pointers regarding the performance drop in the future. Looking forward to it.

Thanks & Regards,
Niran
