
Multiple GPU #63

Open
guods opened this issue Aug 27, 2019 · 8 comments

@guods

guods commented Aug 27, 2019

Thank you for your work, but I have some questions:
How does the generated engine run on multiple graphics cards? How to set GPU Id number ?

@lewes6369
Owner

It should not be the bottleneck for generating the engine. You can save the engine the first time only, and later load it from the engine file.

@guods
Author

guods commented Sep 9, 2019

It is not the bottleneck for generating the engine. After creating the engine file, I want to run it on a specified GPU, so I set the GPU ID with "cudaSetDevice", but it did not work.

@guods
Author

guods commented Sep 9, 2019

Have you ever tried this experiment: for graphics cards with the same architecture, can engine files generated on a lower-end card (1060) be used on a higher-end card (1080)?

@zerollzeng

hi @guods, it's nice to see you again :)
for your first question:
Each ICudaEngine object is bound to a specific GPU when it is instantiated, either by the builder or on deserialization. To select the GPU, use cudaSetDevice() before calling the builder or deserializing the engine. Each IExecutionContext is bound to the same GPU as the engine from which it was created. When calling execute() or enqueue(), ensure that the thread is associated with the correct device by calling cudaSetDevice() if necessary.
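The quoted documentation can be sketched roughly like this (a minimal sketch, not this repo's actual code; the file name `engine.trt` and the `loadEngineOnGpu` helper are my own, and the `deserializeCudaEngine(ptr, size, pluginFactory)` signature is from the TensorRT 5/6-era API):

```cpp
#include <cuda_runtime_api.h>
#include <NvInfer.h>
#include <fstream>
#include <vector>

// Sketch: deserialize an engine so that it is bound to a specific GPU.
nvinfer1::ICudaEngine* loadEngineOnGpu(int gpuId, nvinfer1::ILogger& logger) {
    // Select the device BEFORE deserializing; the resulting ICudaEngine
    // (and every IExecutionContext created from it) is bound to this GPU.
    cudaSetDevice(gpuId);

    std::ifstream file("engine.trt", std::ios::binary);  // assumed file name
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    return runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
}
```

Any thread that later calls `execute()`/`enqueue()` on a context from this engine must also call `cudaSetDevice(gpuId)` first if it might be associated with a different device.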

and for the second question:
I recommend that you don’t, however, if you do, you’ll need to follow these guidelines:
The major, minor, and patch versions of TensorRT must match between systems. This ensures you are picking kernels that are still present and have not undergone certain optimizations or bug fixes that would change their behavior.
The CUDA compute capability major and minor versions must match between systems. This ensures that the same hardware features are present so the kernel will not fail to execute. An example would be mixing cards with different precision capabilities.
The following properties should match between systems:
- Maximum GPU graphics clock speed
- Maximum GPU memory clock speed
- GPU memory bus width
- Total GPU memory
- GPU L2 cache size
- SM processor count
- Asynchronous engine count
If any of the above properties do not match, you will receive the following warning: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
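If you want to compare two machines against that checklist, the CUDA runtime exposes all of these fields in `cudaDeviceProp` (a small standalone sketch; requires the CUDA toolkit to build):

```cpp
#include <cuda_runtime_api.h>
#include <cstdio>

// Print the device properties the warning above refers to,
// so the 1060 and 1080 machines can be compared side by side.
int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0 /* device id */);
    std::printf("compute capability : %d.%d\n", prop.major, prop.minor);
    std::printf("graphics clock kHz : %d\n", prop.clockRate);
    std::printf("memory clock kHz   : %d\n", prop.memoryClockRate);
    std::printf("memory bus width   : %d bits\n", prop.memoryBusWidth);
    std::printf("total memory       : %zu bytes\n", prop.totalGlobalMem);
    std::printf("L2 cache size      : %d bytes\n", prop.l2CacheSize);
    std::printf("SM count           : %d\n", prop.multiProcessorCount);
    std::printf("async engine count : %d\n", prop.asyncEngineCount);
    return 0;
}
```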

@guods
Author

guods commented Sep 10, 2019

Thanks for your reply. I also read those words in the TensorRT documentation. Although the return value of cudaSetDevice() (called before creating the engine) was an error, the engine was still created and gave the correct result, which suggests that cudaSetDevice() has no effect.

@zerollzeng

It should not return an error, emmm... what kind of error did you get?

@guods
Author

guods commented Sep 10, 2019

I caused the error deliberately; I wanted to know whether the engine file would still be generated properly even if I set the device incorrectly. I set it incorrectly and the file was still generated properly. So for creating the engine, cudaSetDevice has no effect.
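One likely explanation (my reading, worth verifying against the CUDA runtime docs): when the ordinal is invalid, `cudaSetDevice` returns `cudaErrorInvalidDevice` and leaves the current device unchanged, i.e. the default GPU 0. The builder then simply runs on GPU 0, so the engine is still produced. Checking the return value makes this visible:

```cpp
#include <cuda_runtime_api.h>
#include <cstdio>

int main() {
    // Deliberately invalid device id, mimicking the experiment above.
    cudaError_t err = cudaSetDevice(99);
    if (err != cudaSuccess) {
        // The current device is still GPU 0 here, so a subsequent engine
        // build would "succeed" — just on the wrong card.
        std::printf("cudaSetDevice failed: %s\n", cudaGetErrorString(err));
        return 1;  // bail out instead of silently building on GPU 0
    }
    return 0;
}
```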

@lewes6369
Owner

I am not sure about your issue. As @zerollzeng said, the engine is not generic across cards of different architectures. Maybe just try setting CUDA_VISIBLE_DEVICES to the graphics card you want to use for creating and deploying the engine.
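For example (the binary name `your_trt_app` is a placeholder for whatever program builds/runs the engine):

```shell
# Expose only physical GPU 1 to the process; inside the process it is
# renumbered to device 0, so the default cudaSetDevice(0) lands on it.
export CUDA_VISIBLE_DEVICES=1
echo "visible devices: $CUDA_VISIBLE_DEVICES"
# ./your_trt_app   # placeholder: builds and runs the engine on that card
```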
