How to use multi gpu? #20

Open
yifan123 opened this issue Jun 22, 2021 · 0 comments

How to use multi gpu?

The most time-consuming part of the code is loading the dataset, and I want to use multiple GPUs to speed it up.

Cuda

Cuda will be used by default if it is available. When training on large portions of the dataset, multiple GPUs is favorable.

SIMULATOR_GPU_IDS: [0,1]  # Each GPU runs NUM_ENVIRONMENTS environments
TORCH_GPU_ID: 0
NUM_ENVIRONMENTS: 1

I followed the instructions in the README. After making that change, the log shows the same batches being trained repeatedly, which seems wrong.

Does the code really support multiple GPUs?

2021-06-22 01:49:53,897 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 118.16s] [EpochTime: 118s] [Loss: 1.7911]
2021-06-22 01:49:54,260 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 118.52s] [EpochTime: 119s] [Loss: 1.7911]                   
2021-06-22 01:49:55,733 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 1.83s] [EpochTime: 120s] [Loss: 1.7777] 
2021-06-22 01:49:56,771 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 2.51s] [EpochTime: 121s] [Loss: 1.7777] 
2021-06-22 01:49:56,820 [Epoch: 1/15] [Batch: 3/6568] [BatchTime: 1.09s] [EpochTime: 121s] [Loss: 1.7504] 
2021-06-22 01:49:57,925 [Epoch: 1/15] [Batch: 3/6568] [BatchTime: 1.15s] [EpochTime: 122s] [Loss: 1.7504] 
2021-06-22 01:50:01,319 [Epoch: 1/15] [Batch: 4/6568] [BatchTime: 4.5s] [EpochTime: 126s] [Loss: 1.6842] 
2021-06-22 01:50:02,464 [Epoch: 1/15] [Batch: 4/6568] [BatchTime: 4.54s] [EpochTime: 127s] [Loss: 1.6842]
2021-06-22 01:50:03,281 [Epoch: 1/15] [Batch: 5/6568] [BatchTime: 1.96s] [EpochTime: 128s] [Loss: 1.5942] 
2021-06-22 01:50:04,933 [Epoch: 1/15] [Batch: 5/6568] [BatchTime: 2.47s] [EpochTime: 129s] [Loss: 1.5942] 
2021-06-22 01:50:08,206 [Epoch: 1/15] [Batch: 6/6568] [BatchTime: 4.92s] [EpochTime: 132s] [Loss: 1.4861] 
2021-06-22 01:50:09,176 [Epoch: 1/15] [Batch: 6/6568] [BatchTime: 4.24s] [EpochTime: 133s] [Loss: 1.4861]
2021-06-22 01:50:12,253 [Epoch: 1/15] [Batch: 7/6568] [BatchTime: 4.05s] [EpochTime: 137s] [Loss: 1.3834]
2021-06-22 01:50:12,607 [Epoch: 1/15] [Batch: 7/6568] [BatchTime: 3.43s] [EpochTime: 137s] [Loss: 1.3834] 
2021-06-22 01:50:13,432 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 114.51s] [EpochTime: 115s] [Loss: 1.7911] 
2021-06-22 01:50:15,151 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 116.23s] [EpochTime: 116s] [Loss: 1.7911] 
2021-06-22 01:50:15,406 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 1.81s] [EpochTime: 116s] [Loss: 1.7777]   
2021-06-22 01:50:15,726 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 116.8s] [EpochTime: 117s] [Loss: 1.7911] 
2021-06-22 01:50:15,832 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 116.91s] [EpochTime: 117s] [Loss: 1.7911]
2021-06-22 01:50:17,043 [Epoch: 1/15] [Batch: 3/6568] [BatchTime: 1.64s] [EpochTime: 118s] [Loss: 1.7504] 
2021-06-22 01:50:17,344 [Epoch: 1/15] [Batch: 8/6568] [BatchTime: 5.09s] [EpochTime: 142s] [Loss: 1.2995] 
2021-06-22 01:50:18,344 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 2.51s] [EpochTime: 119s] [Loss: 1.7777] 
2021-06-22 01:50:18,430 [Epoch: 1/15] [Batch: 8/6568] [BatchTime: 5.81s] [EpochTime: 143s] [Loss: 1.2995] 
2021-06-22 01:50:18,438 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 3.29s] [EpochTime: 120s] [Loss: 1.7777] 
2021-06-22 01:50:18,953 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 3.23s] [EpochTime: 120s] [Loss: 1.7777] 
2021-06-22 01:50:19,627 [Epoch: 1/15] [Batch: 3/6568] [BatchTime: 1.28s] [EpochTime: 121s] [Loss: 1.7504]  
2021-06-22 01:50:19,973 [Epoch: 1/15] [Batch: 3/6568] [BatchTime: 1.53s] [EpochTime: 121s] [Loss: 1.7504]  
2021-06-22 01:50:20,715 [Epoch: 1/15] [Batch: 3/6568] [BatchTime: 1.76s] [EpochTime: 122s] [Loss: 1.7504] 
2021-06-22 01:50:21,385 [Epoch: 1/15] [Batch: 4/6568] [BatchTime: 4.34s] [EpochTime: 122s] [Loss: 1.6842]
2021-06-22 01:50:23,499 [Epoch: 1/15] [Batch: 9/6568] [BatchTime: 6.15s] [EpochTime: 148s] [Loss: 1.2126] 
2021-06-22 01:50:23,876 [Epoch: 1/15] [Batch: 4/6568] [BatchTime: 4.25s] [EpochTime: 125s] [Loss: 1.6842] 
2021-06-22 01:50:25,250 [Epoch: 1/15] [Batch: 9/6568] [BatchTime: 6.82s] [EpochTime: 150s] [Loss: 1.2126]   
2021-06-22 01:50:25,257 [Epoch: 1/15] [Batch: 5/6568] [BatchTime: 3.4s] [EpochTime: 126s] [Loss: 1.5942]  
2021-06-22 01:50:26,637 [Epoch: 1/15] [Batch: 5/6568] [BatchTime: 2.76s] [EpochTime: 128s] [Loss: 1.5942] 
2021-06-22 01:50:26,908 [Epoch: 1/15] [Batch: 10/6568] [BatchTime: 3.41s] [EpochTime: 151s] [Loss: 1.1721]  
2021-06-22 01:50:27,124 [Epoch: 1/15] [Batch: 4/6568] [BatchTime: 7.15s] [EpochTime: 128s] [Loss: 1.6842] 
2021-06-22 01:50:27,997 [Epoch: 1/15] [Batch: 4/6568] [BatchTime: 6.14s] [EpochTime: 129s] [Loss: 1.6842] 
2021-06-22 01:50:28,676 [Epoch: 1/15] [Batch: 10/6568] [BatchTime: 3.42s] [EpochTime: 153s] [Loss: 1.1721] 
2021-06-22 01:50:31,479 [Epoch: 1/15] [Batch: 5/6568] [BatchTime: 2.48s] [EpochTime: 133s] [Loss: 1.5942]  
2021-06-22 01:50:31,666 [Epoch: 1/15] [Batch: 5/6568] [BatchTime: 4.54s] [EpochTime: 133s] [Loss: 1.5942] 
2021-06-22 01:50:31,779 [Epoch: 1/15] [Batch: 6/6568] [BatchTime: 6.52s] [EpochTime: 133s] [Loss: 1.4861]
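
To make the duplication easier to see, here is a minimal sketch that counts how many times each (epoch, batch, loss) combination is reported. The function name `count_batch_reports` and the regular expression are my own assumptions based on the log format above; they are not part of the repository.

```python
import re
from collections import Counter

# Matches lines shaped like the log above, e.g.
# "... [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 118.16s] [EpochTime: 118s] [Loss: 1.7911]"
LINE_RE = re.compile(
    r"\[Epoch: (\d+)/\d+\] \[Batch: (\d+)/\d+\].*\[Loss: ([\d.]+)\]"
)


def count_batch_reports(log_text):
    """Count log lines per (epoch, batch, loss) triple.

    The loss is kept as a string so that identical printed values
    compare equal without floating-point rounding concerns.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            epoch, batch, loss = match.groups()
            counts[(int(epoch), int(batch), loss)] += 1
    return counts


# First four lines of the log above, copied verbatim.
sample = """\
2021-06-22 01:49:53,897 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 118.16s] [EpochTime: 118s] [Loss: 1.7911]
2021-06-22 01:49:54,260 [Epoch: 1/15] [Batch: 1/6568] [BatchTime: 118.52s] [EpochTime: 119s] [Loss: 1.7911]
2021-06-22 01:49:55,733 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 1.83s] [EpochTime: 120s] [Loss: 1.7777]
2021-06-22 01:49:56,771 [Epoch: 1/15] [Batch: 2/6568] [BatchTime: 2.51s] [EpochTime: 121s] [Loss: 1.7777]
"""

counts = count_batch_reports(sample)
for key, n in counts.items():
    if n > 1:
        print(f"epoch={key[0]} batch={key[1]} loss={key[2]} reported {n} times")
```

Any triple reported more than once, with an identical loss, would be consistent with several processes each iterating over the same data rather than a separate shard, though this is only a guess from the log and not something I have confirmed in the code.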