Training efficiency #5

Open
kevin-xuan opened this issue May 22, 2022 · 3 comments

Comments

@kevin-xuan

Hi, I'm interested in your work and appreciate you sharing the source code. I have two questions.
First, when I run the MT10-Conditioned task, each epoch takes about 200 s on average, so all 7500 epochs would take roughly 17 days. I also ran the MT50-Fixed task, where each epoch takes about 2500 s on average. The code uses multiprocessing, and although the policy network and the Q-function network are placed on the GPU, they use only about 1.5 GB of GPU memory. Is this training speed normal?
Second, you use multiprocessing to collect data and perform multi-task learning. How does the multi-task training proceed? Each time, a single state vector and a one-hot task ID vector are fed into the policy network, which suggests a batch size of 1, yet the batch size is defined as 1280. Could you explain the training details?
I look forward to your reply. Thanks!
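
For reference, the wall-clock estimate in the question can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every epoch takes the same time, and the helper names and the 3-day target are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def training_days(seconds_per_epoch: float, num_epochs: int) -> float:
    """Wall-clock days needed if every epoch takes the same time."""
    return seconds_per_epoch * num_epochs / SECONDS_PER_DAY


def seconds_per_epoch_for(target_days: float, num_epochs: int) -> float:
    """Epoch time required to finish num_epochs within target_days."""
    return target_days * SECONDS_PER_DAY / num_epochs


# Figures quoted in the question: 200 s per epoch, 7500 epochs (MT10).
print(f"MT10 at 200 s/epoch: {training_days(200, 7500):.1f} days")  # 17.4 days

# Hypothetical target: what epoch time would a 3-day run need?
print(f"3-day run needs {seconds_per_epoch_for(3, 7500):.2f} s/epoch")  # 34.56 s
```

At 200 s per epoch the run comes to about 17.4 days, close to the estimate in the question.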

@RchalYang
Owner

Hi, as far as I remember, I could train MT10 within 2 to 3 days. MT50 takes a bit longer, around a week, to get the full result.
For the multi-task part, we use multiprocessing, with each task running in its own process. The batch size applies to updating, not to sampling (if I remember correctly).
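
To illustrate the distinction between sampling and updating, here is a minimal single-process sketch. It is not the repository's implementation: the plain loop stands in for the per-task worker processes, and the toy state dimension, the per-task buffers, and the even split of the 1280-sample batch across tasks are all assumptions made for the example.

```python
import random
from collections import deque

NUM_TASKS = 10                       # MT10
BATCH_SIZE = 1280                    # value quoted in the question
PER_TASK = BATCH_SIZE // NUM_TASKS   # assumption: batch split evenly across tasks
STATE_DIM = 4                        # hypothetical toy state size


def one_hot(task_id: int, n: int = NUM_TASKS) -> list:
    """One-hot task ID vector of length n."""
    return [1.0 if i == task_id else 0.0 for i in range(n)]


def collect_step(task_id: int, rng: random.Random) -> list:
    """Sampling: one state plus one task ID at a time ('batch size 1')."""
    state = [rng.random() for _ in range(STATE_DIM)]
    return state + one_hot(task_id)


def sample_update_batch(buffers: list, rng: random.Random) -> list:
    """Updating: draw PER_TASK stored inputs from every task's buffer."""
    batch = []
    for buf in buffers:
        batch.extend(rng.sample(list(buf), PER_TASK))
    return batch


rng = random.Random(0)
buffers = [deque(maxlen=10_000) for _ in range(NUM_TASKS)]

# Stand-in for the worker processes: each task fills its own buffer.
for task_id, buf in enumerate(buffers):
    for _ in range(200):
        buf.append(collect_step(task_id, rng))

batch = sample_update_batch(buffers, rng)
print(len(batch), len(batch[0]))  # 1280 14
```

Each environment step feeds the policy a single input, while each gradient update draws a large batch from the stored data, so the two numbers are not in conflict.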

@kevin-xuan
Author

Thanks for your reply! Could you tell me about the machine you used in the experiments? We use an Intel Xeon 4210R CPU and an RTX 3090 GPU, and most of the time is spent on the CPU rather than the GPU. Could the cause be that my CPU is not fast enough?
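
One quick check for a CPU bottleneck is to compare the number of sampling processes with the number of logical cores. The sketch below assumes one process per task, as described in the reply above (10 for MT10, 50 for MT50). The helper name is hypothetical, and the ratio is only a rough indicator, not a profile of the actual training run.

```python
import os


def workers_per_core(num_workers: int, num_cores: int) -> float:
    """Ratio of sampling processes to logical CPU cores."""
    return num_workers / num_cores


# os.cpu_count() reports logical cores and may return None on some systems.
cores = os.cpu_count() or 1

for name, workers in (("MT10", 10), ("MT50", 50)):
    ratio = workers_per_core(workers, cores)
    if ratio > 1:
        note = "workers may contend for CPU time"
    else:
        note = "each worker can have its own core"
    print(f"{name}: {workers} workers on {cores} cores ({ratio:.2f} per core), {note}")
```

A ratio well above 1 suggests the sampling processes are sharing cores, which would be consistent with the CPU, rather than the GPU, dominating the epoch time.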

@zhangt603

I also encountered this issue. May I ask how you resolved it?
