Do you consider model accuracy issues, and how can I get the total training time of the AI training process? #70
Comments
Thanks for your answers. Yeah, the total time can indeed be obtained by a simple calculation from the time of a single batch.
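For anyone landing here later, the calculation mentioned above can be sketched roughly as follows. This is only a minimal sketch, not anything SimAI or Astra-sim provides: the function name `estimate_total_training_time` and all the numbers are hypothetical, and it assumes the per-batch time reported by the simulator stays constant for the whole run (no warm-up, checkpointing, evaluation, or failures). The number of epochs is an input you have to choose yourself, since the simulator does not predict convergence.

```python
import math


def estimate_total_training_time(step_time_s, dataset_size, global_batch_size, epochs):
    """Extrapolate whole-run time (seconds) from one simulated batch time.

    Assumption: every batch takes the same time as the simulated one.
    """
    # Number of batches needed to see the whole dataset once.
    steps_per_epoch = math.ceil(dataset_size / global_batch_size)
    total_steps = steps_per_epoch * epochs
    return total_steps * step_time_s


if __name__ == "__main__":
    # Hypothetical inputs: 2.5 s per batch taken from a simulator run.
    total_s = estimate_total_training_time(
        step_time_s=2.5,
        dataset_size=1_000_000,
        global_batch_size=512,
        epochs=3,
    )
    print(f"Estimated total training time: {total_s / 3600:.2f} hours")
```

If the workload is defined in tokens or iterations rather than epochs, the same idea applies: multiply the simulated per-batch time by the planned number of batches.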
SimAI is a simulator and does not actually train models on data, so it does not track model convergence. However, some community members are researching how to predict training convergence metrics for large models, and that work may be integrated into SimAI in the future. If you're interested in this area, feel free to reach out for further discussion.
Thanks for your explanation. I've been looking for research on how to predict training convergence metrics for large models, but I haven't found any work on it. Can you point me to some related work or papers?
Hello, thanks for your excellent work on large-scale AI training simulation. I'm curious whether you consider model accuracy issues. Do the parameters you expose for modification have an impact on model accuracy?
You said it can "Evaluate the time consumption of AI tasks". As far as I know, Astra-sim can only report the time of a single batch, not of the whole training process (which depends on the number of epochs, and so on). So I'm also curious how you think about this problem.