I would like to know how clBLAS performs on AMD embedded APUs such as the RX-416GD.
I tested clBLAS on an AMD RX-416GD, and sgemm performance only reaches 123 GFLOPS. However, the peak performance of the RX-416GD GPU is 480 GFLOPS.
Is sgemm performance expected to be this low, or did I make a mistake when installing clBLAS?
Thanks
clBLAS is tuned and optimized for discrete GPUs such as the R9 Fury Nano, not for embedded APUs. I do not think you made any mistake in the installation.
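For anyone comparing their own numbers, here is a minimal sketch of how the two figures in this thread relate. It assumes the usual convention of counting 2·M·N·K floating-point operations per sgemm call. The helper names `sgemm_gflops` and `efficiency` are hypothetical and are not part of clBLAS, and the matrix size and timing in the last lines are made-up placeholders, not measurements.

```python
def sgemm_gflops(m, n, k, seconds):
    """Achieved GFLOP/s for one sgemm call, counting 2*m*n*k flops (assumed convention)."""
    return 2.0 * m * n * k / seconds / 1e9


def efficiency(achieved_gflops, peak_gflops):
    """Fraction of theoretical peak that was actually reached."""
    return achieved_gflops / peak_gflops


# Figures reported in this issue: 123 GFLOPS achieved, 480 GFLOPS peak.
print(f"fraction of peak: {efficiency(123.0, 480.0):.3f}")

# Hypothetical timing, for illustration only: a 4096 x 4096 x 4096 sgemm
# taking 1.2 seconds (placeholder value, not a measurement).
print(f"achieved: {sgemm_gflops(4096, 4096, 4096, 1.2):.1f} GFLOP/s")
```

With the numbers reported above, the ratio comes out to roughly a quarter of peak, which fits a library tuned for a different class of GPU rather than a broken installation.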