Generating high-performance OpenCL kernels #2

Open
ghost opened this issue Jun 20, 2017 · 2 comments


ghost commented Jun 20, 2017

Dear Lift-developers,
I would like to gain experience with Lift. I was able to install Lift by following the instructions at http://lift-project.readthedocs.io/en/latest/. All unit tests were successful. I have now written a couple of high-level expressions but don't know how to generate high-performance OpenCL kernels from them. I couldn't find any documentation on this and would greatly appreciate your support.


ghost commented Jul 3, 2017

I noticed you extended the documentation and now provide information on how to generate and execute kernels. Thank you! I tried the steps and have the following questions:

  1. The provided steps seem to be specific to matrix multiplication and a few other BLAS routines. How can Lift be used for arbitrary programs (“high-level expressions” in Lift terminology) that are not BLAS routines?
  2. I ran into an issue with the time required for executing the generated kernels. Executing the last command
    for i in `seq 1 250`; do find . -mindepth 1 -type d -exec sh -c '(cd {} && timeout 5m harness_mm -k 1024 -n 1024 -m 1024 --transpose-A -d $DEVICE -p $PLATFORM)' ';'; done
    took more than 48 hours, which exceeds the time available to me on the cluster I am using. Is there a way to break this step down into smaller tasks that can be executed successively?

Many thanks in advance.

@tremmelg
Member

Regarding question 2: the command just finds all the subfolders containing OpenCL kernels and executes them. You can either split the subfolders across several folders and run them one by one, or simply run the command again, as it will skip kernels it has already run.
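
One possible way to do the splitting without moving any folders is to select a fixed-size slice of the subfolder list per cluster job. The following is only a sketch under stated assumptions: `list_batch` and `run_batch` are hypothetical helper names that are not part of Lift, the batch size and index are values you choose yourself, and the `harness_mm` invocation is copied unchanged from the command quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lift): print the kernel subfolders that
# belong to one batch.
# usage: list_batch ROOT BATCH_SIZE BATCH_INDEX   (BATCH_INDEX starts at 0)
list_batch() {
    root=$1
    size=$2
    index=$3
    first=$(( $index * $size + 1 ))
    last=$(( $first + $size - 1 ))
    # Sort so that every invocation sees the same order and batches stay stable.
    find "$root" -mindepth 1 -type d | sort | sed -n "${first},${last}p"
}

# Hypothetical helper: run the harness once in every folder of the chosen batch.
# The harness_mm arguments are the ones from the original command; DEVICE and
# PLATFORM are assumed to be set in the environment.
run_batch() {
    list_batch "$1" "$2" "$3" | while IFS= read -r dir; do
        (cd "$dir" && timeout 5m harness_mm -k 1024 -n 1024 -m 1024 --transpose-A -d "$DEVICE" -p "$PLATFORM")
    done
}
```

For example, `run_batch . 50 0` would process the first 50 subfolders and `run_batch . 50 1` the next 50, so each call can be submitted as a separate job. The outer `for i in` loop from the original command can be wrapped around `run_batch` if repeated runs are needed.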

michel-steuwer pushed a commit that referenced this issue Jun 5, 2018