llama cpp build update to correct HIP backend #206
Comments
Thanks for noticing this, I am fixing it now and also updating to the latest llama.cpp while at it. Another thing that would probably be good to do is to check that the selected list of GPUs contains only APUs and no discrete GPUs.
Should now be fixed; I checked that the code in the CUDA files now goes into the HIP-specific if-blocks.
Yes, it is fine now. Thanks. Regarding UMA, yes, it would definitely be good to do that too; let me know and I can take a crack at it. Again, this is an awesome builder, many thanks. I have two other suggestions which could be features on their own.
I think the current APU GPU list is: gfx1135, gfx1036, gfx1103, gfx1150 and gfx1151
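A minimal sketch of what such an APU-only check could look like as a bash helper, using the gfx list from the comment above; the SELECTED_GPU_TARGETS variable and the function name are assumptions for illustration, not actual babs.sh code:

```bash
#!/bin/bash
# Hypothetical helper: warn if the selected gfx targets contain anything
# that is not in the APU list quoted in the discussion above.
APU_TARGETS="gfx1135 gfx1036 gfx1103 gfx1150 gfx1151"   # list as given above

check_apu_only() {
    # SELECTED_GPU_TARGETS is an assumed space-separated list, e.g. "gfx1103 gfx1030"
    local target
    for target in ${SELECTED_GPU_TARGETS}; do
        case " ${APU_TARGETS} " in
            *" ${target} "*) ;;   # target is an APU, ok
            *) echo "warning: ${target} is not an APU target" >&2; return 1 ;;
        esac
    done
    return 0
}

# Example usage (values are placeholders):
SELECTED_GPU_TARGETS="gfx1103 gfx1030"
check_apu_only || echo "discrete GPU selected together with APU-only options"
```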
@karthikbabuks I integrated llama-cpp-python; you should be able to get it now with ./babs.sh -up. I created a separate issue for InstructLab with manual build instructions in #207
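For reference, the update flow could look roughly like this; ./babs.sh -up comes from the comment above, while the follow-up rebuild option is an assumption about how the builder is usually re-run (check the builder's help output for the exact option):

```bash
# Fetch the updated build configuration (command quoted from the comment above).
./babs.sh -up

# Rebuild so the new llama-cpp-python / llama.cpp build definitions take effect.
# The -b option is an assumption here; verify against ./babs.sh help.
./babs.sh -b
```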
@lamikr Thank you very much. I will check on this. LLAM_CPP_LIB_PATH is used only at runtime and is not required during the build. Thanks for considering InstructLab; let me also check and share my experience there. For UMA, I will check on that too. Regarding Ollama, I have only used it briefly, but I know quite a few people who use it. It is easy to get up and running, so at the moment I would say it is good to have but not a priority unless someone is asking for it.
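A hypothetical illustration of the runtime-only use of LLAM_CPP_LIB_PATH mentioned above; the install path and the launched script are placeholders, and the variable's exact semantics should be checked against the builder's env files:

```bash
# Not needed while building llama.cpp; only exported when launching software
# that loads the llama.cpp shared library at runtime.
export LLAM_CPP_LIB_PATH=/opt/rocm_sdk/lib   # placeholder install prefix

# Launch an application that picks the library up from the environment
# (script name is hypothetical).
python3 run_local_llm.py
```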
I added LLAM_CPP_LIB_PATH now to binfo/env/env_rocm_template.sh, which is used by the builder. You should now be able to get it with ./babs.sh -up. I tested that it now works, at least for launching llama.cpp in OpenAI-compatible mode with the deepseek-r1 model. I also added "open webui" and configured it to connect to this local llama.cpp server instance instead of OpenAI's service, and it worked well. There are also some other small updates; for example, I updated Python 3.11 to its latest version. The UMA thing would be really great; I hope to get the release out soon and then jump to porting to a newer version of the rocm base. I put some discussion about that in #208
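A rough sketch of the setup described above: llama.cpp's bundled server exposes an OpenAI-compatible API that Open WebUI can be pointed at. The model path, port, and offload count below are placeholders rather than values from this project:

```bash
# Start the llama.cpp server; it serves an OpenAI-compatible API under /v1.
# Model path, port, and -ngl value are placeholders.
./llama-server -m /models/deepseek-r1-distill-qwen.gguf \
    --host 127.0.0.1 --port 8080 -ngl 99

# In Open WebUI, add an "OpenAI API" connection with base URL
#   http://127.0.0.1:8080/v1
# and any dummy API key; a local llama.cpp server does not require one by default.
```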
llama.cpp has updated the HIP backend build flag from "-DGGML_HIPBLAS=1" to "-DGGML_HIP=ON". As a result, llama_cpp gets built with only the CPU backend and does not leverage the ROCm GPU backend. This needs to be updated in the build info of llama_cpp.
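For context, a sketch of a llama.cpp HIP build using the newer flag, loosely following llama.cpp's upstream HIP build instructions; the gfx target is an example and should match the actual APU being built for:

```bash
# Old style (no longer enables the ROCm/HIP backend in current llama.cpp):
#   cmake -B build -DGGML_HIPBLAS=1 ...

# Newer style: -DGGML_HIP=ON selects the HIP backend.
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build \
          -DGGML_HIP=ON \
          -DAMDGPU_TARGETS=gfx1103 \
          -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j"$(nproc)"
```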