Replies: 1 comment
- Found a thread on Reddit with answers to my questions. Also tried turning off the Efficient cores and Hyper-Threading – no change.
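A related experiment that doesn't require BIOS changes is pinning the inference process to a subset of logical CPUs (e.g. only the P-cores). A minimal sketch on Linux; which CPU ids map to performance cores is machine-specific (check `lscpu --extended`), so the "first half" choice below is an illustrative assumption:

```python
import os

# os.sched_setaffinity is Linux-only; skip gracefully elsewhere.
if hasattr(os, "sched_setaffinity"):
    avail = sorted(os.sched_getaffinity(0))         # CPUs we may run on now
    subset = set(avail[: max(1, len(avail) // 2)])  # e.g. first half of them
    os.sched_setaffinity(0, subset)                 # pin before spawning threads
    print("pinned to CPUs:", sorted(subset))
```

Launching the binary under `taskset -c <cpu-list> ./bin/starcoder ...` achieves the same effect without modifying any code.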
- Been running inference for `gpt_bigcode-santacoder-ggml.bin` (q4_1) on two systems. Tried different numbers of threads with `./bin/starcoder -m ../models/bigcode/gpt_bigcode-santacoder-ggml.bin -p "def fibonnaci(" --top_k 0 --top_p 0.95 --temp 0.2 -t N`; here is the comparison. I also ran `mbw` tests on both systems. Questions: Why is my Mac so much faster? Is it due to the memory bandwidth difference (200 GB/s on the Mac vs 76.8 GB/s on the Intel machine)? Is there any way to boost inference speed on Intel?
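The bandwidth hypothesis can be sanity-checked with a back-of-the-envelope bound: single-batch token generation has to stream the full quantized weight set from memory for every token, so tokens/s is capped at roughly bandwidth divided by model size. A rough sketch — the parameter count and bytes-per-weight figures below are approximations for illustration, not measured values:

```python
def max_tokens_per_sec(bandwidth_gb_s: float, n_params: float,
                       bytes_per_weight: float) -> float:
    """Upper bound on single-batch decode speed for a memory-bound model:
    each generated token reads roughly the whole weight set from RAM once."""
    model_bytes = n_params * bytes_per_weight
    return bandwidth_gb_s * 1e9 / model_bytes

# Approximate figures: SantaCoder ~1.1e9 params; q4_1 stores ~5 bits per
# weight (4-bit values plus per-block scale/min), i.e. ~0.625 bytes/weight.
for name, bw in [("Apple M-series (200 GB/s)", 200.0),
                 ("Intel DDR (76.8 GB/s)", 76.8)]:
    print(f"{name}: <= {max_tokens_per_sec(bw, 1.1e9, 0.625):.0f} tok/s")
```

Since both bounds scale linearly with bandwidth, the ~2.6x bandwidth ratio between the two machines would by itself predict a similar gap in observed tokens/s, which is typical for quantized single-stream inference.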