
Releases: 3Simplex/llama.cpp

b4007

01 Nov 16:31
d865d14
server : fix smart selection of available slot (#10120)

* Fix smart selection of available slot

* minor fix

* replace vectors of tokens with shorthands

b3987

28 Oct 22:35
61715d5
llama : Add IBM granite template (#10013)

* Add granite template to llama.cpp

* Add granite template to test-chat-template.cpp

* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen

* Update tests/test-chat-template.cpp

Co-authored-by: Xuan Son Nguyen

* Added proper template and expected output

* Small change to \n


* Add code space &

Co-authored-by: Xuan Son Nguyen

* Fix spacing

* Apply suggestions from code review

* Update src/llama.cpp

---------

Co-authored-by: Xuan Son Nguyen

b3959

22 Oct 13:32
c421ac0
lora : warn user if new token is added in the adapter (#9948)

b3949

21 Oct 13:46
d5ebd79
rpc : pack only RPC structs (#9959)

b3943

20 Oct 14:00
cda0e4b
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)

* refactor llama_batch_get_one

* adapt all examples

* fix simple.cpp

* fix llama_bench

* fix

* fix context shifting

* free batch before return

* use common_batch_add, reuse llama_batch in loop

* null terminated seq_id list

* fix save-load-state example

* fix perplexity

* correct token pos in llama_batch_allocr

b3942

18 Oct 15:24
afd9909
rpc : backend refactoring (#9912)

* rpc : refactor backend

Use structs for RPC request/response messages

* rpc : refactor server

b3895

07 Oct 18:13
f1af42f
Update building for Android (#9672)

* docs : clarify building Android on Termux

* docs : update building Android on Termux

* docs : add cross-compiling for Android

* cmake : link dl explicitly for Android

b3855

01 Oct 14:03
a90484c
llama : print correct model type for Llama 3.2 1B and 3B

b3711

09 Sep 13:47
8e6e2fb
CUDA: fix variable name conflict for Windows build (#9382)

b3660

03 Sep 17:50
b69a480
readme : refactor API section + remove old hot topics