server: bench: init
phymbert committed Mar 24, 2024
1 parent d67102c commit 54818d4
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions .github/workflows/bench.yml
@@ -55,7 +55,7 @@ jobs:
-DLLAMA_CUBLAS=ON \
-DCUDAToolkit_ROOT=/usr/local/cuda \
-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
- -DCMAKE_CUDA_ARCHITECTURES=80 \
+ -DCMAKE_CUDA_ARCHITECTURES=75 \
-DLLAMA_FATAL_WARNINGS=OFF \
-DLLAMA_ALL_WARNINGS=OFF \
-DCMAKE_BUILD_TYPE=Release;
@@ -77,7 +77,7 @@ jobs:
id: server_bench
run: |
build/bin/server \
- --host localhost \
+ --host 0.0.0.0 \
--port 8080 \
--hf-repo ggml-org/models \
--hf-file phi-2/ggml-model-q4_0.gguf \
@@ -95,5 +95,9 @@ jobs:
sleep 0.1
done
+ while [[ "$(curl -s -o /dev/null -w ''%{http_code}'' localhost:8080)" != "200" ]]; do
+ sleep 0.5;
+ done
cd examples/server/bench
../../../k6 run script.js --duration 10m --iterations 500 --vus 8
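
Note on the readiness loop in the last hunk: as written, it polls `localhost:8080` until it returns HTTP 200 and has no upper bound, so a server that never comes up would keep the job waiting until the runner's own timeout. A minimal sketch of a bounded variant follows. The helper names `wait_for_ready` and `probe_http_200` and the attempt count are hypothetical and not part of this commit; only the `curl` probe mirrors the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): retry a probe command a bounded
# number of times, sleeping between attempts.
# Usage: wait_for_ready MAX_ATTEMPTS DELAY_SECONDS PROBE_CMD [ARGS...]
wait_for_ready() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Stop as soon as the probe command succeeds.
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  # Probe never succeeded within the allowed attempts.
  return 1
}

# Hypothetical probe: succeeds only when the URL answers with HTTP 200,
# using the same curl options as the workflow step.
probe_http_200() {
  [[ "$(curl -s -o /dev/null -w '%{http_code}' "$1")" == "200" ]]
}

# Example call (assumed values: 120 attempts, 0.5 s apart, about one minute):
#   wait_for_ready 120 0.5 probe_http_200 localhost:8080 || exit 1
```

Passing the probe as a command keeps the retry logic independent of `curl`, so the same helper could wrap any other readiness check.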
