# .env.example
# We'll need to write to disk at some point; this is where we'll store our artifacts.
DATA_PATH=./data
# Aleph VMs are based on customized Linux runtimes that provide access
# to Aleph APIs and quicker initialization times.
# These runtimes exist as SquashFS images on IPFS. We'll need to pull the runtime and build our engine
# within it to ensure compatibility with Aleph.
# In this example, we'll use a runtime based on Debian, which is suitable for running LLM engines.
# TODO: we should be able to find the runtime CID from the ID,
# but for now we'll just hardcode it.
RUNTIMES_DIR_PATH=${DATA_PATH}/runtime
# Identifier of the runtime we'll be using
RUNTIME_ID=bd79839bf96e595a06da5ac0b6ba51dea6f7e2591bb913deccded04d831d29f4
# CID of the runtime we'll be using
# NOTE: we don't strictly need this, but I'm having trouble getting aleph message get to work.
RUNTIME_CID=bafybeiecr7yi5imzzf5oucxqidddl5mg55zjiz2axusnh5vgmacfhrrqoe
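#
# For reference, a rough sketch of how these values might be used to fetch the runtime.
# This assumes the aleph-client CLI is installed and that the public ipfs.io gateway
# serves the image; the actual scripts in this repo may do this differently:
#   aleph message get ${RUNTIME_ID}
#   curl -L "https://ipfs.io/ipfs/${RUNTIME_CID}" -o "${RUNTIMES_DIR_PATH}/${RUNTIME_ID}.squashfs"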
# We'll need to pull a model to our local machine in order to build our VM.
# We'll rely on Hugging Face to source our models.
# In this example we'll deploy the latest Nous Hermes model from Hugging Face.
# NOTE: it's important that this model is compatible with llama.cpp! GGUF models are a good bet.
MODELS_DIR_PATH=${DATA_PATH}/models
MODEL_REPO="NousResearch/Hermes-2-Pro-Mistral-7B-GGUF"
MODEL_FILE="Hermes-2-Pro-Mistral-7B.Q4_K_M.gguf"
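#
# A hedged example of pulling this model locally with the Hugging Face CLI
# (assumes huggingface_hub is installed; the build scripts may handle this step themselves):
#   pip install -U "huggingface_hub[cli]"
#   huggingface-cli download "${MODEL_REPO}" "${MODEL_FILE}" --local-dir "${MODELS_DIR_PATH}"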
# Lastly, we need to specify which engine we want to run within the VM.
# We'll use llama.cpp for this example, building the most up-to-date version from the master branch.
LLM_ENGINE_BUILDS_DIR_PATH=${DATA_PATH}/engine-builds
LLM_ENGINE_REPO_URL=https://github.com/ggerganov/llama.cpp.git
LLM_ENGINE_REPO_VERSION=master
LLM_ENGINE_BUILD_COMMAND='make server CXXFLAGS="-mno-avx512f -mno-avx512pf -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512cd -mno-avx512er -mno-avx512ifma -mno-avx512vbmi"'
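#
# Roughly, a build script might use these variables as follows (a sketch under assumptions,
# not the exact build pipeline in this repo):
#   git clone "${LLM_ENGINE_REPO_URL}" "${LLM_ENGINE_BUILDS_DIR_PATH}/llama.cpp"
#   cd "${LLM_ENGINE_BUILDS_DIR_PATH}/llama.cpp" && git checkout "${LLM_ENGINE_REPO_VERSION}"
#   eval "${LLM_ENGINE_BUILD_COMMAND}"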
# NOTE: we use /opt/code because that is the path where we'll mount our engine in the VM.
# NOTE: the last argument to this command is the path to the model we want to use with the engine. The model will be mounted in the VM as well.
# The model name within the mount will be the same as the model file name -- but you don't need to worry about that.
LLM_ENGINE_RUN_COMMAND='/opt/code/server --host 0.0.0.0 --port 8080 --mlock -cb --log-disable -m'
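#
# A hedged example of what running the engine inside the VM might look like, plus a quick
# smoke test of the llama.cpp server's HTTP API. The model mount path shown here
# (/opt/models) is a placeholder; the actual mount point is set up by the deployment tooling:
#   ${LLM_ENGINE_RUN_COMMAND} /opt/models/${MODEL_FILE}
#   curl http://localhost:8080/completion -d '{"prompt": "Hello", "n_predict": 32}'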