diff --git a/src/docs/get_started.llms.rst b/src/docs/get_started.llms.rst
index 53f0f21..bfde5ad 100644
--- a/src/docs/get_started.llms.rst
+++ b/src/docs/get_started.llms.rst
@@ -30,8 +30,12 @@ After running the above command, user will be prompted with the following:
 
 2. Input the **model path**:
 
-* If user wants to download a model from `HuggingFace `_, the user should provide the repository path from HuggingFace.
+* If the user wants to download a model from `HuggingFace `_, they should provide the repository path or URL from HuggingFace.
 * If the user has the model downloaded locally, then user will be instructed to copy the model and input the name of the model directory.
 
-3. Finally, the user will be prompted to enter **quantization** settings (recommended Q5_K_M or Q4_K_M, etc.). For more details, check `llama.cpp/examples/quantize/quantize.cpp `_.
+3. The user will be asked where to place the quantized model; otherwise it is saved in the directory containing the downloaded model repository.
+
+4. Finally, the user will be prompted to enter **quantization** settings (Q5_K_M or Q4_K_M are recommended). For more details, check `llama.cpp/examples/quantize/quantize.cpp `_.
+
+5. Optionally, the user can run inference on the quantized model at the next prompt. Inference runs on the CPU, so it can take a while if the model is large.
 
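
The interactive prompts described in the patch ultimately drive llama.cpp's quantize tool. A minimal sketch of the equivalent manual invocation is below; the model directory and file names are assumptions for illustration, and `echo` is used so the command is shown rather than executed (drop it to run for real):

```shell
# Hypothetical paths -- substitute your own model directory and file names.
IN=models/my-model/ggml-model-f16.gguf      # converted full-precision model
OUT=models/my-model/ggml-model-Q5_K_M.gguf  # quantized output (step 3 above)
TYPE=Q5_K_M                                 # recommended type (step 4 above)

# llama.cpp's quantize tool takes: <input.gguf> <output.gguf> <type>
echo ./quantize "$IN" "$OUT" "$TYPE"
```

Quantizing to Q5_K_M or Q4_K_M trades a small amount of accuracy for a much smaller file that fits in less RAM, which is why those types are the recommended defaults.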