Commit
Update get_started.llms.rst
sanchitvj authored May 7, 2024
1 parent 9ab317a commit 688bdd1
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions src/docs/get_started.llms.rst
@@ -30,8 +30,12 @@ After running the above command, user will be prompted with the following:

 2. Input the **model path**:
 
-* If user wants to download a model from `HuggingFace <https://huggingface.co/models>`_, the user should provide the repository path from HuggingFace.
+* If user wants to download a model from `HuggingFace <https://huggingface.co/models>`_, the user should provide the repository path or URL from HuggingFace.
 
 * If the user has the model downloaded locally, then user will be instructed to copy the model and input the name of the model directory.
 
-3. Finally, the user will be prompted to enter **quantization** settings (recommended Q5_K_M or Q4_K_M, etc.). For more details, check `llama.cpp/examples/quantize/quantize.cpp <https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp#L19>`_.
+3. The user will be asked where to put the quantized model otherwise it will go in the directory where you downloaded model repository.
+
+4. Finally, the user will be prompted to enter **quantization** settings (recommended Q5_K_M or Q4_K_M, etc.). For more details, check `llama.cpp/examples/quantize/quantize.cpp <https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp#L19>`_.
+
+5. Optionally, user can inference the quantized model with the next prompt. This inference will be on CPU so it takes time if model is large one.
