Commit

Update quantization readme
elpham6 committed Apr 17, 2024
1 parent 027f1c9 commit e1543a4
Showing 1 changed file with 8 additions and 9 deletions.
llm_quantize/README.md (17 changes: 8 additions & 9 deletions)

```diff
@@ -1,15 +1,14 @@
 ## Model Quantization
 
-This module provides interactive way to quantize your model.
-To quantize model:
+This module provides an interactive way to quantize your model.
+To quantize model, run:
 `python -m grag.quantize.quantize`
 
-After running the above command user will be prompted with following:
+After running the above command, user will be prompted with the following:
 
-- Path where user want to clone [llama.cpp](!https://github.com/ggerganov/llama.cpp) repo.
+- Path where user wants to clone [llama.cpp](!https://github.com/ggerganov/llama.cpp) repo
 - If user wants us to download model from [HuggingFace](!https://huggingface.co/models) or user has model downloaded
-locally.
-- For former, user will be prompted to provide repo path from HuggingFace.
-- In case of later, user will be instructed to copy the model and input the name of model directory.
-- Finally, user will be prompted to enter quantization (recommended Q5_K_M or Q4_K_M, etc.). Check
-more [here](!https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp#L19).
+locally
+- For the former, user will be prompted to provide repo path from HuggingFace
+- For the latter, user will be instructed to copy the model and input the name of model directory
+- Finally, user will be prompted to enter quantization (recommended Q5_K_M or Q4_K_M, etc.). For more details, check [here](!https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp#L19).
```
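The prompts described in the updated README (llama.cpp path, model location, quantization type) can be pictured as assembling a single llama.cpp `quantize` invocation. The sketch below is purely illustrative and is not the actual `grag.quantize` implementation; the file names (`ggml-model-f16.gguf`), the binary name, and the list of accepted quantization types are assumptions for demonstration.

```python
from pathlib import Path

# Hypothetical sketch: map the three interactive answers to a llama.cpp
# quantize command line. This is NOT the real grag.quantize code.
SUPPORTED_QUANTS = {"Q4_K_M", "Q5_K_M", "Q8_0"}  # assumed subset of llama.cpp types


def build_quantize_cmd(llama_cpp_dir: str, model_dir: str, quant: str) -> list[str]:
    """Turn the user's answers into an argv-style quantize command."""
    if quant not in SUPPORTED_QUANTS:
        raise ValueError(f"Unsupported quantization type: {quant}")
    src = Path(model_dir) / "ggml-model-f16.gguf"       # assumed converted-model name
    dst = Path(model_dir) / f"ggml-model-{quant}.gguf"  # output named after the quant type
    return [str(Path(llama_cpp_dir) / "quantize"), str(src), str(dst), quant]


cmd = build_quantize_cmd("llama.cpp", "models/my-model", "Q5_K_M")
print(" ".join(cmd))
```

Validating the quantization type up front mirrors the README's recommendation of Q5_K_M or Q4_K_M: rejecting an unknown type before launching the external tool gives the user an immediate, readable error.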
