minor fix
Signed-off-by: Kaihui-intel <[email protected]>
Kaihui-intel committed Sep 30, 2024
1 parent 6304d17 commit 51407fd
Showing 1 changed file with 8 additions and 6 deletions.
14 changes: 8 additions & 6 deletions docs/source/3x/transformers_like_api.md
@@ -21,7 +21,7 @@ Transformers-like API provides a seamless user experience of model compressions
| Intel CPU | &#10004; | &#10004; | &#10004; | &#10004; | &#10004; |
| Intel GPU | &#10004; | stay tuned | stay tuned | &#10004; | &#10004; |

-> Please refer to [weight-only quant document](./PT_WeightOnlyQuant.md) for more details.
+> Please refer to [weight-only quantization document](./PT_WeightOnlyQuant.md) for more details.

## Usage For CPU
@@ -36,7 +36,8 @@ from neural_compressor.transformers import AutoModelForCausalLM, RtnConfig
model_name_or_path = "MODEL_NAME_OR_PATH"
woq_config = RtnConfig(bits=4)
q_model = AutoModelForCausalLM.from_pretrained(
-model_name_or_path, quantization_config=woq_config,
+model_name_or_path,
+quantization_config=woq_config,
)

# AWQ
@@ -57,7 +58,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
woq_config = TeqConfig(bits=4, tokenizer=tokenizer)
q_model = AutoModelForCausalLM.from_pretrained(
model_name_or_path,
-quantization_config=woq_config
+quantization_config=woq_config,
)

# GPTQ
@@ -69,7 +70,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
woq_config = GPTQConfig(bits=4, tokenizer=tokenizer)
woq_model = AutoModelForCausalLM.from_pretrained(
model_name_or_path,
-quantization_config=woq_config
+quantization_config=woq_config,
)

# AutoRound
@@ -81,7 +82,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
woq_config = AutoRoundConfig(bits=4, tokenizer=tokenizer)
woq_model = AutoModelForCausalLM.from_pretrained(
model_name_or_path,
-quantization_config=woq_config
+quantization_config=woq_config,
)

# inference
@@ -104,7 +105,8 @@ from neural_compressor.transformers import AutoModelForCausalLM, RtnConfig
model_name_or_path = "MODEL_NAME_OR_PATH"
woq_config = RtnConfig(bits=4)
q_model = AutoModelForCausalLM.from_pretrained(
-model_name_or_path, quantization_config=woq_config,
+model_name_or_path,
+quantization_config=woq_config,
)

# save quant model
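
The hunks above only show fragments of the documented CPU flow. A minimal end-to-end sketch assembled from those fragments is given below; it assumes the quantized model keeps the standard transformers `generate()` and `save_pretrained()` interface, and the prompt text and `saved_dir` path are placeholders rather than anything taken from the original document.

```python
# Sketch of the CPU usage flow shown in this diff (RTN weight-only quantization,
# inference, then save/reload). generate()/save_pretrained() usage and "saved_dir"
# are assumptions; only the config/from_pretrained calls are taken from the diff.
from transformers import AutoTokenizer
from neural_compressor.transformers import AutoModelForCausalLM, RtnConfig

model_name_or_path = "MODEL_NAME_OR_PATH"

# Round-to-nearest (RTN) 4-bit weight-only quantization, as in the RTN example above.
woq_config = RtnConfig(bits=4)
q_model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    quantization_config=woq_config,
)

# Run inference with the quantized model (standard transformers-style generation).
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = q_model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Save the quantized model, then reload it later without re-quantizing.
q_model.save_pretrained("saved_dir")
reloaded_model = AutoModelForCausalLM.from_pretrained("saved_dir")
```

Per the other hunks, the same pattern applies to the calibration-based algorithms (TEQ, GPTQ, AutoRound): only the config object changes, and the tokenizer is passed into the config, e.g. `GPTQConfig(bits=4, tokenizer=tokenizer)`.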
