Support InternLM3 Dense 8B Model #6640

Merged
6 commits merged into hiyouga:main on Jan 14, 2025

Conversation

@hhaAndroid (Contributor) commented on Jan 14, 2025

Support InternLM3 Dense 8B Model.

Create a new file examples/train_full/internlm3_full_sft.yaml with the following content:

### model
model_name_or_path: internlm/internlm3-8b-instruct
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json  # choices: [ds_z0_config.json, ds_z2_config.json, ds_z3_config.json]

### dataset
dataset: alpaca_en,alpaca_zh
template: intern3
cutoff_len: 4096
max_samples: 10000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/internlm3/full/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-6
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 5000000000

Then launch training:

# single GPU
DISABLE_VERSION_CHECK=1 llamafactory-cli train examples/train_full/internlm3_full_sft.yaml
# single node, multi-GPU (via torchrun)
DISABLE_VERSION_CHECK=1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/internlm3_full_sft.yaml

Note: InternLM3 Dense 8B currently requires transformers==4.47.1, so it is necessary to specify DISABLE_VERSION_CHECK=1.
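
As a quick pre-flight check, here is a minimal sketch (not part of this PR) that verifies the installed transformers version before bypassing LLaMA-Factory's own check with DISABLE_VERSION_CHECK=1:

# Minimal sketch, not part of the PR: fail early if the installed transformers
# version is older than what InternLM3 needs.
import transformers
from packaging import version

if version.parse(transformers.__version__) < version.parse("4.47.1"):
    raise RuntimeError(
        f"InternLM3 requires transformers >= 4.47.1, found {transformers.__version__}"
    )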

@hiyouga (Owner) left a comment

Thanks for the nice integration. We have left a comment about the template class; could you please resolve it?

@hhaAndroid (Contributor, Author) commented on Jan 14, 2025

@hiyouga Hello, why is it no longer necessary to specify efficient_eos=True? And is there a suitable place to inform users that they must use transformers 4.47.1? Thank you.

@hiyouga (Owner) left a comment

LGTM

@hiyouga (Owner) commented on Jan 14, 2025

> @hiyouga Hello, why is it no longer necessary to specify efficient_eos=True? And is there a suitable place to inform users that they must use transformers 4.47.1? Thank you.

@hhaAndroid The refactored template does not need to change the eos token; instead, it uses format_assistant to apply the chat template properly.
To notify users to use the newer version, we can let the framework print a warning message in the patcher, like:

from ..extras.packages import is_transformers_version_greater_than

if getattr(config, "model_type", None) == "internlm3" and not is_transformers_version_greater_than("4.47.1"):
    logger.warning_rank0_once("InternLM3 model requires transformers >= 4.47.1, please upgrade it.")

Or raise an exception:

if getattr(config, "model_type", None) == "internlm3" and not is_transformers_version_greater_than("4.47.1"):
    raise RuntimeError("InternLM3 model requires transformers >= 4.47.1, please upgrade it.")

For reference, this mirrors the existing Qwen-specific patch in the patcher:

if getattr(config, "model_type", None) == "qwen":
    setattr(config, "use_flash_attn", model_args.flash_attn == "fa2")
    for dtype_name, dtype in [("fp16", torch.float16), ("bf16", torch.bfloat16), ("fp32", torch.float32)]:
        setattr(config, dtype_name, model_args.compute_dtype == dtype)
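
For illustration, here is a minimal sketch of a template registration that relies on format_assistant to emit the end-of-turn token; the import paths, slot strings, and special tokens are assumptions for illustration, not the merged intern3 definition:

# Illustrative sketch only, not the merged intern3 template. It shows the idea
# described above: the assistant formatter appends the end-of-turn token itself,
# so the template no longer needs efficient_eos=True.
from llamafactory.data.formatter import StringFormatter
from llamafactory.data.template import _register_template

_register_template(
    name="intern3_sketch",  # hypothetical name to avoid clashing with the real template
    format_user=StringFormatter(slots=["<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n"]),
    format_assistant=StringFormatter(slots=["{{content}}<|im_end|>\n"]),  # end-of-turn token emitted here
    stop_words=["<|im_end|>"],
)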

@hiyouga merged commit deacc00 into hiyouga:main on Jan 14, 2025 (12 checks passed).
@hiyouga added the "solved" label (This problem has been already solved) on Jan 14, 2025.

@sebm123 commented on Jan 14, 2025, quoting the PR description above.

1587causalai pushed a commit to 1587causalai/llama_factory that referenced this pull request on Feb 18, 2025:
* support internlm3

* update

* update

* update

* add hint