Loss does not decrease when fine-tuning either Baichuan 1 or Baichuan 2 with LLaMA-Efficient-Tuning
When fine-tuning baichuan1 and baichuan2 with DeepSpeed on 2× RTX 4090 GPUs, the loss starts at 1.9 and, after a few dozen steps, plateaus around 1.5 and stops decreasing. The parameters are as follows:

    deepspeed --num_gpus 2 --master_port=9901 src/train_bash.py \
        --deepspeed ds_config_stage3.json \
        --stage sft \
        --model_name_or_path /root/autodl-tmp/baichuan-13b-chat/ \
        --do_train \
        --dataset alpaca_gpt4_zh \
        --template baichuan2 \
        --finetuning_type lora \
        --lora_target W_pack \
        --output_dir src/output/sft-0921 \
        --overwrite_cache \
        --per_device_train_batch_size 4 \
        --gradient_accumulation_steps 4 \
        --lr_scheduler_type cosine \
        --logging_steps 10 \
        --save_steps 500 \
        --learning_rate 5e-5 \
        --num_train_epochs 3.0 \
        --plot_loss \
        --bf16

The DeepSpeed configuration file is as follows.
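(The contents of `ds_config_stage3.json` were not included in the original report. For context, a minimal ZeRO stage-3 config consistent with the command above would look something like the sketch below, with batch sizes and bf16 delegated to the HF Trainer via `"auto"`; the reporter's actual file may differ.)

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

With this style of config, the `"auto"` fields are filled in from the `TrainingArguments` passed on the command line, so the effective micro-batch size and gradient accumulation match `--per_device_train_batch_size 4 --gradient_accumulation_steps 4`.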