[WIP]feature(mixtral): support Mixtral-8x7B SFT, Reward, and Alignment #95

haolin-nju · 2024-09-24T05:33:15Z

This PR involves:

Support Mixture-of-Experts (MoE) models and Expert Parallel (EP) in ChatLearn;
Implement an example of Mixtral-8x7B SFT, Reward, RLHF, DPO, and other alignment algorithms.

loofahcus · 2024-10-23T03:26:55Z

examples/megatron/scripts/train_sft_mixtral.sh

+NNODES=$WORLD_SIZE
+
+
+dp=$(($WORLD_SIZE * $GPUS_PER_NODE / $tp / $pp / $ep))


dp is not supposed to be divided by ep here.
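
A minimal sketch of what this comment asks for, assuming (as in Megatron-Core at the time of this PR, to my understanding) that expert-parallel groups are carved out of the data-parallel ranks, so `ep` has to divide `dp` instead of shrinking it. The concrete values below are hypothetical and only illustrate the arithmetic; the divisibility check is an addition for illustration, not part of the reviewed script.

```shell
#!/bin/bash
# Hypothetical example values; in the real script these come from the environment.
WORLD_SIZE=2        # number of nodes
GPUS_PER_NODE=8
tp=2                # tensor parallel size
pp=2                # pipeline parallel size
ep=4                # expert parallel size

total_gpus=$((WORLD_SIZE * GPUS_PER_NODE))

# Data parallel size: divide by tp and pp only, not by ep.
dp=$((total_gpus / tp / pp))

# Assumption: ep groups are formed inside the data-parallel ranks,
# so ep must evenly divide dp.
if [ $((dp % ep)) -ne 0 ]; then
  echo "ep=$ep must divide dp=$dp" >&2
  exit 1
fi

echo "dp=$dp ep=$ep"
```

With these values, 16 GPUs give `dp=4`; dividing by `ep` as well would have produced `dp=1`.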

loofahcus · 2024-10-23T03:37:35Z

examples/megatron/scripts/train_reward_mixtral.sh

+NNODES=$WORLD_SIZE
+
+
+dp=$(($WORLD_SIZE * $GPUS_PER_NODE / $tp / $pp / $ep))


Same here: dp should not be divided by ep.

loofahcus · 2024-10-23T03:43:12Z

examples/megatron/scripts/train_reward_mixtral.sh

+cd ${CHATLEARN}/examples/megatron/alignment/reward
+
+torchrun $DISTRIBUTED_ARGS \
+  finetune_reward.py \


should be entry/train_reward.py

loofahcus · 2024-10-23T03:43:35Z

examples/megatron/scripts/train_sft_mixtral.sh

+cd ${CHATLEARN}/examples/megatron/sft
+
+torchrun $DISTRIBUTED_ARGS \
+  finetune_sft.py \


should be entry/train_sft.py

haolin-nju added 15 commits September 10, 2024 17:28

Support MCore models

eb698e9

Fix Megatron version

b470622

fix pylint

41db596

fix missing import in reward_inference

636141c

fix vllm

004211c

refine hint

1d64707

fix Makefile

b7367a2

Merge branch 'main' into mixtral

a6fac05

Support Mixtral MoE

7ef59ca

fix mixtral

b2e544e

fix model mixtral

1aa2301

Merge branch 'main' into mixtral

5e4643b

align diff to main

19143f1

fix dpo and add test_checkpoint_conversion

20e6272

fix error msg

f68afd1

haolin-nju changed the title ~~[WIP] Feature: Support Mixtral-8x7B SFT, Reward, Alignment~~ [WIP] Feature: Support Mixtral-8x7B SFT, Reward, and Alignment Sep 24, 2024

haolin-nju added 2 commits September 30, 2024 10:54

fix src_gpu to get_or_cache

d7152ac

revert "fix src_gpu to get_or_cache"

a554cf6

haolin-nju changed the title ~~[WIP] Feature: Support Mixtral-8x7B SFT, Reward, and Alignment~~ [WIP][feature] Support Mixtral-8x7B SFT, Reward, and Alignment Oct 22, 2024

loofahcus reviewed Oct 23, 2024

haolin-nju added 7 commits November 5, 2024 16:11

Merge branch 'main' into mixtral

5424d0b

fix comments

caebecf

fix import error

ef30253

Merge branch 'mixtral' of github.com:alibaba/ChatLearn into mixtral

752f5ef

fix diff introduced in merge

7cc7aa9

Merge branch 'main' into mixtral

7420414

expose import error earlier

c027414

haolin-nju added 9 commits November 7, 2024 02:06

fix merge error

77f4230

fix recursion error

2896285

add validate_param_sync option to mixtral models

333e661

fix redundant empty lines

d86136f

Merge branch 'main' into mixtral

4053415

fix scripts

cd00af0

mixtral sft ok

7ed2eff

fix sft

f23ac37

Merge branch 'main' into mixtral

27bdbcc

haolin-nju mentioned this pull request Dec 5, 2024

[BUG] Are conversion and training supported for the DeepSeek V2 (MoE) series? #166

Closed

haolin-nju changed the title ~~[WIP][feature] Support Mixtral-8x7B SFT, Reward, and Alignment~~ [WIP]feature(mixtral): support Mixtral-8x7B SFT, Reward, and Alignment Jan 6, 2025
