
Missing Multimodal Pretraining step #32

Open
shubhamgarg21 opened this issue Mar 12, 2024 · 1 comment

Comments

@shubhamgarg21

Hi,

For the paper https://arxiv.org/pdf/2310.01218.pdf, the following is mentioned in the pretraining section:

For efficiency, we first train SEED-LLaMA using LoRA [32] tuning and together optimize the
parameters of the embedding layer and decoder head layer due to the added visual codes. We then
merge the parameters of LoRA onto the LLM backbone and fine-tune all parameters except for
the embedding layer.

But in the provided training steps, the part about fine-tuning all parameters except the embedding layer is missing.
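
For reference, a minimal sketch of what that missing second stage might look like, assuming a standard Transformers + PEFT LoRA setup (this is not the official SEED-LLaMA script; the model and checkpoint paths below are placeholders):

```python
# Sketch only: merge the stage-1 LoRA weights into the LLM backbone, then
# fine-tune all parameters except the (expanded) input embedding layer.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Stage-1 output: base LLM with visual codes added to the vocabulary,
# plus a trained LoRA adapter (paths are hypothetical).
base = AutoModelForCausalLM.from_pretrained(
    "path/to/llama-with-visual-codes", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "path/to/stage1-lora-checkpoint")

# Merge the LoRA weights into the backbone and drop the adapter wrappers.
model = model.merge_and_unload()

# Stage 2: make every parameter trainable, then freeze the input embeddings.
for param in model.parameters():
    param.requires_grad = True
for param in model.get_input_embeddings().parameters():
    param.requires_grad = False

# `model` can now be handed to a full-parameter fine-tuning loop or Trainer.
```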

zheedong added a commit to KU-AGI/SEED that referenced this issue Mar 13, 2024
@Cerf-Volant425

Same question; could you please add the corresponding script for the fine-tuning part?
Thanks in advance.
