Implement vLLM FSDP LoRA hot-swapping integration #10
base: master
Conversation
Added support for non-FSDP models.
trainer: replaced clip_grad_norm_ with nn.utils.clip_grad_norm_ for LoRA compatibility (see the sketch after these commit notes).
Set model path to local copy of llama-2-7b in example config.
…is method no longer wraps load_model_and_tokenizer)
test_modelling: revised base model fixture scope since torch FSDP wrap is in-place.
launch_benchmark: added confirmation before launching.
…enchmarking
* added changes to implement low CPU mem usage feature
* implemented new Ruff linting changes and ran a fix across files
…s/config.md accordingly.
…ng configs and documentation.
Still need to move barrier logic into _VLLMCallbackWrapper.
Cleanup is required.
…mize changes required in llama_example.py.
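Regarding the gradient-clipping change noted above, here is a minimal sketch of the distinction, assuming a standard PyTorch training loop: FSDP-wrapped modules expose their own clip_grad_norm_ method, whereas a plain (non-FSDP) LoRA model does not, so the generic torch.nn.utils.clip_grad_norm_ utility is used instead. The helper name below is hypothetical.

```python
import torch
import torch.nn as nn


def clip_gradients(model: nn.Module, max_norm: float = 1.0) -> torch.Tensor:
    # Hypothetical helper illustrating the commit above: nn.utils.clip_grad_norm_
    # operates on any iterable of parameters, so it works for both FSDP-wrapped
    # and plain LoRA/PEFT models, while model.clip_grad_norm_ exists only on
    # FSDP-wrapped modules.
    return nn.utils.clip_grad_norm_(model.parameters(), max_norm)
```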
docs/config.md
Outdated
### Sampling during Training

To disable sampling during training, delete the entire "sampling" section.
Are we "deleting" the section or just commenting out?
"Comment out" might be sufficient, as it allows the user to easily re-enabled the sampling engine as needed.
configs/config_gemma.yaml
Outdated
Is this file required to be a part of the main codebase?
That config file was included by mistake. I will delete it from version control.
docs/sampling.md
Outdated
Great writeup!
configs/config_gemma.yaml
Outdated
wandb_config:
  project: vector-lm-verify
  name: benchmark-lora
  # tags: ["20240418-1a-preemption"]
This should be removed.
examples/__init__.py
Outdated
Don't need an init file in examples. It's not part of the package installation.
Sounds good. I have also added some verification logic to ensure that users are invoking the wrapper correctly.
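A minimal sketch of what such a usage check might look like; the function and error message are hypothetical, not the actual implementation:

```python
import torch.distributed as dist


def _verify_wrapper_usage() -> None:
    # Hypothetical guard: the vLLM callback wrapper is only meaningful once the
    # torch.distributed process group exists, so fail fast with a clear message
    # instead of surfacing a confusing downstream error.
    if not (dist.is_available() and dist.is_initialized()):
        raise RuntimeError(
            "The vLLM callback wrapper must be used after "
            "torch.distributed.init_process_group() has been called."
        )
```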
…d importing vLLM when not required. Ruff formatting fixes.
This pull request enables vLLM to run in parallel with VectorLM on the same set of GPUs. It also includes an example of LoRA adapter hot-swapping for tracking the model's behavior during training.
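For readers unfamiliar with the mechanism, here is a sketch of LoRA hot-swapping using vLLM's public multi-LoRA API; the model path, adapter names, and checkpoint cadence are placeholders, and the actual in-process VectorLM/FSDP integration in this pull request differs in its details.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder model path; enable_lora allows per-request LoRA adapters.
llm = LLM(model="/path/to/llama-2-7b", enable_lora=True)
params = SamplingParams(temperature=0.0, max_tokens=64)

# After each training checkpoint, point vLLM at the newly saved LoRA adapter.
# A fresh lora_int_id makes vLLM load the new weights instead of reusing a
# previously registered adapter.
for step in (100, 200, 300):
    adapter_path = f"/checkpoints/lora_step_{step}"  # placeholder path
    outputs = llm.generate(
        ["The capital of Canada is"],
        params,
        lora_request=LoRARequest(f"adapter_step_{step}", step, adapter_path),
    )
    print(step, outputs[0].outputs[0].text)
```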