
Implement vLLM FSDP LoRA hot-swapping integration #10

Open

jacobthebanana wants to merge 92 commits into master.

Conversation

jacobthebanana (Collaborator)

This pull request enables vLLM to run in parallel with VectorLM on the same set of GPUs. It also includes an example of LoRA adapter hot-swapping, for tracking the model's behavior during training.
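
As a rough sketch of the hot-swapping idea (this uses vLLM's public LoRA API and is not necessarily the mechanism in this PR; the model path, adapter path, and adapter ID are placeholders):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Start a vLLM engine with LoRA support enabled.
llm = LLM(model="/models/llama-2-7b", enable_lora=True)
params = SamplingParams(temperature=0.7, max_tokens=64)

# After the trainer saves a new adapter checkpoint, sample through it.
# A fresh lora_int_id makes vLLM load the new weights instead of
# reusing a previously cached adapter.
outputs = llm.generate(
    ["Vector Institute is"],
    params,
    lora_request=LoRARequest("step-100", 1, "/checkpoints/step_100/lora"),
)
print(outputs[0].outputs[0].text)
```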

jacobthebanana and others added 30 commits February 26, 2024 17:07
Added support for non-FSDP models.
trainer: replaced clip_grad_norm_ with nn.utils.clip_grad_norm_ for LoRA compatibility (see the sketch after this commit list).
Set model path to local copy of llama-2-7b in example config.
…is method no longer wraps load_model_and_tokenizer)

test_modelling: revised base model fixture scope since torch FSDP wrap is in-place.
launch_benchmark: added confirmation before launching.
…enchmarking

* added changes to implement the low-CPU-memory-usage feature (see the sketch after this commit list)

* adopted new Ruff linting rules and ran an auto-fix across files
Still need to move barrier logic into _VLLMCallbackWrapper.
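
Two of the commits above lend themselves to short illustrations. For the gradient-clipping change, the likely intent is to clip with the framework-agnostic torch.nn.utils.clip_grad_norm_ rather than an FSDP-specific method; the model and optimizer below are minimal stand-ins, not VectorLM's trainer:

```python
import torch
from torch import nn

# Minimal stand-in for the trainer's model and optimizer (hypothetical).
model = nn.Linear(8, 8)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

loss = model(torch.randn(4, 8)).sum()
loss.backward()

# nn.utils.clip_grad_norm_ simply skips parameters whose .grad is None,
# such as frozen base weights when only LoRA adapters are trained.
nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()
```

For the low-CPU-memory commit, this plausibly refers to Hugging Face's from_pretrained flag of the same name, which streams checkpoint shards in rather than materializing a full extra copy of the weights in host RAM first; the model path is a placeholder:

```python
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage=True initializes the model without allocating a
# second full copy of the weights in host memory while loading.
model = AutoModelForCausalLM.from_pretrained(
    "/models/llama-2-7b",
    low_cpu_mem_usage=True,
)
```
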
docs/config.md (outdated)

### Sampling during Training

To disable sampling during training, delete the entire "sampling" section.
Collaborator:

Are we "deleting" the section or just commenting it out?

Collaborator (Author):

"Comment out" might be sufficient, as it allows the user to easily re-enabled the sampling engine as needed.

Collaborator:

Is this file required to be a part of the main codebase?

Collaborator (Author):

That config file was included by mistake; I will remove it from version control.

docs/sampling.md (outdated)
Collaborator:

Great writeup!

```yaml
wandb_config:
  project: vector-lm-verify
  name: benchmark-lora
  # tags: ["20240418-1a-preemption"]
```
Collaborator:

This should be removed.

Collaborator:

We don't need an `__init__.py` in examples; that directory isn't part of the package installation.

Collaborator (Author):

Sounds good. I have also added some verification logic to ensure that users are invoking the wrapper correctly.
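
As a rough sketch of what such a guard might look like (only the _VLLMCallbackWrapper name appears in this PR; the methods below are hypothetical):

```python
class _VLLMCallbackWrapper:
    """Hypothetical sketch of invocation verification, not the PR's code."""

    def __init__(self) -> None:
        self._engine_initialized = False

    def initialize_engine(self) -> None:
        self._engine_initialized = True

    def get_sampling_engine(self) -> object:
        # Fail loudly if the caller skipped the required setup step.
        if not self._engine_initialized:
            raise RuntimeError(
                "Call initialize_engine() before requesting the sampling "
                "engine; see docs/sampling.md for the expected order."
            )
        return object()  # placeholder for the real engine handle
```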

jacobthebanana marked this pull request as ready for review on June 18, 2024 at 00:53.
jacobthebanana requested a review from adil-a on June 18, 2024 at 00:53.
…d importing vLLM when not required.

Ruff formatting fixes.