
Merge remote-tracking branch 'upstream/main'
dchourasia committed Jun 11, 2024
2 parents c1a387b + fe43108 commit 1368642
Showing 2 changed files with 8 additions and 2 deletions.
7 changes: 6 additions & 1 deletion README.md
@@ -287,6 +287,11 @@ For example for LLaMA model the modules look like:

You can specify attention or linear layers. With the CLI, pass them as `--target_modules "q_proj" "v_proj" "k_proj" "o_proj"` or `--target_modules "all-linear"`.
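
For reference, here is a minimal sketch of how these choices map onto the underlying PEFT `LoraConfig` (illustrative only; the repo's own config dataclasses may wrap this differently, and the rank/alpha/dropout values are just examples):

```python
# Illustrative sketch: how target-module choices translate into a PEFT LoraConfig.
from peft import LoraConfig

# Target specific attention projection layers...
attn_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
)

# ...or target every linear layer with the special "all-linear" value.
all_linear_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
)
```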

#### Recommended target modules per model architecture
As described in section 4.2 of the [LoRA paper](https://arxiv.org/pdf/2106.09685), adapting only the query and value projection matrices achieves reasonable quality with efficient GPU utilization. Hence, when deciding which modules to target with LoRA adapters, we recommend starting with the query and value matrices. The defaults that the PEFT library specifies for popular model architectures in [TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING](https://github.com/huggingface/peft/blob/7b1c08d2b5e13d3c99b7d6ee83eab90e1216d4ba/src/peft/utils/constants.py#L70) are also a good starting point.
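
If you want to inspect those defaults programmatically, a small sketch (assuming a PEFT release that exposes the mapping at the path linked above; keys and values may shift between releases):

```python
# Sketch: look up PEFT's default LoRA target modules for a given architecture.
from peft.utils.constants import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

for arch in ("llama", "gpt_bigcode", "mixtral"):
    defaults = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING.get(arch)
    print(f"{arch}: {defaults}")  # e.g. llama -> ['q_proj', 'v_proj'] in current PEFT
```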

_________________________

### Prompt Tuning:

Set `peft_method` to `'pt'`. You can additionally pass any arguments from [PromptTuningConfig](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/tuning/config/peft_config.py#L63).
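
For orientation, here is a minimal sketch of the corresponding fields on the PEFT library's own `PromptTuningConfig`, which the repo's prompt-tuning config mirrors a subset of (the values and the tokenizer path below are illustrative, not prescriptions):

```python
# Sketch: PEFT PromptTuningConfig fields commonly passed for prompt tuning.
from peft import PromptTuningConfig, PromptTuningInit, TaskType

prompt_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Classify if the tweet is a complaint or not:",
    num_virtual_tokens=8,
    tokenizer_name_or_path="meta-llama/Llama-2-7b-hf",  # hypothetical model path
)
```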
@@ -446,4 +451,4 @@ The above runs several tasks with `hendrycksTest-*` being MMLU.

[Prompt Tuning on Twitter Complaints](examples/prompt_tuning_twitter_complaints/README.md)

A good simple example can be found [here](examples/kfto-kueue-sft-trainer.yaml), which launches a Kubernetes-native `PyTorchJob` using the [Kubeflow Training Operator](https://github.com/kubeflow/training-operator/) with [Kueue](https://github.com/kubernetes-sigs/kueue) for queue management of tuning jobs.
3 changes: 2 additions & 1 deletion examples/kfto-kueue-sft-trainer.yaml
@@ -15,7 +15,8 @@ data:
"gradient_accumulation_steps": 4,
"learning_rate": 1e-05,
"response_template": "\n### Label:",
"dataset_text_field": "output"
"dataset_text_field": "output",
"use_flash_attn": false
}
---
apiVersion: "kubeflow.org/v1"
