LLaMA C++ supports using different LoRA adapters with the same underlying pre-trained model. The following are the relevant `llama-cli` flags; an example invocation follows the list.
- `--lora FNAME`: Apply a LoRA (Low-Rank Adaptation) adapter to the model (implies `--no-mmap`). This allows you to adapt the pretrained model to specific tasks or domains.
- `--lora-base FNAME`: Optional model to use as a base for the layers modified by the LoRA adapter. This flag is used in conjunction with the `--lora` flag, and specifies the base model for the adaptation.
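As a rough sketch, an invocation applying a LoRA adapter might look like the following. The model and adapter paths are placeholders, not real files; the pattern of pointing `--lora-base` at a higher-precision (f16) copy of the base model while `-m` points at a quantized model is one common setup, not a requirement.

```sh
# Hypothetical paths: a quantized base model, an f16 copy of the same model,
# and a LoRA adapter file produced for that model.
./llama-cli \
  -m models/llama-7b/ggml-model-q4_0.gguf \
  --lora lora/my-task-adapter.gguf \
  --lora-base models/llama-7b/ggml-model-f16.gguf \
  -p "Summarize the following text:"
# Note: --lora implies --no-mmap, so the model is fully loaded into memory.
```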
Once we have a LoRA example, we can also add an example of how to control extended context.
Extended Context Size
Some fine-tuned models extend the context length by scaling RoPE. For example, if the original pre-trained model has a context length (max sequence length) of 4096 (4k) and the fine-tuned model has 32k, that is a scaling factor of 8, which should work by setting `--ctx-size` to 32768 (32k) and `--rope-scale` to 8.
- `--rope-scale N`: Where N is the linear scaling factor used by the fine-tuned model.
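Following the 4k-to-32k example above, a sketch invocation could look like this; the model path and prompt are hypothetical placeholders:

```sh
# Fine-tuned model whose context was extended from 4096 to 32768 via linear RoPE scaling.
# Scaling factor: 32768 / 4096 = 8.
./llama-cli \
  -m models/llama-7b-32k.gguf \
  --ctx-size 32768 \
  --rope-scale 8 \
  -p "Long-document prompt goes here"
```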