
LoRA doesn't seem to be faster #322

Closed · Answered by rasbt
TITC asked this question in Q&A
Aug 15, 2024 · 1 comment · 7 replies

@TITC I tested this with the larger 1558M model, and LoRA does seem to be faster with that one: 5.79 min instead of 8.12 min. I updated the table (row 9 vs. row 12):

| #  | Model           | Weights    | Trainable token position | Trainable layers | Context length          | Training acc | Validation acc | Test acc | Training time | CPU/GPU |
|----|-----------------|------------|--------------------------|------------------|-------------------------|--------------|----------------|----------|---------------|---------|
| 9  | gpt2-xl (1558M) | pretrained | last                     | all              | longest train ex. (120) | 100.00%      | 98.66%         | 98.67%   | 8.12 min      | A100    |
| 12 | gpt2-xl (1558M) | pretrained | last                     | LoRA             | longest train ex. (120) | 100.00%      | 98.66%         | 98.33%   | 5.79 min      | A100    |
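
For context, here is a minimal PyTorch sketch of what the "Trainable layers = LoRA" setting amounts to (the `rank`/`alpha` values and the `apply_lora` helper below are illustrative assumptions, not the exact settings behind the timings above): the pretrained weights are frozen, and only the small low-rank matrices A and B are trained.

```python
import torch
import torch.nn as nn


class LoRALayer(nn.Module):
    """Trainable low-rank update: alpha * (x @ A @ B)."""
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        self.A = nn.Parameter(torch.randn(in_dim, rank) / rank**0.5)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))  # zero init: no change at step 0
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * (x @ self.A @ self.B)


class LinearWithLoRA(nn.Module):
    """Frozen pretrained linear layer plus the small trainable LoRA path."""
    def __init__(self, linear, rank=8, alpha=16):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(linear.in_features, linear.out_features, rank, alpha)

    def forward(self, x):
        return self.linear(x) + self.lora(x)


def apply_lora(model, rank=8, alpha=16):
    """Hypothetical helper: freeze all weights, then wrap every nn.Linear.

    After this, only the small A and B matrices receive gradients
    and optimizer state, which is where the per-step savings come from.
    """
    for param in model.parameters():
        param.requires_grad = False
    for name, module in model.named_children():
        if isinstance(module, nn.Linear):
            setattr(model, name, LinearWithLoRA(module, rank, alpha))
        else:
            apply_lora(module, rank, alpha)
```

A plausible reason the speedup only shows up at gpt2-xl scale: with 1558M frozen parameters, skipping their gradient computation and optimizer-state updates saves more per step than the extra LoRA matmuls cost, whereas on smaller models that overhead can cancel out the gains.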
