Reproduce Results #82
Comments
Hi @kalifadan, what GPU did you run the experiments on? It should not be a problem of the number of training epochs, as in our implementation the best model on the validation set is saved. There are many factors that could affect the final performance, such as …
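(For context on "the best model on the validation set is saved": the Trainer options in the configs below are PyTorch Lightning arguments, but the configs set `enable_checkpointing: false` and a `model.save_path`, so the repo presumably handles saving itself. As a generic illustration only, not the repo's implementation, best-on-validation saving in Lightning usually looks like the sketch below; the monitored metric name `valid_acc` is hypothetical.)

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# Keep only the single best checkpoint according to a validation metric,
# so training for more epochs cannot make the *saved* model worse.
best_ckpt = ModelCheckpoint(
    monitor="valid_acc",   # hypothetical metric name logged during validation
    mode="max",            # "max" for accuracy-like metrics, "min" for losses
    save_top_k=1,
)
# trainer = pl.Trainer(callbacks=[best_ckpt], ...)
```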
Thank you for the comment! My GPUs are: … I follow the values in your paper for the … If you can compare it to your values it will help :)
Hi, I ran the experiments on 8 A100 GPUs, setting … If you could fine-tune ESM-2 with the same setting you used for SaProt and find its result is still lower than reported in our paper, I think it is something like a systematic bias?
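(A minimal sketch of the ESM-2 sanity check suggested above, using the public HuggingFace checkpoint `facebook/esm2_t33_650M_UR50D` rather than the repo's own `config_path` loading; the regression head setup is a placeholder for illustration, not the repo's code.)

```python
from transformers import AutoTokenizer, EsmForSequenceClassification

# ESM-2 650M, the amino-acid-only counterpart of SaProt_650M_AF2.
name = "facebook/esm2_t33_650M_UR50D"
tokenizer = AutoTokenizer.from_pretrained(name)
model = EsmForSequenceClassification.from_pretrained(
    name,
    num_labels=1,               # single regression target, e.g. Thermostability
    problem_type="regression",
)

inputs = tokenizer("MKTAYIAKQRQISFVKS", return_tensors="pt")  # toy sequence
prediction = model(**inputs).logits
# Fine-tune with the same LR / effective batch size used for SaProt to compare fairly.
```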
I'm trying now :)
Yes. It is listed in each config file for each task :)
Nice!
Sorry, we no longer have those models saved because it has been too long since the paper was released (around 1.5 years) :(
Hey, I'm trying to reproduce the results of the EC and Thermostability tasks with the following configs, but I'm getting lower results (for example, 0.712 on Thermostability and 0.866 on EC). What could it be? Is the number of epochs too big?
Thank you!!
EC:
```yaml
setting:
  seed: 20000812
  os_environ:
    WANDB_API_KEY: ~
    WANDB_RUN_ID: ~
    CUDA_VISIBLE_DEVICES: 0,1,2,3  # ,4,5,6,7
    MASTER_ADDR: localhost
    MASTER_PORT: 12315
    WORLD_SIZE: 1
    NODE_RANK: 0
  wandb_config:
    project: EC
    name: SaProt_650M_AF2

model:
  model_py_path: saprot/saprot_annotation_model
  kwargs:
    config_path: weights/PLMs/SaProt_650M_AF2
    load_pretrained: True
    anno_type: EC
  lr_scheduler_kwargs:
    last_epoch: -1
    init_lr: 2.0e-5
    on_use: false
  optimizer_kwargs:
    betas: [0.9, 0.98]
    weight_decay: 0.01
  save_path: weights/EC/SaProt_650M_AF2.pt

dataset:
  dataset_py_path: saprot/saprot_annotation_dataset
  dataloader_kwargs:
    batch_size: 4  # 8
    num_workers: 4  # 8
  train_lmdb: LMDB/EC/AF2/foldseek/train
  valid_lmdb: LMDB/EC/AF2/foldseek/valid
  test_lmdb: LMDB/EC/AF2/foldseek/test
  kwargs:
    tokenizer: weights/PLMs/SaProt_650M_AF2
    plddt_threshold: 70

Trainer:
  max_epochs: 100
  log_every_n_steps: 1
  strategy:
    find_unused_parameters: True
  logger: True
  enable_checkpointing: false
  val_check_interval: 0.1
  accelerator: gpu
  devices: 4  # 8
  num_nodes: 1
  accumulate_grad_batches: 4  # 1
  precision: 16
  num_sanity_val_steps: 0
```
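One thing worth checking from this config: with the commented-out defaults (batch_size 8, 8 devices, accumulate_grad_batches 1) and the values actually used here (batch_size 4, 4 devices, accumulate_grad_batches 4), the effective global batch size comes out the same, assuming DDP where each device processes its own per-GPU batch:

```python
# Effective (global) batch size = per-GPU batch_size * devices * accumulate_grad_batches
paper_run = 8 * 8 * 1  # original 8-GPU setting -> 64
this_run  = 4 * 4 * 4  # 4-GPU setting in the config above -> 64
assert paper_run == this_run == 64
```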
Thermostability:
```yaml
setting:
  seed: 20000812
  os_environ:
    WANDB_API_KEY: ~
    WANDB_RUN_ID: ~
    CUDA_VISIBLE_DEVICES: 0,1,2,3  # ,4,5,6,7
    MASTER_ADDR: localhost
    MASTER_PORT: 12315
    WORLD_SIZE: 1
    NODE_RANK: 0
  wandb_config:
    project: Thermostability
    name: SaProt_650M_AF2

model:
  model_py_path: saprot/saprot_regression_model
  kwargs:
    config_path: weights/PLMs/SaProt_650M_AF2
    load_pretrained: True
  lr_scheduler_kwargs:
    last_epoch: -1
    init_lr: 2.0e-5
    on_use: false
  optimizer_kwargs:
    betas: [0.9, 0.98]
    weight_decay: 0.01
  save_path: weights/Thermostability/SaProt_650M_AF2.pt

dataset:
  dataset_py_path: saprot/saprot_regression_dataset
  dataloader_kwargs:
    batch_size: 4  # 8
    num_workers: 4  # 8
  train_lmdb: LMDB/Thermostability/foldseek/train
  valid_lmdb: LMDB/Thermostability/foldseek/valid
  test_lmdb: LMDB/Thermostability/foldseek/test
  kwargs:
    tokenizer: weights/PLMs/SaProt_650M_AF2
    mix_max_norm: [40, 67]
    plddt_threshold: 70

Trainer:
  max_epochs: 200
  log_every_n_steps: 1
  strategy:
    find_unused_parameters: True
  logger: True
  enable_checkpointing: false
  val_check_interval: 0.5
  accelerator: gpu
  devices: 4
  num_nodes: 1
  accumulate_grad_batches: 8
  precision: 16
  num_sanity_val_steps: 0
```
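Regarding the `mix_max_norm: [40, 67]` entry: assuming it gives the lower/upper bounds used to min-max scale the Thermostability regression targets (an assumption based on the key name, not verified against the dataset code), the scaling would look roughly like this:

```python
def min_max_scale(y, lo=40.0, hi=67.0):
    """Map a raw label into [0, 1] using the [40, 67] bounds from the config."""
    return (y - lo) / (hi - lo)

def min_max_unscale(y_scaled, lo=40.0, hi=67.0):
    """Map a model prediction back to the original label range."""
    return y_scaled * (hi - lo) + lo
```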