
Training with custom data, what should I do? #56

Open
ToanICV opened this issue Apr 6, 2024 · 4 comments

Comments

@ToanICV

ToanICV commented Apr 6, 2024

Hi. Thank you for your work and for open-sourcing it. I'm trying to train on a custom dataset from scratch. I have videos, glosses, and texts. I plan to train SingleStream first to establish a baseline. I have trained G2T and the results are good (BLEU-4: 43.76, ROUGE: 70.23). Now I'm training S2G, but the results are poor (WER stays around 95-100 and the loss around 60). I believe I'm missing something. Could you share a roadmap or similar guidance for training on a custom dataset? Thank you so much.
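For context, WER here is the word-level edit distance between the predicted and reference gloss sequences divided by the reference length; a WER near 100 usually means the hypotheses share almost no words with the references (for example, near-empty CTC outputs early in training). A minimal sketch of the metric (the repo's own evaluation code may differ in details):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length, as a percentage."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + (r[i - 1] != h[j - 1]),  # substitution / match
                d[i - 1][j] + 1,                           # deletion
                d[i][j - 1] + 1,                           # insertion
            )
    return 100.0 * d[len(r)][len(h)] / len(r)

print(wer("HELLO SCHOOL GO", "HELLO GO"))  # one deletion out of 3 reference words
```

An empty hypothesis against any non-empty reference gives exactly 100, which is one common signature of a CTC model that has collapsed to predicting only blanks.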

@ToanICV ToanICV changed the title Training with custom data, which should I do? Training with custom data, what should I do? Apr 6, 2024
@ToanICV
Author

ToanICV commented Apr 6, 2024

I only have 2 x P40 24 GB GPUs. Training command:

python3 -m torch.distributed.launch --nproc_per_node 2 --use_env training.py --config experiments/configs/SingleStream/vsl-edu_s2g.yaml

This is my config:

task: S2G
data:
  input_data: videos  # features, gloss
  zip_file: /root/working/vsl_edu_v2_15.zip
  input_streams:
    - rgb
  dataset_name: vsl-edu
  level: word  # word or char
  txt_lowercase: true
  max_sent_length: 400
  train: /root/input/vsl-edu-labels/myVSL_p2_15.train
  dev: /root/input/vsl-edu-labels/myVSL_p2_15.dev
  test: /root/input/vsl-edu-labels/myVSL_p2_15.test
  transform_cfg:
    img_size: 224
    aug_hflip: false
    color_jitter: false
    bottom_area: 0.9
    csl_cut: False
    csl_resize:
      - 224
      - 224
    center_crop: false
    center_crop_size: 270
    randomcrop_threshold: 1
    aspect_ratio_min: 0.75
    aspect_ratio_max: 1.3
    temporal_augmentation:
      tmin: 0.5
      tmax: 1.5
testing:
  cfg:
    recognition:
      beam_size: 5
training:
  random_seed: 44
  overwrite: False
  model_dir: /root/working/TwoStreamNetworkVi/TwoStreamNetwork/experiments/outputs/SingleStream/vsl-edu_s2g
  shuffle: True
  num_workers: 4
  batch_size: 1
  total_epoch: 100
  keep_last_ckpts: 1
  validation:
    unit: epoch
    freq: 1
    cfg:
      recognition:
        beam_size: 1
  optimization:
    optimizer: Adam
    learning_rate:
      default: 5.0e-2
    weight_decay: 0.001
    betas:
      - 0.9
      - 0.998
    scheduler: cosineannealing
    t_max: 40
model:
  RecognitionNetwork:
    GlossTokenizer:
      gloss2id_file: /root/input/vsl-edu-labels/gloss2ids_vsl_p2_15.pkl
    s3d:
      pretrained_ckpt: pretrained_models/s3ds_glosscls_ckpt
      use_block: 4
      freeze_block: 1
    visual_head:
      input_size: 832
      hidden_size: 512
      ff_size: 2048
      pe: True
      ff_kernelsize:
        - 3
        - 3
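For reference on the optimization block above: `scheduler: cosineannealing` with `t_max: 40` typically decays the learning rate along a half cosine, from the base value at epoch 0 toward (near) zero at epoch 40. A minimal sketch of the standard closed-form schedule, assuming a minimum learning rate of 0 (how the repo actually wires these fields should be checked in its training code):

```python
import math

def cosine_annealing_lr(base_lr: float, epoch: int, t_max: int, eta_min: float = 0.0) -> float:
    """Standard cosine-annealing schedule: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2

print(cosine_annealing_lr(5.0e-2, 0, 40))   # full base learning rate at the start
print(cosine_annealing_lr(5.0e-2, 40, 40))  # decays toward ~0 by epoch 40
```

Note that with `total_epoch: 100` but `t_max: 40`, the behavior after epoch 40 depends on the scheduler implementation (a plain cosine restarts or oscillates, while some wrappers clamp at the minimum), which is worth verifying.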

@HebaRaslan

Peace be upon you.
How did you adapt your custom data for the G2T task? Can you tell me the roadmap for it?

@len2618187

Have you found a solution?

@2000ZRL
Collaborator

2000ZRL commented Jun 7, 2024

Well, it is difficult to debug given just the configs... Basically, to train on your custom dataset, you only need to use your own split files (xxx.train, xxx.dev, xxx.test) and modify the dataloader if necessary. For the S2G task, you only need to make sure the video-gloss pairs are correct.
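Building on that advice, one quick sanity check is to confirm that every gloss in the split files also exists in the `gloss2ids` vocabulary, since out-of-vocabulary glosses become unlearnable targets. A minimal sketch, assuming the split is a list of dicts with a space-separated `gloss` field and the vocabulary is a gloss-to-id dict (verify against the actual layout of your pickle files):

```python
import pickle
from collections import Counter

def missing_glosses(samples, gloss2id):
    """Count glosses that appear in the split but are absent from the vocabulary."""
    missing = Counter()
    for sample in samples:
        for gloss in sample["gloss"].split():
            if gloss not in gloss2id:
                missing[gloss] += 1
    return dict(missing)

# Hypothetical in-memory data; with real files you would instead load them, e.g.
#   samples = pickle.load(open("/root/input/vsl-edu-labels/myVSL_p2_15.train", "rb"))
#   gloss2id = pickle.load(open("/root/input/vsl-edu-labels/gloss2ids_vsl_p2_15.pkl", "rb"))
samples = [
    {"name": "video_001", "gloss": "HELLO SCHOOL GO"},
    {"name": "video_002", "gloss": "TEACHER BOOK READ"},
]
gloss2id = {"HELLO": 0, "SCHOOL": 1, "GO": 2, "TEACHER": 3, "READ": 4}
print(missing_glosses(samples, gloss2id))  # {'BOOK': 1}
```

Running this over all three splits before training also catches mismatched video-gloss pairs indirectly, since a shuffled label file tends to surface as a burst of unexpected out-of-vocabulary tokens.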
