Thanks for your nice work. @yyk-wew
But I have some problems when training this network. I downloaded the FF++ dataset and tried to train on it. Unfortunately, on the raw videos, the test result is only about 0.7. So I have two questions, as follows:
1: I notice that the author set the batch size to 128 and trained for 150k iterations. In my experiment, the batch size is 64 and I train for 40k iterations. Could you offer some details about training? Do I really need to train it for that long?
2: About the dataset. In the original paper, the author mentions that the real videos are augmented four times, and that the face is cropped in each video. But I am not sure whether my own way of processing the videos is right. Could you share some information about this?
Thanks a lot again.
In my experiment on FF++ LQ (c40), the batch size is 32 and the AUC stops increasing after about 20k-40k iterations. I suspect it has to do with CosineAnnealingLR. I didn't use it in my experiments because a bad choice of scheduler parameters hurt performance significantly, and the authors don't seem to give their setting in the paper.
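To see why the scheduler setting matters so much, here is a minimal sketch of the learning-rate curve PyTorch's `CosineAnnealingLR` produces. The base LR, `eta_min`, and both `T_max` values below are illustrative assumptions (the paper gives none of them); the point is only that if `T_max` is much shorter than the actual training length, the LR collapses to `eta_min` early and the model effectively stops learning:

```python
import math

def cosine_annealing_lr(step, base_lr=0.1, eta_min=1e-6, t_max=150_000):
    """LR at a given step under the CosineAnnealingLR formula.

    Matches PyTorch's closed-form schedule:
        eta_min + (base_lr - eta_min) * (1 + cos(pi * step / t_max)) / 2
    All hyperparameter values here are assumptions for illustration.
    """
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max)) / 2

# T_max matched to a 150k-iteration run: at 40k steps the LR is still large.
print(cosine_annealing_lr(40_000, t_max=150_000))  # ≈ 0.083

# Badly mismatched T_max: by 10k steps the LR has already decayed to eta_min,
# so the remaining iterations barely update the weights.
print(cosine_annealing_lr(10_000, t_max=10_000))   # = eta_min = 1e-6
```

So if you do use the scheduler, one reasonable (but unverified) guess is to set `T_max` to the total number of training iterations you actually plan to run.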
For data preprocessing, check this repo, which is the official repo of the FF++ dataset. The authors also describe the data processing pipeline in the paper.
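As a rough sketch of the usual cropping step (not the official FF++ pipeline): run a face detector on each sampled frame, then enlarge the detected box around its center before cropping, so the crop keeps some context around the face. The `scale=1.3` margin and frame size below are assumptions, and the detector itself (dlib, MTCNN, etc.) is up to you:

```python
def enlarge_crop(box, scale=1.3, frame_w=1920, frame_h=1080):
    """Enlarge a face box (x, y, w, h) from any face detector by `scale`
    around its center, clamped to the frame bounds.

    Returns (x0, y0, x1, y1) corner coordinates for slicing the frame.
    The scale factor and default frame size are illustrative assumptions.
    """
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2        # box center
    nw, nh = w * scale, h * scale        # enlarged width/height
    x0 = max(0, int(cx - nw / 2))
    y0 = max(0, int(cy - nh / 2))
    x1 = min(frame_w, int(cx + nw / 2))
    y1 = min(frame_h, int(cy + nh / 2))
    return x0, y0, x1, y1

# Example: a 200x200 detection at (100, 100) becomes a 260x260 crop.
print(enlarge_crop((100, 100, 200, 200)))  # (70, 70, 330, 330)
```

You would then slice the frame as `frame[y0:y1, x0:x1]` and resize to the network's input resolution. Again, the exact margin the authors used is something to confirm against the official repo.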