Performance seems to be very low #33
I think not being able to use AMP (even if you try to use it, you get an error on Albumentations' end) might be hurting both performance and batch size. PyTorch Lightning's manual optimization mode may also just be a lot slower; I need to test that. It also seems to be doing both a generative step and a discriminative step, which could slow things down a bit, but it really shouldn't matter much, since inference is far cheaper than training (no gradients are required). Lastly, speed seems to be similar no matter what model size you use, which suggests a CPU bottleneck; I'm not sure why one would get such a bottleneck with pytorch-lightning. I'll explore a bit and see if I can pin down the exact cause, and otherwise switch to some of the better, more recent methods (DADA, Adv AA, the official Faster AutoAugment, RandAugment search). Maybe I can gain insight into how to train/search policies in general using Albumentations; until now I had to edit each augmentation by hand to try something like RandAugment (varying magnitude). It would be nice to have a policy parser that I can generate programmatically to vary the magnitude (see the sketch below).
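For what it's worth, here is a minimal sketch of what such a programmatically generated policy could look like with plain Albumentations. The `build_policy` helper, the particular transforms, and the way `magnitude` scales each limit are my own assumptions for illustration, not anything AutoAlbument or Albumentations provides out of the box:

```python
import albumentations as A


def build_policy(magnitude: float) -> A.Compose:
    """Drive every transform's strength from a single RandAugment-style magnitude in [0, 1]."""
    m = max(0.0, min(1.0, magnitude))
    return A.Compose(
        [
            # Spatial transforms (also applied to masks when called with image= and mask=).
            A.ShiftScaleRotate(
                shift_limit=0.2 * m, scale_limit=0.3 * m, rotate_limit=int(45 * m), p=0.5
            ),
            # Pixel-level transforms scaled by the same magnitude.
            A.RandomBrightnessContrast(
                brightness_limit=0.5 * m, contrast_limit=0.5 * m, p=0.5
            ),
            A.HueSaturationValue(
                hue_shift_limit=int(30 * m),
                sat_shift_limit=int(40 * m),
                val_shift_limit=int(30 * m),
                p=0.5,
            ),
            A.GaussNoise(var_limit=(0.0, 60.0 * m), p=0.3),
        ]
    )


# Sweep a few magnitudes instead of hand-editing each augmentation.
policies = {m: build_policy(m) for m in (0.1, 0.3, 0.5, 0.9)}
```

Something like this would at least let a single scalar drive a RandAugment-style sweep rather than editing each transform's limits by hand.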
You can reduce the training set size to 4,000 images, as the authors of Faster AA show in their paper.
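In case it's useful, a small sketch of how one could take such a 4,000-image subset with plain PyTorch. The `subsample` helper and the seed are illustrative; with AutoAlbument the subsetting would have to happen inside the generated dataset.py, since the search simply iterates over whatever that dataset exposes:

```python
import random

from torch.utils.data import Subset


def subsample(dataset, num_samples=4000, seed=42):
    """Return a fixed random subset of the full training dataset."""
    rng = random.Random(seed)
    indices = rng.sample(range(len(dataset)), min(num_samples, len(dataset)))
    return Subset(dataset, indices)


# Usage: wrap the full training set before handing it to the search.
# small_train_set = subsample(full_train_set, num_samples=4000)
```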
I'm trying to explore the usage of AutoAlbument for a semantic segmentation task with the default generated search.yaml.
The custom dataset has around 29,000 RGB images and corresponding masks (height x width: 512 x 512). I'm running it on a single A100 GPU with a batch size of 8, which is the maximum I could fit in memory without OOM errors. The GPU seems to be utilized fine; utilization fluctuates between 35% -> 75% -> 99% -> 100%.
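Before changing anything, it might be worth checking whether the time actually goes into GPU compute or into data loading/augmentation on the CPU, as suspected above. A rough sketch along these lines could help; it only times a segmentation model's forward pass (not AutoAlbument's full policy/discriminator steps) and assumes the loader yields (image, mask) tensor batches:

```python
import time

import torch


def profile_batches(loader, model, device="cuda", num_batches=50):
    """Roughly split wall-clock time into data loading vs. GPU compute."""
    model = model.to(device).eval()
    data_time = gpu_time = 0.0
    end = time.perf_counter()
    with torch.no_grad():
        for i, (images, masks) in enumerate(loader):
            # Time spent waiting for the DataLoader (CPU-side decoding + augmentation).
            data_time += time.perf_counter() - end
            images = images.to(device, non_blocking=True)
            start = time.perf_counter()
            model(images)
            if images.is_cuda:
                torch.cuda.synchronize()  # make the GPU timing meaningful
            gpu_time += time.perf_counter() - start
            end = time.perf_counter()
            if i + 1 >= num_batches:
                break
    print(f"data loading: {data_time:.1f}s  gpu compute: {gpu_time:.1f}s over {i + 1} batches")
```

If data loading dominates here, raising num_workers or cheapening the CPU-side augmentation would matter more than anything on the GPU side.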
Issue
Based on the output below, the approximate time required to complete autoalbument-search looks to be close to 5 days (for 20 epochs), which seems too high; it's too expensive to run it for 5 continuous days. Is there a better-optimized way to obtain the augmentation policies generated by AutoAlbument?
Current Output of autoalbument-search:
Segments from search.yaml:
```yaml
architecture: Unet
encoder_architecture: resnet18
pretrained: true

dataloader:
  _target_: torch.utils.data.DataLoader
  batch_size: 8
  shuffle: true
  num_workers: 16
  pin_memory: true
  drop_last: true
```
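For reference, here is roughly what the corresponding dataset.py could look like for this setup, with an optional cap on the number of samples to mirror the 4,000-image suggestion above. The folder layout, the cv2-based loading, and the `max_samples` argument are assumptions about the custom dataset; the only part the search should care about is that `__getitem__` returns an (image, mask) pair after applying `self.transform`:

```python
from pathlib import Path

import cv2
import torch.utils.data


class SearchDataset(torch.utils.data.Dataset):
    """Sketch of a semantic-segmentation dataset for the policy search.

    Assumes images and masks live in parallel folders with matching file names;
    max_samples optionally caps the dataset size to speed up the search.
    """

    def __init__(self, transform=None, images_dir="data/images",
                 masks_dir="data/masks", max_samples=4000):
        self.transform = transform
        self.image_paths = sorted(Path(images_dir).glob("*.png"))[:max_samples]
        self.masks_dir = Path(masks_dir)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        image_path = self.image_paths[index]
        image = cv2.cvtColor(cv2.imread(str(image_path)), cv2.COLOR_BGR2RGB)
        mask = cv2.imread(str(self.masks_dir / image_path.name), cv2.IMREAD_GRAYSCALE)
        if self.transform is not None:
            transformed = self.transform(image=image, mask=mask)
            image, mask = transformed["image"], transformed["mask"]
        return image, mask
```

Combined with a smaller sample count, lowering the epoch count configured in search.yaml (the 20 epochs mentioned above) should cut the wall-clock time roughly proportionally and bring the search well under the 5-day estimate.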