
Performance is not good when using my dataset. #16

bemoregt opened this issue May 18, 2018 · 13 comments

@bemoregt

bemoregt commented May 18, 2018

Hi, @kevinzakka

I fed in my own data in MNIST format (256x256 grayscale images, 5000 images/class), but the performance is not good.

What am I doing wrong?


Epoch: 196/500 - LR: 0.000300
0.8s - loss: 1.055 - acc: 100.000: 100%|█████████| 217/217 [00:00<00:00, 267.85it/s]
train loss: 0.834 - train acc: 62.212 - val loss: 1.192 - val acc: 54.167

Epoch: 197/500 - LR: 0.000300
0.8s - loss: -0.885 - acc: 100.000: 100%|████████| 217/217 [00:00<00:00, 273.63it/s]
train loss: 0.568 - train acc: 60.369 - val loss: 0.844 - val acc: 54.167

Epoch: 198/500 - LR: 0.000300
0.8s - loss: 0.780 - acc: 100.000: 100%|█████████| 217/217 [00:00<00:00, 270.30it/s]
train loss: 0.565 - train acc: 57.604 - val loss: 1.076 - val acc: 50.000

Epoch: 199/500 - LR: 0.000300
0.8s - loss: 3.553 - acc: 0.000: 100%|███████████| 217/217 [00:00<00:00, 271.82it/s]
train loss: 0.678 - train acc: 58.525 - val loss: 0.533 - val acc: 58.333

Epoch: 200/500 - LR: 0.000300
0.8s - loss: 0.116 - acc: 100.000: 100%|█████████| 217/217 [00:00<00:00, 272.74it/s]
train loss: 0.651 - train acc: 58.986 - val loss: 1.418 - val acc: 45.833

Epoch: 201/500 - LR: 0.000300
0.8s - loss: 5.108 - acc: 0.000: 100%|███████████| 217/217 [00:00<00:00, 275.17it/s]
train loss: 0.779 - train acc: 63.594 - val loss: 0.921 - val acc: 62.500

Epoch: 202/500 - LR: 0.000300
0.8s - loss: 1.587 - acc: 0.000: 100%|███████████| 217/217 [00:00<00:00, 270.84it/s]
train loss: 0.830 - train acc: 58.525 - val loss: 0.746 - val acc: 58.333
[!] No improvement in a while, stopping training.

Thanks.

from @bemoregt.

@kevinzakka
Owner

kevinzakka commented May 26, 2018

@bemoregt your image size is way bigger than MNIST (256x256), so there are a few hyperparameters you'd have to tweak to improve performance. For example, you can try increasing the patch size, the number of patches per glimpse, and the number of glimpses taken per image. You could also try increasing the hidden size of the RNN, etc.
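For a sense of scale: the coarsest patch in a glimpse spans patch_size * glimpse_scale**(num_patches - 1) pixels, so settings tuned for 28x28 MNIST sample very little of a 256x256 image. A back-of-the-envelope sketch (glimpse_fov is an illustrative helper, not a function from this repo):

def glimpse_fov(patch_size, glimpse_scale, num_patches):
    """Side length in pixels of the coarsest patch in one glimpse."""
    return patch_size * glimpse_scale ** (num_patches - 1)

print(glimpse_fov(8, 2, 3))    # 32  -> covers a whole 28x28 MNIST digit
print(glimpse_fov(64, 2, 1))   # 64  -> only 1/16 of a 256x256 image's area
print(glimpse_fov(64, 2, 3))   # 256 -> coarsest patch spans the full image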

@bemoregt
Author

bemoregt commented May 28, 2018

Hi, @kevinzakka

I'll try that.
Are there any other options I should tweak for my 256x256 data?

My images are 256x256 grayscale and show thin-line defects in solar cells; I augmented them to 5000 images/class with retinex filtering, unsharp masking, flips/flops, etc.

Thanks in advance.

from @bemoregt.

@duygusar

@bemoregt Hi there, I was wondering whether you have come across any tensor-mismatch problems while training on your dataset. Could you please take a look at issue #19? I'd appreciate it; I cannot trace why it fails.

@bemoregt
Author

bemoregt commented Jun 21, 2018

@kevinzakka @duygusar

No errors occur during training, but RAM's final accuracy is only 87.8% on my data (image classification).

By contrast, a plain supervised CNN reaches a final accuracy of 99.8% on the same dataset.

What am I doing wrong?

[screenshots of the two training runs attached]

Thanks at any rate.

from @bemoregt.

@bemoregt
Author

@kevinzakka @duygusar

My Params here:

# glimpse network params
glimpse_arg = add_argument_group('Glimpse Network Params')
glimpse_arg.add_argument('--patch_size', type=int, default=64,
                         help='size of extracted patch at highest res')
glimpse_arg.add_argument('--glimpse_scale', type=int, default=2,
                         help='scale of successive patches')
glimpse_arg.add_argument('--num_patches', type=int, default=1,
                         help='# of downscaled patches per glimpse')
glimpse_arg.add_argument('--loc_hidden', type=int, default=128,
                         help='hidden size of loc fc')
glimpse_arg.add_argument('--glimpse_hidden', type=int, default=128,
                         help='hidden size of glimpse fc')

# core network params
core_arg = add_argument_group('Core Network Params')
core_arg.add_argument('--num_glimpses', type=int, default=6,
                      help='# of glimpses, i.e. BPTT iterations')
core_arg.add_argument('--hidden_size', type=int, default=256,
                      help='hidden size of rnn')

# reinforce params
reinforce_arg = add_argument_group('Reinforce Params')
reinforce_arg.add_argument('--std', type=float, default=0.17,
                           help='gaussian policy standard deviation')
reinforce_arg.add_argument('--M', type=float, default=10,
                           help='Monte Carlo sampling for valid and test sets')

# data params
data_arg = add_argument_group('Data Params')
data_arg.add_argument('--valid_size', type=float, default=0.1,
                      help='Proportion of training set used for validation')
data_arg.add_argument('--batch_size', type=int, default=32,
                      help='# of images in each batch of data')
data_arg.add_argument('--num_workers', type=int, default=4,
                      help='# of subprocesses to use for data loading')
data_arg.add_argument('--shuffle', type=str2bool, default=True,
                      help='Whether to shuffle the train and valid indices')
data_arg.add_argument('--show_sample', type=str2bool, default=True,
                      help='Whether to visualize a sample grid of the data')

# training params
train_arg = add_argument_group('Training Params')
train_arg.add_argument('--is_train', type=str2bool, default=True,
                       help='Whether to train or test the model')
train_arg.add_argument('--momentum', type=float, default=0.5,
                       help='Nesterov momentum value')
train_arg.add_argument('--epochs', type=int, default=500,
                       help='# of epochs to train for')
train_arg.add_argument('--init_lr', type=float, default=3e-4,
                       help='Initial learning rate value')
train_arg.add_argument('--lr_patience', type=int, default=10,
                       help='Number of epochs to wait before reducing lr')
train_arg.add_argument('--train_patience', type=int, default=90,
                       help='Number of epochs to wait before stopping train')

# other params
misc_arg = add_argument_group('Misc.')
misc_arg.add_argument('--use_gpu', type=str2bool, default=False,
                      help='Whether to run on the GPU')
misc_arg.add_argument('--best', type=str2bool, default=True,
                      help='Load best model or most recent for testing')
misc_arg.add_argument('--random_seed', type=int, default=22,
                      help='Seed to ensure reproducibility')
misc_arg.add_argument('--data_dir', type=str, default='./data',
                      help='Directory in which data is stored')
misc_arg.add_argument('--ckpt_dir', type=str, default='./ckpt',
                      help='Directory in which to save model checkpoints')
misc_arg.add_argument('--logs_dir', type=str, default='./logs/',
                      help='Directory in which Tensorboard logs will be stored')
misc_arg.add_argument('--use_tensorboard', type=str2bool, default=False,
                      help='Whether to use tensorboard for visualization')
misc_arg.add_argument('--resume', type=str2bool, default=False,
                      help='Whether to resume training from checkpoint')
misc_arg.add_argument('--print_freq', type=int, default=10,
                      help='How frequently to print training details')
misc_arg.add_argument('--plot_freq', type=int, default=1,
                      help='How frequently to plot glimpses')
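Note that with these params each glimpse is a single 64x64 crop of a 256x256 image (num_patches=1 means no coarser context patches), so six glimpses can see at most about 37% of the pixels even if they never overlap. A quick sanity check on those numbers:

# Upper bound on how much of the image six 64x64 glimpses can cover.
patch, glimpses, image = 64, 6, 256   # --patch_size, --num_glimpses, image side
coverage = glimpses * patch ** 2 / image ** 2
print(f"{coverage:.1%}")              # 37.5%, assuming zero overlap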

@duygusar

@bemoregt Thank you for sharing your parameters; when/if I can make it work, I will report on the performance. One thing I can think of is that the implementation has room for improvement, e.g. starting glimpses at random locations rather than at the center (if I am not mistaken, this implementation initializes at the center point).

I think @kevinzakka might have a better answer, but the performance could also be related to many things. Do you think your CNN classifier might be overfitting? The attention model might have an effect similar to augmenting your data. Or perhaps it is the nature of your data: attention models seem to work better only in particular cases (for example, they perform better on a noise-added MNIST set than on plain MNIST).
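If someone wants to test the random-start idea, the first fixation is just a tensor of normalized (x, y) coordinates in [-1, 1], so it is a one-line change wherever the initial location is set. A sketch with illustrative names (not the repo's exact variables):

import torch

batch_size = 32

# Center start: every episode begins by looking at the middle of the image
# (what the comment above suspects this implementation does).
l_t = torch.zeros(batch_size, 2)

# Random start: sample the first fixation uniformly over the coordinate grid.
l_t = torch.empty(batch_size, 2).uniform_(-1.0, 1.0)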

@duygusar

duygusar commented Jun 21, 2018

On the other hand, I think my problem might be that my images are 427x240 RGB. I have come across posts about PyTorch having trouble with certain image sizes, where the advice is to use an adaptive average pooling layer rather than plain average pooling; I will try that. Or maybe it has to do with the center initialization or the padding of tensors. Still no idea. @kevinzakka
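For reference, the usual fix is nn.AdaptiveAvgPool2d, which emits a fixed output size regardless of the input resolution, whereas nn.AvgPool2d's output scales with the input. A minimal standalone sketch (not tied to this repo's layers):

import torch
import torch.nn as nn

x_a = torch.randn(1, 3, 240, 427)  # e.g. a 427x240 RGB frame
x_b = torch.randn(1, 3, 256, 256)

# Fixed-kernel pooling: output size depends on input size.
avg = nn.AvgPool2d(kernel_size=2)
print(avg(x_a).shape, avg(x_b).shape)            # (1,3,120,213) vs (1,3,128,128)

# Adaptive pooling: always returns the requested spatial size.
adaptive = nn.AdaptiveAvgPool2d((7, 7))
print(adaptive(x_a).shape, adaptive(x_b).shape)  # both (1,3,7,7)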

@duygusar

duygusar commented Jun 21, 2018

@bemoregt It seems this is not yet a full implementation of the paper; just FYI, as that might be the issue with the performance. It never occurred to me to check the closed issues. I did go through the code, but not in great detail, since the readme gave the impression that it was a full implementation of the paper (it was also featured on the PyTorch blog), and I am new to PyTorch.

#13
#1

@linzhiqiu

linzhiqiu commented Jun 25, 2018

I am also thinking of using this model on my own dataset, but from your discussion it looks like: 1) this implementation is not complete yet, and 2) RAM only works well on certain datasets, so its power is actually quite limited. Am I right?

@duygusar

duygusar commented Jun 25, 2018

@linzhiqiu 1. My comment was not a caution against using this repository; I was just guessing why people have performance issues, and it seems I am not the only one. So yes, I think it would be better if everything were clearer. The repo is a functional, minimal example of the model working on MNIST, minus some aspects (like random search); it also does not yet support batches etc., although it is written in a way that anticipates them (so I suppose that was left undone). So, imho, extending it to my dataset with a close-to-the-paper implementation is not as straightforward as the blog and the readme here made it seem. Alas, this is a good opportunity for people to contribute to the project, but personally I am new to PyTorch.

2. I don't want to generalize, but I would say yes, it depends on the nature of your data. You would be better off using a simple CNN on plain MNIST; but on the cluttered, noise-added, translation-varied version of MNIST, the attention model performs better. It also depends on the problem: for image-to-text problems, attention may be a better fit than it is for plain classification.
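For anyone who wants to reproduce that comparison: a cluttered translated MNIST variant can be synthesized by pasting each 28x28 digit at a random offset on a larger blank canvas and scattering small crops of other digits as clutter. A rough NumPy sketch (canvas size and clutter count are illustrative, not the paper's exact protocol):

import numpy as np

def clutter_translate(digit, others, canvas=60, n_clutter=4, rng=np.random):
    """Place a 28x28 digit at a random offset and add 8x8 clutter patches."""
    img = np.zeros((canvas, canvas), dtype=digit.dtype)
    y, x = rng.randint(0, canvas - 28, size=2)
    img[y:y + 28, x:x + 28] = digit
    for _ in range(n_clutter):
        src = others[rng.randint(len(others))]     # another random digit
        sy, sx = rng.randint(0, 28 - 8, size=2)    # crop source location
        cy, cx = rng.randint(0, canvas - 8, size=2)  # paste target location
        patch = src[sy:sy + 8, sx:sx + 8]
        img[cy:cy + 8, cx:cx + 8] = np.maximum(img[cy:cy + 8, cx:cx + 8], patch)
    return img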

@kevinzakka
Owner

kevinzakka commented Jun 26, 2018

@duygusar @linzhiqiu the way I wrote the repository, you just have to modify the data_loader.py file to make it work for your own needs. The implementation does support batching; in fact, the reported accuracies were achieved with a batch size of 32.

The only current problem with this implementation is that the retina module is not efficient, so the CPU version currently runs faster than the GPU one. I had in mind to write a custom kernel using the new C++ extension but never got the time.
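To make the data_loader.py point concrete, here is a minimal sketch of what a folder-per-class loader could look like; get_train_loader, the transform, and the directory layout are illustrative, not the repo's exact code:

import torch
from torchvision import datasets, transforms

def get_train_loader(data_dir, batch_size=32, num_workers=4):
    # RAM as configured above expects 1-channel input, so force grayscale.
    tfm = transforms.Compose([
        transforms.Grayscale(num_output_channels=1),
        transforms.ToTensor(),
    ])
    # Expects a data_dir/<class_name>/<image>.png layout.
    dataset = datasets.ImageFolder(data_dir, transform=tfm)
    return torch.utils.data.DataLoader(
        dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers
    )

Wired into data_loader.py in place of the MNIST loader, something like this would also sidestep any need to convert images to the MNIST binary format.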

@duygusar

@kevinzakka I did write my own dataloader; I mentioned it here: #19. And I was only referring to your comment about batches in #1. Since the suggested consensus way of handling different image sizes in PyTorch didn't solve my problem, and considering your comment, I assumed it could be the batching or the color channels.

@chatgptcoderhere

[quotes @bemoregt's original post and training log above]

What changes did you make to get this repo working with your custom data? I am having trouble converting my data to MNIST format. Is there an easier way to use this with one's own data?
