This repository has been archived by the owner on Sep 24, 2023. It is now read-only.

Found 2 bugs in declaring hidden sizes #46

Open
wants to merge 78 commits into master

Conversation

davide-giacomini

Both bugs can easily be reproduced by changing the sizes in config.py.

`loc_hidden` and `glimpse_hidden` were swapped when declaring `RecurrentAttention(...)`, and the `input_size` of the `CoreNetwork` was wrong.
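
A minimal sketch of the two fixes, assuming the upstream signatures `RecurrentAttention(g, k, s, c, h_g, h_l, std, hidden_size, num_classes)` and `CoreNetwork(input_size, hidden_size)` and the original model.py/modules.py layout; the concrete values below are illustrative:

```python
from model import RecurrentAttention
from modules import CoreNetwork

# Unequal sizes on purpose, so either bug would now break loudly.
glimpse_hidden, loc_hidden, hidden_size = 128, 64, 256

# Fix 1: h_g must receive glimpse_hidden and h_l must receive loc_hidden;
# the swapped order went unnoticed while both were 128.
model = RecurrentAttention(
    g=8, k=1, s=2, c=1,
    h_g=glimpse_hidden,
    h_l=loc_hidden,
    std=0.17, hidden_size=hidden_size, num_classes=10,
)

# Fix 2: the core RNN consumes g_t, whose size is loc_hidden + glimpse_hidden,
# not hidden_size. The bug was masked because 128 + 128 == 256 == hidden_size.
rnn = CoreNetwork(input_size=loc_hidden + glimpse_hidden,
                  hidden_size=hidden_size)
```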

…zed bits for g_t and h_t

Added to the config files the arguments for the number of quantization bits. NOTE THAT THESE NUMBERS DO NOT AUTOMATICALLY CHANGE THE QUANTIZATION. The quantization is done manually; the arguments are only used to change the filename of the checkpoints.
- Now you can change the quantization bits from the configuration file.
- Created a function named `quantize_tensor(..)` in `utils.py` (see the sketch below).
- Added min{g_t} to the formula, to handle minimum values different from zero.
- Added two Jupyter notebooks for convenience.
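
A plausible reading of `quantize_tensor(..)` as standard affine min-max quantization is sketched below; subtracting the minimum is the "min{g_t} in the formula" change, handling tensors whose minimum is not zero. The exact upstream implementation may differ:

```python
import torch

def quantize_tensor(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Affine min-max quantization sketch (assumed, not the upstream code).

    Maps x onto num_bits unsigned integer levels and back, so the returned
    tensor carries the rounding error of a num_bits representation.
    """
    qmax = 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    # Subtracting x_min handles tensors whose minimum is far from zero;
    # without it the low end of the range quantizes poorly.
    scale = (x_max - x_min).clamp(min=1e-8) / qmax
    q = torch.round((x - x_min) / scale).clamp(0, qmax)
    return q * scale + x_min
```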
Now Relu1 is only used when `h_t` is quantized
Added checkpoints to GitHub
When declaring `RecurrentAttention(..)`, the parameters are swapped. Nobody saw it because `glimpse_hidden` and `loc_hidden` were always the same size (both 128), but if you change the sizes you can see it.

The `input_size` of the `CoreNetwork` passed through from model.py was `hidden_size`, which is not actually the real input size. The real input is `loc_hidden + glimpse_hidden`, and I changed it to that. Nobody saw it until now because 128 + 128 = 256, which is the same as the `hidden_size` of `h_t`.
In this draft, the table already has all the time steps one after the other in a single row, so I go directly to the last column and take the result
…use memory-based inference

The code is not extremely clean, but it works for now...

- Added `--is_train_table` and `--mem_based_inference` to the config (see the sketch below).
- Added `get_train_table_loader()` in `data_loader.py` to avoid changing the code each time; I would probably have made an error otherwise.
- Changed the main to actually check all possible config parameters.
- Added methods in `trainer.py` separate from `test()`, so that the code is isolated, even though it is very similar.
- Added in `utils.py` a class EQUAL to `Retina` but without the denormalization of the location. Man, I don't have time to do it right.
Now I can run everything from the command line without touching the code
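
A minimal sketch of how the two flags might be declared in `config.py`, assuming an argparse-based config like the original repo's; the `str2bool` helper and help strings are illustrative:

```python
import argparse

def str2bool(v: str) -> bool:
    # Common helper in argparse-based configs for boolean flags.
    return str(v).lower() in ("yes", "true", "t", "1")

parser = argparse.ArgumentParser(description="RAM")
parser.add_argument("--is_train_table", type=str2bool, default=False,
                    help="Build the table of per-time-step results on the train set")
parser.add_argument("--mem_based_inference", type=str2bool, default=False,
                    help="Use memory-based inference instead of running the full network")
```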
…ients

There is no longer an option to use other distances or more than 3 coefficients
I didn't quantize `phi` after `foveate` during memory-based inference in the first place
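
The fix presumably amounts to passing the extracted glimpse through the quantizer right after `foveate`; a sketch, where `config.phi_bits` is a hypothetical parameter name:

```python
# Sketch of the fix (assumed): quantize phi immediately after extraction,
# so memory-based inference sees the same quantized values as training.
phi = retina.foveate(x, l_t)
phi = quantize_tensor(phi, num_bits=config.phi_bits)  # phi_bits is hypothetical
```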
The input image, for extracting the patches, was padded by floor(patch_size/2), but that didn't work with odd patch sizes. To support odd patch sizes it must be ceil(patch_size/2)
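
A minimal sketch of the padding change, assuming patch extraction pads the image with `F.pad` as in the original `Retina`; only the `ceil` is the actual fix:

```python
import math
import torch.nn.functional as F

def pad_for_patches(x, patch_size: int):
    # was: pad = patch_size // 2  (floor) -- with an odd patch_size the
    # extraction window can run past the image border by one pixel.
    pad = math.ceil(patch_size / 2)
    return F.pad(x, (pad, pad, pad, pad))
```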