This repository has been archived by the owner on Sep 24, 2023. It is now read-only.

Found 2 bugs in declaring hidden sizes #46

Open
wants to merge 78 commits into master

Conversation

davide-giacomini

Both bugs can easily be reproduced by changing the sizes in config.py.

`loc_hidden` and `glimpse_hidden` were swapped when declaring `RecurrentAttention(...)`, and the `input_size` of the `CoreNetwork` was wrong.
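
A minimal sketch of the two fixes, assuming the upstream signatures `RecurrentAttention(g, k, s, c, h_g, h_l, std, hidden_size, num_classes)` and `CoreNetwork(input_size, hidden_size)` and the original model.py/modules.py layout; the concrete values below are illustrative:

```python
from model import RecurrentAttention
from modules import CoreNetwork

# Unequal sizes on purpose, so either bug would now break loudly.
glimpse_hidden, loc_hidden, hidden_size = 128, 64, 256

# Fix 1: h_g must receive glimpse_hidden and h_l must receive loc_hidden;
# the swapped order went unnoticed while both were 128.
model = RecurrentAttention(
    g=8, k=1, s=2, c=1,
    h_g=glimpse_hidden,
    h_l=loc_hidden,
    std=0.17, hidden_size=hidden_size, num_classes=10,
)

# Fix 2: the core RNN consumes g_t, whose size is loc_hidden + glimpse_hidden,
# not hidden_size. The bug was masked because 128 + 128 == 256 == hidden_size.
rnn = CoreNetwork(input_size=loc_hidden + glimpse_hidden,
                  hidden_size=hidden_size)
```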

…zed bits for g_t and h_t

Added to the config files the arguments for the number of quantization bits. NOTE THAT THESE NUMBERS DO NOT AUTOMATICALLY CHANGE THE QUANTIZATION. The quantization is done manually; the arguments are only used to change the filename of the checkpoints.
- Now you can change the quantization bits from the configuration file.
- Created a function named `quantize_tensor(..)` in `utils.py` (see the sketch below).
- Added min{g_t} to the formula, to handle minimum values different from zero.
- Added two Jupyter notebooks for convenience.
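
A plausible reading of `quantize_tensor(..)` as standard affine min-max quantization is sketched below; subtracting the minimum is the "min{g_t} in the formula" change, handling tensors whose minimum is not zero. The exact upstream implementation may differ:

```python
import torch

def quantize_tensor(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Affine min-max quantization sketch (assumed, not the upstream code).

    Maps x onto num_bits unsigned integer levels and back, so the returned
    tensor carries the rounding error of a num_bits representation.
    """
    qmax = 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    # Subtracting x_min handles tensors whose minimum is far from zero;
    # without it the low end of the range quantizes poorly.
    scale = (x_max - x_min).clamp(min=1e-8) / qmax
    q = torch.round((x - x_min) / scale).clamp(0, qmax)
    return q * scale + x_min
```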
Now Relu1 is only used when `h_t` is quantized
Added checkpoints to GitHub
When declaring `RecurrentAttention(..)`, the parameters are swapped. Nobody saw it because `glimpse_hidden` and `loc_hidden` were always the same size (both 128), but if you change the sizes you can see it.

The `input_size` of the `CoreNetwork` passed through from model.py was `hidden_size`, which is not actually the real input size. The real input is `loc_hidden + glimpse_hidden`, and I changed it to that. Nobody saw it until now because 128 + 128 = 256, which is the same as the `hidden_size` of `h_t`.
In this draft, the table already has all the time steps one after the other in a single row, so I go directly to the last column and take the result
…use memory-based inference

The code is not extremely clean, but it works for now...

- Added `--is_train_table` and `--mem_based_inference` to the config (see the sketch below).
- Added `get_train_table_loader()` in `data_loader.py` to avoid changing the code each time; I would probably have made an error otherwise.
- Changed the main to actually check all possible config parameters.
- Added methods in `trainer.py` separate from `test()`, so that the code is isolated, even though it is very similar.
- Added in `utils.py` a class EQUAL to `Retina` but without the denormalization of the location. Man, I don't have time to do it right.
Now I can run everything from the command line without touching the code
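
A minimal sketch of how the two flags might be declared in `config.py`, assuming an argparse-based config like the original repo's; the `str2bool` helper and help strings are illustrative:

```python
import argparse

def str2bool(v: str) -> bool:
    # Common helper in argparse-based configs for boolean flags.
    return str(v).lower() in ("yes", "true", "t", "1")

parser = argparse.ArgumentParser(description="RAM")
parser.add_argument("--is_train_table", type=str2bool, default=False,
                    help="Build the table of per-time-step results on the train set")
parser.add_argument("--mem_based_inference", type=str2bool, default=False,
                    help="Use memory-based inference instead of running the full network")
```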
…ients

There is no longer an option to use other distances or more than 3 coefficients
I didn't quantize `phi` after `foveate` during memory-based inference in the first place
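
The fix presumably amounts to passing the extracted glimpse through the quantizer right after `foveate`; a sketch, where `config.phi_bits` is a hypothetical parameter name:

```python
# Sketch of the fix (assumed): quantize phi immediately after extraction,
# so memory-based inference sees the same quantized values as training.
phi = retina.foveate(x, l_t)
phi = quantize_tensor(phi, num_bits=config.phi_bits)  # phi_bits is hypothetical
```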
The input image, for extracting the patches, was padded by floor(patch_size/2), but that didn't work with odd patch sizes. To support odd patch sizes it must be ceil(patch_size/2)
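
A minimal sketch of the padding change, assuming patch extraction pads the image with `F.pad` as in the original `Retina`; only the `ceil` is the actual fix:

```python
import math
import torch.nn.functional as F

def pad_for_patches(x, patch_size: int):
    # was: pad = patch_size // 2  (floor) -- with an odd patch_size the
    # extraction window can run past the image border by one pixel.
    pad = math.ceil(patch_size / 2)
    return F.pad(x, (pad, pad, pad, pad))
```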