This repository has been archived by the owner on Sep 24, 2023. It is now read-only.
Found 2 bugs in declaring hidden sizes #46
Open
davide-giacomini wants to merge 78 commits into kevinzakka:master from davide-giacomini:master
Conversation
…zed bits for g_t and h_t. Added the arguments for the number of quantization bits to the config files. NOTE THAT THESE NUMBERS DO NOT AUTOMATICALLY CHANGE THE QUANTIZATION: they are set manually and only affect the filename of the checkpoints.
- Now you can change the quantization bits from the configuration file
- Created a function named `quantize_tensor(..)` in `utils.py`. - Added min{g_t} to the quantization formula to handle minimum values different from zero. - Added two Jupyter notebooks for convenience
Now ReLU1 is only applied when h_t is quantized
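For context, a min-max `quantize_tensor(..)` helper of the kind described above could look like the sketch below. This is an illustrative assumption, not the exact code in `utils.py`; the subtraction of the minimum corresponds to the min{g_t} term mentioned in the commit.

```python
import torch

def quantize_tensor(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniform min-max quantization sketch (assumed signature, not the repo's exact code).

    Maps values into 2**num_bits levels between x.min() and x.max(),
    then de-quantizes back to floats so the rest of the network is unchanged.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = torch.clamp((x_max - x_min) / (qmax - qmin), min=1e-8)  # avoid div-by-zero
    q = torch.clamp(((x - x_min) / scale).round(), qmin, qmax)      # integer levels
    return q * scale + x_min                                        # fake-quantized tensor
```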
Added checkpoints to GitHub
48 epochs done
When declaring RecurrentAttention(..), the glimpse_hidden and loc_hidden parameters are swapped. Nobody noticed because the two sizes were always the same (128 and 128); if you change the sizes, the bug shows up.
The input_size of the CoreNetwork passed from model.py was `hidden_size`, which is not the actual input size. The real input is loc_hidden + glimpse_hidden, so I changed it to that. Nobody noticed until now because 128 + 128 = 256 happens to equal the hidden_size of h_t.
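A minimal, self-contained sketch of the intended wiring for the second fix is below. The class and argument names follow the PR text, the real CoreNetwork in model.py has more structure, and the sizes are made up purely to expose the bug. The first fix is simply passing glimpse_hidden and loc_hidden to RecurrentAttention(...) in the order its signature expects.

```python
import torch
import torch.nn as nn

class CoreNetwork(nn.Module):
    """Simplified stand-in for the core RNN cell: its input is the glimpse
    feature vector g_t, whose size is loc_hidden + glimpse_hidden."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.i2h = nn.Linear(input_size, hidden_size)
        self.h2h = nn.Linear(hidden_size, hidden_size)

    def forward(self, g_t, h_prev):
        return torch.relu(self.i2h(g_t) + self.h2h(h_prev))

# Unequal sizes expose the bug; with 128/128 the old code happened to work
# because 128 + 128 == hidden_size == 256.
glimpse_hidden, loc_hidden, hidden_size = 128, 256, 256

# Fixed wiring: input_size is loc_hidden + glimpse_hidden, not hidden_size.
core = CoreNetwork(input_size=loc_hidden + glimpse_hidden, hidden_size=hidden_size)

g_t = torch.randn(4, loc_hidden + glimpse_hidden)  # concatenated glimpse + location features
h_t = core(g_t, torch.zeros(4, hidden_size))
print(h_t.shape)                                   # torch.Size([4, 256])
```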
In this draft, the table already has all the time steps laid out one after the other in a single row, so I go directly to the last column and take the result (a small sketch of that lookup follows).
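Assuming a table with one row per sample and one column per time step (the file name and exact layout are assumptions), the lookup amounts to:

```python
import pandas as pd

# Hypothetical layout: one row per sample, columns t0, t1, ..., tN left to right.
table = pd.read_csv("train_table.csv")  # file name is an assumption
final_result = table.iloc[:, -1]        # last column = result of the last time step
```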
…use memory-based inference. The code is not extremely clean, but it works for now. - Added --is_train_table and --mem_based_inference to config - Added get_train_table_loader() in data_loader.py to avoid changing the code each time; I would probably have made an error otherwise - Changed the main to actually check every possible config parameter - Added methods in trainer.py separate from test(), so that the code is isolated, even though very similar - Added to utils.py a class EQUAL to Retina but without the denormalization of the location. Man, I don't have time to do it right
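The two new switches could be wired up roughly as follows; this is only a sketch, and the real config.py may declare them differently.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--is_train_table", action="store_true",
                    help="build the per-time-step result table over the training set")
parser.add_argument("--mem_based_inference", action="store_true",
                    help="answer test queries by looking them up in the precomputed table")
config, unparsed = parser.parse_known_args()
```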
Now I can run everything from the command line without touching the code
…ients. The option to use other distances or more than 3 coefficients has been removed
… for output_size_ht
I had not been quantizing phi after foveate during memory-based inference; fixed
For extracting the patches, the input image was padded by floor(patch_size/2), which does not work with odd patch sizes. To support odd patch sizes, the padding must be ceil(patch_size/2)
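A minimal sketch of the padding fix; the function name and tensor layout are assumptions, and only the floor-to-ceil change reflects the commit.

```python
import math
import torch
import torch.nn.functional as F

def pad_for_patches(x: torch.Tensor, patch_size: int) -> torch.Tensor:
    """Pad an image batch (B, C, H, W) so patches centred near the border fit.

    floor(patch_size / 2) is only enough for even patch sizes; for odd sizes
    extraction can run past the edge, so pad by ceil(patch_size / 2) instead.
    """
    pad = math.ceil(patch_size / 2)           # was patch_size // 2 before the fix
    return F.pad(x, (pad, pad, pad, pad))

x = torch.randn(1, 1, 28, 28)
print(pad_for_patches(x, patch_size=9).shape)  # torch.Size([1, 1, 38, 38])
```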
Both bugs can easily be reproduced by changing the sizes in config.py.
The loc_hidden and glimpse_hidden parameters were swapped when declaring RecurrentAttention(...), and the input_size of the CoreNetwork was wrong.