Split nn speedup #318

Open · wants to merge 3 commits into base: foundations-of-private-computation

Conversation

@KeremTurgutlu KeremTurgutlu commented Apr 24, 2021

Description

  • Modified the activation calculation to speed things up; calling numpy() and detach() inside a for loop over and over is very slow and bad practice (see the sketch after this list).
  • Added a torch.no_grad() context, since gradients are not needed here and skipping them speeds things up further.
  • The original notebook appears to overwrite the same .npy file on every loop iteration, so the final array is now saved once at the end. This change brings the runtime from ~3 minutes down to ~10 seconds.
  • The mutual information calculation is very slow, so the image and activation vectors are sliced to the first 1,000 samples. That finishes in seconds and is enough to see the difference between the self-information of the original image vector and its mutual information with the activation.
  • Added the mutual information formula just for fun.
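
Roughly, the revised loop looks like the sketch below (names such as model_loaded, data_test_loader2, and the .npy file names are taken from the notebook and the conversation below, so treat this as an illustration rather than the exact diff):

import numpy as np
import torch

imgs = []
intermediate_activations = []

model_loaded.eval()              # model_loaded / data_test_loader2 come from the notebook
with torch.no_grad():            # forward passes only, no gradients needed
    for images, labels in data_test_loader2:
        # keep everything as tensors inside the loop; no per-iteration detach()/numpy()
        imgs.append(images.view(images.shape[0], -1))
        x = model_loaded.convnet(images)
        intermediate_activations.append(x.view(x.shape[0], -1))

# convert and save once at the end instead of overwriting the file every iteration
images_arr = torch.cat(imgs).numpy()
acts_arr = torch.cat(intermediate_activations).numpy()
np.save("images", images_arr)
np.save("intermediate_act", acts_arr)

# slice to the first 1,000 samples so the MI estimate finishes in seconds
images_1k, acts_1k = images_arr[:1000], acts_arr[:1000]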

Affected Dependencies

None.

How has this been tested?

  • Run this notebook end-to-end to reproduce.
  • Also run the original notebook.
  • Time both; you will see the speed-up and the potential bug that overwrites the saved file.

@KeremTurgutlu KeremTurgutlu changed the base branch from master to foundations-of-private-computation April 24, 2021 06:53

noise-field commented Apr 28, 2021

Hi, @KeremTurgutlu!

May I suggest another improvement to this PR? (I could have opened a separate PR of my own, but resolving the conflicts would then be quite difficult.) The main thing slowing the computation down here is the use of a batch size of 1. Do you think we could get rid of this data_test_loader2 altogether and do something like the following (incorporating both your changes and larger batches)?

import numpy as np
import torch
from tqdm import tqdm

imgs = []
intermediate_activations = []
total_correct = 0

model_loaded.eval()
with torch.no_grad():
    for i, (images, labels) in tqdm(enumerate(data_test_loader), total=len(data_test_loader)):  # <- note the data_test_loader here
        imgs.append(images.view(images.shape[0], -1))

        x = model_loaded.convnet(images)
        intermediate_activations.append(x.view(x.shape[0], -1))

# save once after the loop instead of on every iteration
np.save("images", torch.cat(imgs).numpy())
np.save("intermediate_act", torch.cat(intermediate_activations).numpy())

This will make the code run another 4 times faster.

Another suggestion is to add

np.random.seed(0)  # or any number you like

before each of the two cells that calculate MI. As far as I can see, the estimation algorithm is randomized, so the results can differ from run to run, especially if you only use 1,000 data points. You might also want to change the expected results in the last markdown cell accordingly.

@KeremTurgutlu (Author)

Thanks for the suggestion, I will update the PR once I have time!

> before each of the two cells that calculate MI. As far as I can see, the estimation algorithm is randomized, so the results can differ from run to run, especially if you only use 1,000 data points. You might also want to change the expected results in the last markdown cell accordingly.

Makes sense. I actually wanted to find another library that implements MI. If you know of any, feel free to share; the library used in the notebook seems to require a manual download, and I couldn't find it on pip or GitHub either.
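
One pip-installable option might be scikit-learn's mutual_info_regression; a minimal sketch on synthetic data, not what the notebook currently uses:

import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x + 0.1 * rng.normal(size=1000)   # target correlated with x
z = rng.normal(size=1000)             # independent noise

# estimates are nearest-neighbour based; random_state keeps them reproducible
mi_xy = mutual_info_regression(x.reshape(-1, 1), y, random_state=0)
mi_zy = mutual_info_regression(z.reshape(-1, 1), y, random_state=0)
print(mi_xy, mi_zy)  # mi_xy should come out clearly larger than mi_zy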
