
Commit

Initial commit
minostauros committed Aug 2, 2019
0 parents commit bce1fcd
Showing 30 changed files with 117,066 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
._*
.DS_Store

*.txt
*.pyc
36 changes: 36 additions & 0 deletions README.md
@@ -0,0 +1,36 @@
# Unsupervised Learning of View-invariant Action Representations
Unofficial implementation of *J. Li, Y. Wong, Q. Zhao, and M. S. Kankanhalli, "Unsupervised Learning of View-invariant Action Representations", NeurIPS, 2018*.

Many parameters, from data generation to network settings, are not clearly specified in the original paper, and I could not get an answer from the authors, so please keep in mind that many parts of this unofficial implementation are coded arbitrarily.

## Instructions
To run this code, you need the following data:
- NTU RGB+D depth images, RGB images, and flows in HDF5 format. Data generation is done with the code in the ```dataset``` directory. These files are far too big to be shared; a loading sketch follows this list.
- NTU RGB+D JSON and label text files containing the video name, video length, and action labels. These are in the ```assets``` directory.
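
As a rough illustration of how such data might be consumed, here is a minimal sketch using ```h5py```; the file names, HDF5 dataset keys, and JSON fields are assumptions for illustration, since the actual layout depends on the generation code in ```dataset```:

```python
# Minimal sketch of loading one preprocessed NTU RGB+D sample.
# NOTE: the file names, HDF5 dataset keys, and JSON fields below are
# assumptions for illustration, not the exact layout used by this repo.
import json
import h5py

ntu_dir = "/volume1/dataset/NTU_RGB+D_processed"
videoname = "S001C001P001R001A001"  # hypothetical sample name

# Depth/RGB/flow frames stored as per-video datasets in an HDF5 file.
with h5py.File(f"{ntu_dir}/rgb.h5", "r") as f:
    rgb_frames = f[videoname][:]  # e.g. (num_frames, H, W, 3)

# Video length and action label from the accompanying JSON file.
with open("assets/ntu_rgbd.json") as f:  # hypothetical file name
    meta = json.load(f)
length = meta[videoname]["length"]
label = meta[videoname]["action"]
```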

The dataset directory looks like:
![Dataset Directory Tree](/assets/dataset.png?raw=true "Dataset directory")

**Example**
To train:
```
# First train without Gradient Reversal
python viar.py --ntu-dir /volume1/dataset/NTU_RGB+D_processed/ --batch-size 8 --num-workers 16 --save-dir /volume1/data/VIAR --disable-grl
# Train with Gradient Reversal
python viar.py --ntu-dir /volume1/dataset/NTU_RGB+D_processed/ --batch-size 8 --num-workers 16 --save-dir /volume1/data/VIAR --checkpoint-files '{"all": "/volume1/data/VIAR/VIAR_Jul22_16-05-53/checkpoints/VIAR_Jul22_16-05-53_Ep1_Iter5040.tar"}'
```
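
The ```--disable-grl``` flag skips the gradient reversal stage. For reference, a gradient reversal layer can be implemented as a custom autograd ```Function``` that acts as the identity in the forward pass and negates (and optionally scales) gradients in the backward pass. Below is a minimal sketch in the spirit of the ```pytorch-revgrad``` package referenced at the end of this README, not the exact code used in this repo:

```python
# Minimal gradient reversal layer sketch (PyTorch).
# The forward pass is the identity; the backward pass negates the
# gradient, optionally scaled by alpha. This mirrors the idea behind
# pytorch-revgrad, not this repo's exact implementation.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha=1.0):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # One gradient per forward input: reversed grad for x, None for alpha.
        return -ctx.alpha * grad_output, None

def grad_reverse(x, alpha=1.0):
    """Insert between the shared encoder and the view classifier."""
    return GradReverse.apply(x, alpha)
```
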
To test:
```
python viar.py --test --ntu-dir /ssd1/users/mino/dataset/NTU_RGB+D_processed/ --batch-size 8 --num-workers 16 --output-dir /volume3/users/mino/data/VIAR/features/ --checkpoint-files '{"all": "/volume3/users/mino/data/VIAR/VIAR_Jul23_06-16-38/checkpoints/VIAR_Jul23_06-16-38_Ep362_Iter1824480.tar"}'
```
### Observation
![TSNE Result on Setup number 1 and Replication number 1](/assets/figure.png?raw=true "TSNE Result")
The figure above is the t-SNE result on ConvLSTM features extracted from videos with Setup number 1 and Replication number 1 (540 videos, 6 frames or dots per video; see ```/assets/figure.pdf``` for a full-resolution image). Blue groups are Camera number 1, orange groups are Camera number 2, and green groups are Camera number 3.
I personally expected the t-SNE result to show the 'view-invariant' property of these features, but t-SNE does not seem to be the best way to visualize such a property.
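
For anyone who wants to reproduce this kind of plot, here is a minimal sketch with scikit-learn; the feature file names and array shapes are assumptions, since they depend on the ```--output-dir``` layout of the test run:

```python
# Minimal t-SNE visualization sketch for extracted features.
# NOTE: the .npy file names and array shapes are assumptions for
# illustration; adapt them to the actual --output-dir layout.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.load("features.npy")      # hypothetical, shape (N, D)
camera_ids = np.load("camera_ids.npy")  # hypothetical, shape (N,)

embedded = TSNE(n_components=2, perplexity=30).fit_transform(features)

for cam, color in [(1, "tab:blue"), (2, "tab:orange"), (3, "tab:green")]:
    mask = camera_ids == cam
    plt.scatter(embedded[mask, 0], embedded[mask, 1],
                s=4, c=color, label=f"Camera {cam}")
plt.legend()
plt.savefig("tsne.png", dpi=300)
```
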
### References
1. J. Li, Y. Wong, Q. Zhao, and M. S. Kankanhalli, "Unsupervised Learning of View-invariant Action Representations", NeurIPS, 2018
2. PD-Flow: https://github.com/MarianoJT88/PD-Flow (modified version included in this repo)
3. Convolutional RNN: https://github.com/kamo-naoyuki/pytorch_convolutional_rnn (included in this repo)
4. Gradient Reversal Layer: https://github.com/janfreyberg/pytorch-revgrad (modified version included in this repo)
Binary file added assets/dataset.png
Binary file added assets/figure.pdf
Binary file added assets/figure.png
