creating path.py and fixing img generation problems
mohamed kamel committed Apr 9, 2022
0 parents commit d2e8087
Showing 44 changed files with 2,841 additions and 0 deletions.
9 changes: 9 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,9 @@
# Microsoft Open Source Code of Conduct

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).

Resources:

- [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
- [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
- Contact [[email protected]](mailto:[email protected]) with questions or concerns
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Microsoft Corporation.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
152 changes: 152 additions & 0 deletions README.md
@@ -0,0 +1,152 @@
# Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations

![Teaser](figs/giphy.gif)

This repository provides a code base to evaluate and train models from the paper "*Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations*".

ArXiv pre-print: [https://arxiv.org/abs/1909.06993](https://arxiv.org/abs/1909.06993)

Paper video: https://youtu.be/VKc3A5HlUU8

## License and Citation
This project is licensed under the terms of the MIT license. By using the software, you are agreeing to the terms of the [license agreement](LICENSE).

If you use this code in your research, please cite us as follows:

```
@article{bonatti2020learning,
title={Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations},
author={Bonatti, Rogerio and Madaan, Ratnesh and Vineet, Vibhav and Scherer, Sebastian and Kapoor, Ashish},
journal={arXiv preprint arXiv:1909.06993},
year={2020}
}
```

## Recommended system
Recommended system (tested):
- Ubuntu 18.04
- Python 2.7.15

Python packages used by the provided example, with their recommended versions:
- [airsimdroneracingvae](https://pypi.org/project/airsimdroneracingvae/)==1.0.0
- tensorflow==2.0.0-beta1
- msgpack-rpc-python==0.4.1
- numpy==1.16.4
- matplotlib==2.1.1
- scikit-learn==0.20.4
- scipy==1.2.2
- pandas==0.24.2

## Downloading the drone racing files
To train the models and run AirSim, you first need to download all image datasets, behavior cloning datasets, network weights, and AirSim binaries:
- Download all files and datasets [Drone Racing files v. 1.0](https://drive.google.com/drive/folders/1NKk_qmLhBW-coqouHrRBPgUkvV-GntSd?usp=sharing)
- Extract all individual files in the folders
- Place the `settings.json` file inside `~/Documents/AirSim` on your computer

## Training and testing the cross-modal VAE representation
To train the cross-modal representations, you can either use the image dataset downloaded in the previous step or generate the data yourself using AirSim.

![Teaser](figs/arch.png)

### Training with downloaded dataset

- Go to folder `cmvae` and, inside `train_cmvae.py`, edit the variable `data_dir` to point to the extracted dataset on your computer (see the sketch after these steps). The default value is the directory with 1K images, but for final training you will need more images, such as the 50K or 300K datasets
- Also edit the variable `output_dir` to the location where you want the models to be saved
- Run

```
train_cmvae.py
```

- Network weights will be saved every 5 epochs by default, and you can check loss values with TensorBoard or in the terminal
- Once the network is trained, you can evaluate it using another script, which will automatically plot error histograms, image reconstructions, and latent space interpolations:
```
eval_cmvae.py
```
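
For reference, the edits at the top of `cmvae/train_cmvae.py` might look like the following sketch. The variable names `data_dir` and `output_dir` come from the script; the paths are placeholders you must adapt:
```
data_dir = '/yourpath/all_files/datasets/cmvae_img_50k'     # placeholder: extracted image dataset (50K or 300K for final training)
output_dir = '/yourpath/all_files/model_outputs/cmvae_run'  # placeholder: where checkpoints are saved every 5 epochs
```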

### Generating your own dataset with AirSim
You may want to generate a custom dataset for training your cross-modal VAE. Here are the steps to do it:

- Start the AirSim environment from the binary file:
```
$ cd /yourpath/all_files/airsim_binaries/vae_env
$ ./AirSimExe.sh -windowed
```
- If it asks whether you want the car model, click `No`
- Inside `datagen/img_generator/main.py`, first set the desired number of samples and the dataset save path (see the sketch after this list)
- Run the script for generating data:
```
main.py # inside datagen/img_generator
```
- Once the dataset is generated, follow the previous scripts for training the CM-VAE
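
For orientation, the connection step inside `datagen/img_generator/main.py` presumably follows the standard AirSim Python client pattern, roughly as in this sketch. The `airsimdroneracingvae` API is assumed here to mirror the stock AirSim client, and `num_samples`/`dataset_path` are illustrative names:
```
import airsimdroneracingvae

# Assumed to mirror the standard AirSim client API
client = airsimdroneracingvae.MultirotorClient()
client.confirmConnection()

num_samples = 50000                                  # illustrative: number of samples to generate
dataset_path = '/yourpath/all_files/my_img_dataset'  # illustrative: where the dataset is saved
```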


## Generating imitation learning data for racing
To train the behavior cloning networks, you can either use the downloaded image-action pairs dataset or generate the data yourself using AirSim.

![Teaser](figs/process_low.png)

### Training with downloaded dataset

- Go to folder `imitation_learning` and, inside `train_bc.py`, edit the variables `base_path`, `data_dir_list`, and `output_dir` (see the sketch after these steps). By default you will use the downloaded datasets with 0m to 3m of random gate displacement amplitude over a course with a nominal radius of 8m
- Edit the variables that select the training mode (full end-to-end, latent representation, or regression as the latent representation) and the weights path for the latent representation (not applicable to full end-to-end learning)
- Run the script for training the behavior cloning policies:
```
train_bc.py
```
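
As a sketch, the edits in `imitation_learning/train_bc.py` might look as follows. The first three variable names come from the script; the folder names, the mode variable, and its values are placeholders:
```
base_path = '/yourpath/all_files'
data_dir_list = ['il_datasets/bc_0m', 'il_datasets/bc_1m', 'il_datasets/bc_3m']  # placeholder folder names
output_dir = '/yourpath/all_files/model_outputs/bc_run'
train_mode = 'latent'  # placeholder: full end-to-end, latent representation, or regression
```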

### Generating your own imitation learning dataset with AirSim
You may want to generate a custom dataset for training your behavior cloning policies. Here are the steps to do it:

- Start the AirSim environment from the binary file (note: this is not the same binary used to generate images for the cross-modal representation!):
```
$ cd /yourpath/all_files/airsim_binaries/recording_env
$ ./AirSimExe.sh -windowed
```
- Inside `datagen/action_generator/src/soccer_datagen.py`, change the desired meta-parameters (number of gates, track radius, gate displacement noise, etc.)
- Run the script for generating data:
```
soccer_datagen.py
```
- Once you're satisfied with the motion, turn off the trajectory-visualization parameter `viz_traj`; otherwise the recorded images will show the motion line
- Once the quad is flying, press `r` on your keyboard to start recording images. Velocities will be automatically recorded. Both are saved inside `~/Documents/AirSim`

Now you'll need to process the raw recording so that the time-stamps of velocity commands and images are matched into a cohesive dataset (the idea is sketched after these steps). To do it:

- Inside `~/Documents/AirSim`, copy the recording outputs (the `moveOnSpline_vel_cmd.txt` file, the `images` folder, and the `images.txt` file) into a new directory, for example `/all_files/il_datasets/bc_test`.
- In `datagen/action_generator/src/data_processor.py`, modify variable `base_path` to `/all_files/il_datasets/bc_test`. Then run:
```
data_processor.py
```
- Finally, you can run `train_bc.py` following the previous steps. You can combine datasets with different noise levels to train a single policy
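
The time-stamp matching performed by `data_processor.py` can be pictured with this minimal sketch (not the repository's actual implementation; it assumes both recording files begin with a time-stamp column, here called `timestamp`):
```
import pandas as pd

# Sketch: pair each image with the velocity command nearest in time
imgs = pd.read_csv('images.txt', delim_whitespace=True)                # image log
vels = pd.read_csv('moveOnSpline_vel_cmd.txt', delim_whitespace=True)  # velocity-command log

paired = pd.merge_asof(imgs.sort_values('timestamp'),
                       vels.sort_values('timestamp'),
                       on='timestamp', direction='nearest')
paired.to_csv('matched_dataset.csv', index=False)  # illustrative output name
```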

## Deploying the trained policies
Now you can deploy the trained policies in AirSim, following these steps:
- Start the AirSim environment from the correct binary file:
```
$ cd /yourpath/all_files/airsim_binaries/vae_env
$ ./AirSimExe.sh -windowed
```
- In `imitation_learning/bc_navigation.py`, modify `policy_type` and `gate_noise` (see the sketch after this list). Then run:
```
bc_navigation.py
```
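
The two knobs named above might be set as in this sketch (`policy_type` and `gate_noise` are the script's variables; the values shown are placeholders):
```
policy_type = 'latent'  # placeholder: which flavor of trained policy to deploy
gate_noise = 1.0        # placeholder: gate displacement noise level, matched to the training data
```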

The policies trained in AirSim using the cross-modal representations can be transferred directly to real-world applications. Please check out the paper and video to see more results from real-life deployment.

![](figs/main_lowres.png)

# Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [[email protected]](mailto:[email protected]) with any additional questions or comments.

Binary file added cmvae/__pycache__/train_cmvae.cpython-39.pyc
Binary file not shown.
175 changes: 175 additions & 0 deletions cmvae/eval_cmvae.py
@@ -0,0 +1,175 @@
import tensorflow as tf
import os
import sys
import matplotlib.pyplot as plt
import numpy as np

# from train_cmvae import gp_data_dir, gp_output_dir, ppr_data_dir, ppr_output_dir
# imports
curr_dir = os.path.dirname(os.path.abspath(__file__))
import_path = os.path.join(curr_dir, '..')
sys.path.insert(0, import_path)
import racing_models.cmvae
import racing_utils
from racing_utils.paths import *
# DEFINE TESTING META PARAMETERS
data_dir = gp_img_data_dir
read_table = True
latent_space_constraints = True
weights_path = ppr_cmvae_output_dir + "/cmvae_model_40.ckpt"

n_z = 10
img_res = 64
num_imgs_display = 50
columns = 10
rows = 10

num_interp_z = 10
idx_close = 0 #7
idx_far = 1 #39

z_range_mural = [-0.02, 0.02]
z_num_mural = 11

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
# 0 = all messages are logged (default behavior)
# 1 = INFO messages are not printed
# 2 = INFO and WARNING messages are not printed
# 3 = INFO, WARNING, and ERROR messages are not printed

# allow growth is possible using an env var in tf2.0
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

# Load test dataset
images_np, raw_table = racing_utils.dataset_utils.create_test_dataset_csv(data_dir, img_res, read_table=read_table)
print('Done with dataset')

images_np = images_np[:1000,:]
if read_table is True:
    raw_table = raw_table[:1000,:]

# create model
if latent_space_constraints is True:
    model = racing_models.cmvae.CmvaeDirect(n_z=n_z, gate_dim=4, res=img_res, trainable_model=True)
else:
    model = racing_models.cmvae.Cmvae(n_z=n_z, gate_dim=4, res=img_res, trainable_model=True)

model.load_weights(weights_path)

img_recon, gate_recon, means, stddev, z = model(images_np, mode=0)
img_recon = img_recon.numpy()
gate_recon = gate_recon.numpy()
z = z.numpy()

# de-normalization of gates and images
images_np = ((images_np + 1.0) / 2.0 * 255.0).astype(np.uint8)
img_recon = ((img_recon + 1.0) / 2.0 * 255.0).astype(np.uint8)
gate_recon = racing_utils.dataset_utils.de_normalize_gate(gate_recon)

# if not read_table:
# sys.exit()

# get stats for gate reconstruction
if read_table is True:
    racing_utils.stats_utils.calculate_gate_stats(gate_recon, raw_table)

# show some reconstruction figures
fig = plt.figure(figsize=(20, 20))
for i in range(1, num_imgs_display+1):
    idx_orig = (i-1)*2+1
    fig.add_subplot(rows, columns, idx_orig)
    img_display = racing_utils.dataset_utils.convert_bgr2rgb(images_np[i - 1, :])
    plt.axis('off')
    plt.imshow(img_display)
    fig.add_subplot(rows, columns, idx_orig+1)
    img_display = racing_utils.dataset_utils.convert_bgr2rgb(img_recon[i-1, :])
    plt.axis('off')
    plt.imshow(img_display)
# fig.savefig(os.path.join('D:/ppr_files/figs', 'reconstruction_results.png'))
plt.show()

# show interpolation btw two images in latent space
z_close = z[idx_close, :]
z_far = z[idx_far, :]
z_interp = racing_utils.geom_utils.interp_vector(z_close, z_far, num_interp_z)


# get the image predictions
img_recon_interp, gate_recon_interp = model.decode(z_interp, mode=0)
img_recon_interp = img_recon_interp.numpy()
gate_recon_interp = gate_recon_interp.numpy()

# de-normalization of gates and images
img_recon_interp = ((img_recon_interp + 1.0) / 2.0 * 255.0).astype(np.uint8)
gate_recon_interp = racing_utils.dataset_utils.de_normalize_gate(gate_recon_interp)

# join predictions with array and print
indices = np.array([np.arange(num_interp_z)]).transpose()
results = np.concatenate((indices, gate_recon_interp), axis=1)
print('Img index | Predictions: = \n{}'.format(results))


fig, axs = plt.subplots(1, 4, tight_layout=True)
axs[0].plot(np.arange(gate_recon_interp.shape[0]), gate_recon_interp[:, 0], 'b-', label='r')
axs[1].plot(np.arange(gate_recon_interp.shape[0]), gate_recon_interp[:, 1]*180/np.pi, 'b-', label=r'$\theta$')
axs[2].plot(np.arange(gate_recon_interp.shape[0]), gate_recon_interp[:, 2]*180/np.pi, 'b-', label=r'$\phi$')
axs[3].plot(np.arange(gate_recon_interp.shape[0]), gate_recon_interp[:, 3]*180/np.pi, 'b-', label=r'$\psi$')

for idx in range(4):
    # axs[idx].grid()
    y_ticks_array = gate_recon_interp[:, idx][np.array([0, gate_recon_interp[:, idx].shape[0]-1])]
    y_ticks_array = np.around(y_ticks_array, decimals=1)
    if idx > 0:
        y_ticks_array = y_ticks_array * 180 / np.pi
    axs[idx].set_yticks(y_ticks_array)
    axs[idx].set_xticks(np.array([0, 9]))
    axs[idx].set_xticklabels((r'$I_a$', r'$I_b$'))

axs[0].set_title(r'$r$')
axs[1].set_title(r'$\theta$')
axs[2].set_title(r'$\phi$')
axs[3].set_title(r'$\psi$')

axs[0].set_ylabel('[meter]')
axs[1].set_ylabel(r'[deg]')
axs[2].set_ylabel(r'[deg]')
axs[3].set_ylabel(r'[deg]')

# plot the interpolated images
fig2 = plt.figure(figsize=(96, 96))
columns = num_interp_z + 2
rows = 1
fig2.add_subplot(rows, columns, 1)
img_display = racing_utils.dataset_utils.convert_bgr2rgb(images_np[idx_close, :])
plt.axis('off')
plt.imshow(img_display)
for i in range(1, num_interp_z + 1):
    fig2.add_subplot(rows, columns, i+1)
    img_display = racing_utils.dataset_utils.convert_bgr2rgb(img_recon_interp[i - 1, :])
    plt.axis('off')
    plt.imshow(img_display)
fig2.add_subplot(rows, columns, num_interp_z + 2)
img_display = racing_utils.dataset_utils.convert_bgr2rgb(images_np[idx_far, :])
plt.axis('off')
plt.imshow(img_display)
# fig2.savefig(os.path.join('D:/ppr_files/figs', 'reconstruction_interpolation_results.png'))
plt.show()

# new plot traveling through latent space
fig3 = plt.figure(figsize=(96, 96))
columns = z_num_mural
rows = n_z
z_values = racing_utils.geom_utils.interp_vector(z_range_mural[0], z_range_mural[1], z_num_mural)
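# Each row of the mural varies one latent dimension across z_range_mural while
# all other dimensions stay at zero, visualizing what that dimension encodes.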
for i in range(1, z_num_mural*n_z + 1):
    fig3.add_subplot(rows, columns, i)
    z = np.zeros((1, n_z)).astype(np.float32)
    z[0, (i-1)//columns] = z_values[i%columns-1]  # integer division: the row index selects the latent dimension
    # print (z)
    img_recon_interp, gate_recon_interp = model.decode(z, mode=0)
    img_recon_interp = img_recon_interp.numpy()
    img_recon_interp = ((img_recon_interp[0, :] + 1.0) / 2.0 * 255.0).astype(np.uint8)
    img_display = racing_utils.dataset_utils.convert_bgr2rgb(img_recon_interp)
    plt.axis('off')
    plt.imshow(img_display)
# fig3.savefig(os.path.join('D:/ppr_files/figs', 'z_mural.png'))
plt.show()