🏘️ Airbert: In-domain Pretraining for Vision-and-Language Navigation 🏘️

MIT license · arXiv · 1st on the R2R leaderboard · ICCV 2021 · project website

This repository hosts the model checkpoints trained in our experiments.

⌨️ Downloading from the command line

We store our models on Google Drive, which provides 15 GB of storage for free.

You can use the great gdown script to download the models:

```bash
pip install gdown
gdown [link to Google Drive]
```

We also provide a Makefile to help you:

```bash
# Download everything
make all
# Download a specific model
make airbert-r2rRSA
# Get all commands
make help
```
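For reference, a download target in such a Makefile might look like the sketch below. The target name matches a model listed further down, but `FILE_ID` and the output filename are placeholders, not the real values (those live in the repository's actual Makefile):

```makefile
# Hypothetical sketch of a gdown-based download target.
# FILE_ID is a placeholder for the real Google Drive file ID.
airbert-r2rRSA:
	gdown "https://drive.google.com/uc?id=FILE_ID" -O airbert-r2rRSA.bin

# "make all" can then simply depend on every model target.
all: airbert-r2rRSA
```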

🏘️ Model pretrained on the BNB dataset

| Model | Description |
| --- | --- |
| airbert | Airbert model pretrained on the BNB dataset |

👽 External models used in our scripts

| Model | Description |
| --- | --- |
| vilbert | ViLBERT model pretrained on Conceptual Captions |
| vlnbert | VLN-BERT: ViLBERT fine-tuned on R2R |

🤖 Finetuned models in discriminative setting

| Model | Description |
| --- | --- |
| airbert-r2rRS | Airbert fine-tuned on R2R with the shuffling loss |
| airbert-r2rRSA | Airbert fine-tuned on R2R with the shuffling loss + speaker data |

🤖 Finetuned models in generative setting

| Model | Description |
| --- | --- |
| REVERIE | Recurrent VLN-BERT for remote referring expression, with pretrained Airbert as the backbone |
| R2R | Recurrent VLN-BERT for vision-and-language navigation, with pretrained Airbert as the backbone |