
Commit

Update readme
artemisp committed Dec 5, 2023
1 parent 88f0e9c commit 8e85d9e
Showing 2 changed files with 20 additions and 4 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -28,7 +28,7 @@

## What's New: 🎉
* [Model Release] November 2023, released implementation of **X-InstructBLIP** <br>
- [Paper](https://arxiv.org/pdf/2311.18799.pdf), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/xinstructblip), [Website](https://artemisp.github.io/X-InstructBLIP-page/)
+ [Paper](https://arxiv.org/pdf/2311.18799.pdf), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/xinstructblip), [Website](https://artemisp.github.io/X-InstructBLIP-page/), [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/salesforce/LAVIS/blob/main/projects/xinstructblip/demo/run_demo.ipynb)
> A simple, yet effective, cross-modality framework built atop frozen LLMs that allows the integration of various modalities (image, video, audio, 3D) without extensive modality-specific customization.
* [Model Release] July 2023, released implementation of **BLIP-Diffusion** <br>
[Paper](https://arxiv.org/abs/2305.06500), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/blip-diffusion), [Website](https://dxli94.github.io/BLIP-Diffusion-website/)
22 changes: 19 additions & 3 deletions projects/xinstructblip/README.md
@@ -1,5 +1,8 @@
# X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning
- [![arXiv](https://img.shields.io/badge/arXiv-1234.56789-b31b1b.svg)]()

[Artemis Panagopoulou](https://artemisp.github.io), [Le Xue](https://www.linkedin.com/in/le-tycho-xue-5abbb9157/), [Ning Yu](https://ningyu1991.github.io/), [Junnan Li](https://sites.google.com/site/junnanlics), [Dongxu Li](https://sites.google.com/view/dongxu-li/home), [Shafiq Joty](https://scholar.google.com/citations?user=hR249csAAAAJ&hl=en&oi=ao), [Ran Xu](https://scholar.google.com/citations?user=sgBB2sUAAAAJ&hl=en), [Silvio Savarese](https://scholar.google.com/citations?user=ImpbxLsAAAAJ&hl=en), [Caiming Xiong](https://scholar.google.com/citations?user=vaSdahkAAAAJ&hl=en), and [Juan Carlos Niebles](https://scholar.google.com/citations?user=hqNhUCYAAAAJ&hl=en)

+ [![arXiv](https://img.shields.io/badge/arXiv-1234.56789-b31b1b.svg)](https://arxiv.org/pdf/2311.18799.pdf) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/salesforce/LAVIS/blob/main/projects/xinstructblip/demo/run_demo.ipynb)

## Overview

@@ -11,7 +14,7 @@ X-InstructBLIP is a simple yet effective multimodal framework built on top of a frozen LLM

### LAVIS Repository
```
- git clone https://github.com/artemisp/LAVIS-XInstructBLIP.git # TODO: this should be the X-InstructBLIP branch.
+ git clone https://github.com/artemisp/LAVIS-XInstructBLIP.git # Once the PR is accepted, change this to the official LAVIS repository.
cd LAVIS-XInstructBLIP
pip install -e .
```
@@ -226,4 +229,17 @@ The arguments are as above, with the same audio caption data. Note that you should
* `rnd`: adds identifier in output files in the case of multiple generations.
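The `rnd` option above can be pictured with a small, self-contained sketch. This helper is hypothetical (not part of the LAVIS codebase); it shows the general idea of tagging each output file with a short random identifier so that repeated generations do not overwrite one another:

```python
import os
import uuid


def output_filename(base: str, rnd: bool = False) -> str:
    """Return the output path, optionally tagged with a random identifier.

    With rnd=True, a short random suffix is inserted before the file
    extension so multiple generations produce distinct files.
    """
    if not rnd:
        return base
    stem, ext = os.path.splitext(base)
    # uuid4().hex gives 32 hex chars; 8 are plenty to avoid collisions here.
    return f"{stem}_{uuid.uuid4().hex[:8]}{ext}"


# Each call with rnd=True yields a distinct filename such as
# "captions_3f9a1c2e.json".
print(output_filename("captions.json", rnd=True))
```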


## Cite

```
@misc{panagopoulou2023xinstructblip,
  title={X-InstructBLIP: A Framework for aligning X-Modal instruction-aware
         representations to LLMs and Emergent Cross-modal Reasoning},
  author={Artemis Panagopoulou and Le Xue and Ning Yu and Junnan Li and Dongxu Li and
          Shafiq Joty and Ran Xu and Silvio Savarese and Caiming Xiong and Juan Carlos Niebles},
  year={2023},
  eprint={2311.18799},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
