<a name="readme-top"></a> | ||
# Temporal Logic Video (TLV) Dataset | ||
|
||
[![Contributors][contributors-shield]][contributors-url] | ||
[![Forks][forks-shield]][forks-url] | ||
[![Stargazers][stars-shield]][stars-url] | ||
[![MIT License][license-shield]][license-url] | ||
[![LinkedIn][linkedin-shield]][linkedin-url] | ||
|
||
<!-- PROJECT LOGO --> | ||
<br /> | ||
<div align="center"> | ||
|
    <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"><strong>Explore the docs »</strong></a>
    <br />
    <br />
    <a href="https://anoymousu1.github.io/nsvs-anonymous.github.io/">NSVS-TL Project Webpage</a>
    ·
    <a href="https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Video-Search-Temploral-Logic">NSVS-TL Source Code</a>
  </p>
</div>

## Overview

The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:

1. Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
2. Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.

## Table of Contents

- [Dataset Composition](#dataset-composition)
- [Dataset (Release)](#dataset)
- [Installation](#installation)
- [Usage](#usage)
- [Data Generation](#data-generation)
- [License](#license)
- [Citation](#citation)

## Dataset Composition

### Synthetic Datasets

- Source: COCO and ImageNet
- Purpose: Introduce artificial Temporal Logic specifications
- Generation Method: Image stitching from static datasets

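As a rough illustration of the stitching idea (not the repository's actual generator code), a synthetic "video" is simply an ordered stack of still images, which is what makes it easy to script temporal patterns:

```python
# Illustrative sketch only: a synthetic "video" as an ordered stack of
# equally sized still images drawn from a source dataset.
import numpy as np

def stitch_frames(images):
    """Stack HxWxC images into a (T, H, W, C) clip."""
    return np.stack(images, axis=0)

# Eight blank 224x224 RGB frames stand in for sampled COCO/ImageNet images.
clip = stitch_frames([np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(8)])
print(clip.shape)  # (8, 224, 224, 3)
```
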
### Real-world Datasets

- Sources: NuScenes and Waymo
- Purpose: Provide real-world autonomous vehicle scenarios
- Annotation: Temporal Logic specifications added to existing data

## Dataset

<div align="center">
    <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">
        <img src="images/teaser.png" alt="Logo" width="240" height="240">
    </a>
</div>

Although we provide source code for generating datasets from several types of data sources, we release a v1 dataset as a proof of concept.

### Dataset Structure

The data is offered as serialized objects, each containing a set of frames with annotations.

#### File Naming Convention

`<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl`

#### Object Attributes

Each serialized object contains the following attributes:

- `ground_truth`: Boolean indicating whether the dataset contains ground-truth labels
- `ltl_formula`: Temporal logic formula applied to the dataset
- `proposition`: Set of propositions used in the `ltl_formula`
- `number_of_frame`: Total number of frames in the dataset
- `frames_of_interest`: Frames that satisfy the `ltl_formula`
- `labels_of_frames`: Labels for each frame
- `images_of_frames`: Image data for each frame

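A minimal loading sketch, assuming the v1 release uses Python's `pickle` and exposes the attributes above (the file name and attribute-style access are illustrative, not a confirmed API):

```python
# Deserialize one TLV sample and inspect its annotations.
import pickle

with open("tlv_sample.pkl", "rb") as f:  # hypothetical file name
    sample = pickle.load(f)

# Attribute names follow the list above; depending on the serializer,
# these may instead be dictionary keys.
print(sample.ltl_formula)         # e.g., "F prop1"
print(sample.number_of_frame)     # total frame count
print(sample.frames_of_interest)  # frames satisfying the formula
```
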
You can download the dataset from here.

## Installation

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"
```

### Prerequisites

To generate synthetic datasets from COCO and ImageNet, download the source data first and arrange it as follows:

1. [ImageNet](https://image-net.org/challenges/LSVRC/2017/index.php) (ILSVRC 2017):

   ```
   ILSVRC/
   ├── Annotations/
   ├── Data/
   ├── ImageSets/
   └── LOC_synset_mapping.txt
   ```

2. [COCO](https://cocodataset.org/#download) (2017):

   ```
   COCO/
   └── 2017/
       ├── annotations/
       ├── train2017/
       └── val2017/
   ```

## Usage

Detailed usage instructions for data loading and processing are given below; see the run scripts for full argument details.

### Data Loader Configuration

- `data_root_dir`: Root directory of the dataset
- `mapping_to`: Label mapping scheme (default: "coco")
- `save_dir`: Output directory for processed data

### Synthetic Data Generator Configuration

- `initial_number_of_frame`: Starting frame count per video
- `max_number_frame`: Maximum frame count per video
- `number_video_per_set_of_frame`: Videos to generate per frame set
- `increase_rate`: Frame count increment rate
- `ltl_logic`: Temporal Logic specification (e.g., "F prop1", "G prop1")
- `save_images`: Boolean flag for saving individual frames

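To make the `ltl_logic` options concrete, here is a small sketch of what the two example specifications mean over a frame sequence. This is standard LTL semantics, not the repository's evaluator, and `labels_of_frames` is an assumed per-frame label structure:

```python
# "F prop" (eventually): prop holds in at least one frame.
# "G prop" (globally/always): prop holds in every frame.
def eventually(labels_of_frames, prop):
    return any(prop in labels for labels in labels_of_frames)

def always(labels_of_frames, prop):
    return all(prop in labels for labels in labels_of_frames)

frames = [{"car"}, {"car", "person"}, {"person"}]
print(eventually(frames, "person"))  # True: appears in frames 2 and 3
print(always(frames, "car"))         # False: absent from frame 3
```
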
## Data Generation

### COCO Synthetic Data Generation

The COCO generator supports `&` (conjunction) compositions in LTL formulae, since COCO images carry multiple labels.

```bash
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"
```

### ImageNet Synthetic Data Generation

```bash
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"
```

Note: The ImageNet generator does not support `&` LTL logic formulae.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Citation

If you find this repository useful, please cite our paper:

```bibtex
@inproceedings{Choi_2024_ECCV,
    author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
    title={Towards Neuro-Symbolic Video Understanding},
    booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
    month={September},
    year={2024}
}
```

<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[contributors-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[forks-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/network/members
[stars-shield]: https://img.shields.io/github/stars/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[stars-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/stargazers
[issues-shield]: https://img.shields.io/github/issues/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[issues-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/issues
[license-shield]: https://img.shields.io/github/license/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[license-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/blob/master/LICENSE.txt
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/in/mchoi07/
[product-screenshot]: images/screenshot.png