update
minkyu-choi07 committed Jul 6, 2024
1 parent 1c01877 commit 3cf8490
Showing 8 changed files with 133 additions and 171 deletions.
271 changes: 119 additions & 152 deletions README.md
@@ -1,12 +1,9 @@

<a name="readme-top"></a>
# Temporal Logic Video (TLV) Dataset

[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![MIT License][license-shield]][license-url]
[![LinkedIn][linkedin-shield]][linkedin-url]

<!-- PROJECT LOGO -->
<br />
<div align="center">
@@ -22,192 +19,162 @@
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"><strong>Explore the docs »</strong></a>
<br />
<br />
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">View Demo</a>
·
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/issues">Report Bug</a>
<a href="https://anoymousu1.github.io/nsvs-anonymous.github.io/">NSVS-TL Project Webpage</a>
·
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">Request Feature</a>
<a href="https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Video-Search-Temploral-Logic">NSVS-TL Source Code</a>
</p>
</div>

## Overview

<!-- TABLE OF CONTENTS -->
<details>
<summary>Table of Contents</summary>
<ol>
<li><a href="#about-the-project">About The Project</a></li>
<li>
<a href="#getting-started">Getting Started</a>
<ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#installation">Installation</a></li>
</ul>
</li>
<li><a href="#usage">Usage</a></li>
<li><a href="#roadmap">Roadmap</a></li>
<li><a href="#contributing">Contributing</a></li>
<li><a href="#license">License</a></li>
<li><a href="#contact">Contact</a></li>
<li><a href="#acknowledgments">Acknowledgments</a></li>
</ol>
</details>

<!-- ABOUT THE PROJECT -->
## About The Project

<!-- [![Product Name Screen Shot][product-screenshot]](https://example.com) -->

Given the lack of SOTA video datasets for long-horizon, temporally extended
activity and object detection, we introduce the Temporal Logic Video (TLV)
datasets. The synthetic TLV datasets are compiled by stitching together static
images from computer vision datasets like COCO and ImageNet. This enables the
artificial introduction of a wide range of TL specifications. Additionally, we
have created two video datasets based on the open-source autonomous vehicle
(AV) driving datasets NuScenes and Waymo.

<p align="right">(<a href="#readme-top">back to top</a>)</p>

<!-- GETTING STARTED -->
## Getting Started

This is an example of how you may give instructions on setting up your project locally.
To get a local copy up and running follow these simple example steps.
The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:

### Prerequisites
1. Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
2. Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.

To generate a synthetic dataset from COCO and ImageNet, download the source data first.
## Table of Contents

1. [ImageNet](https://image-net.org/challenges/LSVRC/2017/index.php): The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2017. The recommended file structure is as follows:
```
|--ILSVRC
|----Annotations
|----Data
|----ImageSets
|----LOC_synset_mapping.txt
```
- [Dataset Composition](#dataset-composition)
- [Dataset (Release)](#dataset)
- [Installation](#installation)
- [Usage](#usage)
- [Data Generation](#data-generation)
- [Contribution Guidelines](#contribution-guidelines)
- [License](#license)
- [Acknowledgments](#acknowledgments)

2. [COCO](https://cocodataset.org/#download): Download the source data and arrange it as follows:
```
|--COCO
|----2017
|------annotations
|------train2017
|------val2017
```
## Dataset Composition

### Installation
```
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"
```

<p align="right">(<a href="#readme-top">back to top</a>)</p>
### Synthetic Datasets
- Source: COCO and ImageNet
- Purpose: Introduce artificial Temporal Logic specifications
- Generation Method: Image stitching from static datasets

<!-- USAGE EXAMPLES -->
## Usage
See the run scripts for argument details.

### Data Loader Common Argument
* `data_root_dir`: The root directory where the COCO dataset is stored.
* `mapping_to`: Map the original label to desired mapper, default is "coco".
* `save_dir`: Directory where the generated dataset will be saved.
### Synthetic Generator Common Argument
* `initial_number_of_frame`: Initial number of frames for each video.
* `max_number_frame`: Maximum number of frames for each video.
* `number_video_per_set_of_frame`: Number of videos to generate per set of frames.
* `increase_rate`: Rate at which the number of frames increases.
* `ltl_logic`: Temporal logic to apply. Options include various logical expressions like "F prop1", "G prop1", etc.
* `save_images`: Boolean to decide whether to save individual frame images (True or False).

In each run script, make sure
1. **coco synthetic data generator** <br>
The COCO synthetic data generator can generate `&` compositions because COCO images have multiple labels.
```
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output dir path>"
```

2. **Imagenet synthetic data generator** <br>
The ImageNet synthetic data generator cannot generate LTL formulae that use `&`.
```
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output dir path>"
```

<p align="right">(<a href="#readme-top">back to top</a>)</p>
### Real-world Datasets
- Sources: NuScenes and Waymo
- Purpose: Provide real-world autonomous vehicle scenarios
- Annotation: Temporal Logic specifications added to existing data

## Dataset
<div align="center">
<a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">
<img src="images/teaser.png" alt="Logo" width="240" height="240">
</a>
</div>
Although we provide source code to generate datasets from different types of data sources, we also release a v1 dataset as a proof of concept.

### Dataset Structure

<!-- ROADMAP -->
## Roadmap
We provide a v1 dataset as a proof of concept. The data is offered as serialized objects, each containing a set of frames with annotations.

- [ ] Publication
- [ ] Repository
- [ ] Blog
#### File Naming Convention
`<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl`
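
As a rough illustration of the naming convention, the sketch below splits a file name into its fields. The regular expression, the `parse_tlv_filename` helper, and the example file name are assumptions made for illustration; they are not part of the repository.

```python
import re
from typing import Dict

# Hypothetical pattern derived from the naming convention in this README.
_TLV_NAME = re.compile(
    r"^(?P<tlv_data_type>[^:]+):source:(?P<datasource>.+?)"
    r"-number_of_frames:(?P<number_of_frames>\d+)-(?P<uuid>.+)\.pkl$"
)


def parse_tlv_filename(name: str) -> Dict[str, str]:
    """Split a TLV file name into its fields (illustrative helper)."""
    match = _TLV_NAME.match(name)
    if match is None:
        raise ValueError(f"not a TLV file name: {name!r}")
    return match.groupdict()


# Made-up file name that follows the convention.
example = (
    "synthetic:source:coco-number_of_frames:25-"
    "123e4567-e89b-12d3-a456-426614174000.pkl"
)
fields = parse_tlv_filename(example)
```

All fields come back as strings, so `number_of_frames` would need an explicit `int(...)` conversion before numeric use.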

<p align="right">(<a href="#readme-top">back to top</a>)</p>
#### Object Attributes
Each serialized object contains the following attributes:
- `ground_truth`: Boolean indicating whether the dataset contains ground truth labels
- `ltl_formula`: Temporal logic formula applied to the dataset
- `proposition`: Set of propositions used in `ltl_formula`
- `number_of_frame`: Total number of frames in the dataset
- `frames_of_interest`: Frames of interest which satisfy the ltl_formula
- `labels_of_frames`: Labels for each frame
- `images_of_frames`: Image data for each frame
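
A minimal sketch of reading one of these files, assuming each `.pkl` unpickles to either a dictionary or an object that exposes the attributes listed above. The helpers `read_field` and `summarize_tlv_sample` are hypothetical names, not functions from this repository.

```python
import pickle
from pathlib import Path
from typing import Any, Dict

# Attribute names as listed in this README.
TLV_ATTRIBUTES = (
    "ground_truth",
    "ltl_formula",
    "proposition",
    "number_of_frame",
    "frames_of_interest",
    "labels_of_frames",
    "images_of_frames",
)


def read_field(sample: Any, name: str) -> Any:
    """Read a field whether the sample is a dict or an object (assumption)."""
    if isinstance(sample, dict):
        return sample.get(name)
    return getattr(sample, name, None)


def summarize_tlv_sample(path: Path) -> Dict[str, Any]:
    """Load one .pkl file and return its metadata, skipping the image data."""
    with open(path, "rb") as handle:
        sample = pickle.load(handle)  # only unpickle files you trust
    return {
        name: read_field(sample, name)
        for name in TLV_ATTRIBUTES
        if name != "images_of_frames"
    }
```

If the files store instances of a class defined in this package, the package presumably has to be installed before `pickle.load` can reconstruct them.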

<!-- CONTRIBUTING
## Contributing
You can download the dataset from here. The structure of the dataset is as follows:
```
ILSVRC/
├── Annotations/
├── Data/
├── ImageSets/
└── LOC_synset_mapping.txt
```

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
## Installation

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
Don't forget to give the project a star! Thanks again!
```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"
```

1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
### Prerequisites

<p align="right">(<a href="#readme-top">back to top</a>)</p>
-->
1. ImageNet (ILSVRC 2017):
```
ILSVRC/
├── Annotations/
├── Data/
├── ImageSets/
└── LOC_synset_mapping.txt
```

2. COCO (2017):
```
COCO/
└── 2017/
    ├── annotations/
    ├── train2017/
    └── val2017/
```
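
Before running a generator, it can help to confirm that the source data matches the layouts above. The following is a small sketch using only the standard library; `missing_entries` and the two layout constants are illustrative names, not part of the repository.

```python
from pathlib import Path
from typing import List, Sequence

# Expected entries, taken from the directory trees in this README.
IMAGENET_LAYOUT = ("Annotations", "Data", "ImageSets", "LOC_synset_mapping.txt")
COCO_LAYOUT = ("annotations", "train2017", "val2017")


def missing_entries(root: Path, expected: Sequence[str]) -> List[str]:
    """Return the expected entries that do not exist under root."""
    return [name for name in expected if not (root / name).exists()]
```

For example, `missing_entries(Path("../COCO/2017"), COCO_LAYOUT)` returns an empty list when all three COCO directories are present.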

## Usage

<!-- LICENSE -->
## License
Detailed usage instructions for data loading and processing.

Distributed under the MIT License. See `LICENSE` for more information.
### Data Loader Configuration

<p align="right">(<a href="#readme-top">back to top</a>)</p>
- `data_root_dir`: Root directory of the dataset
- `mapping_to`: Label mapping scheme (default: "coco")
- `save_dir`: Output directory for processed data

<!-- CONTACT -->
## Contact
### Synthetic Data Generator Configuration

Minkyu Choi - [@your_twitter](https://twitter.com/MinkyuChoi7) - [email protected]
- `initial_number_of_frame`: Starting frame count per video
- `max_number_frame`: Maximum frame count per video
- `number_video_per_set_of_frame`: Videos to generate per frame set
- `increase_rate`: Frame count increment rate
- `ltl_logic`: Temporal Logic specification (e.g., "F prop1", "G prop1")
- `save_images`: Boolean flag for saving individual frames
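
To make the `ltl_logic` examples concrete, here is a sketch of the usual finite-trace reading of "F prop1" (eventually) and "G prop1" (always) over per-frame label sets. It illustrates the standard semantics only; how the generator itself evaluates formulae is not shown in this README, and the function names are hypothetical.

```python
from typing import Sequence, Set


def holds_eventually(labels_of_frames: Sequence[Set[str]], prop: str) -> bool:
    """'F prop': prop appears in at least one frame."""
    return any(prop in labels for labels in labels_of_frames)


def holds_always(labels_of_frames: Sequence[Set[str]], prop: str) -> bool:
    """'G prop': prop appears in every frame."""
    return all(prop in labels for labels in labels_of_frames)


# Toy video: three frames with their label sets.
frames = [{"car"}, {"car", "person"}, {"car"}]
person_eventually = holds_eventually(frames, "person")
car_always = holds_always(frames, "car")
person_always = holds_always(frames, "person")
```

In this toy video, "F person" and "G car" hold, while "G person" does not because the first and last frames lack the `person` label.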

Project Link: TBD
## Data Generation

<p align="right">(<a href="#readme-top">back to top</a>)</p>
### COCO Synthetic Data Generation

```bash
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"
```

### ImageNet Synthetic Data Generation

<!-- ACKNOWLEDGMENTS -->
## Acknowledgments
```bash
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"
```

* University of Texas at Austin (UT Austin)
* UT Austin Swarm Lab
Note: The ImageNet generator does not support LTL formulae that use the '&' operator.
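
One plausible reading of this limitation, assuming each ImageNet frame carries a single class label while a COCO frame can carry several: a conjunction of two distinct propositions can never hold in a single-label frame. The sketch below uses made-up labels and a hypothetical helper.

```python
from typing import Set


def frame_satisfies_and(labels: Set[str], prop1: str, prop2: str) -> bool:
    """'prop1 & prop2' holds in a frame only if both labels are present."""
    return prop1 in labels and prop2 in labels


coco_like_frame = {"person", "car"}  # several labels in one frame
imagenet_like_frame = {"goldfish"}   # assumed: one label per frame

coco_result = frame_satisfies_and(coco_like_frame, "person", "car")
imagenet_result = frame_satisfies_and(imagenet_like_frame, "goldfish", "car")
```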

<p align="right">(<a href="#readme-top">back to top</a>)</p>
## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Citation
If you find this repo useful, please cite our paper:
```bibtex
@inproceedings{Choi_2024_ECCV,
author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
title={Towards Neuro-Symbolic Video Understanding},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
month={September},
year={2024}
}
```


<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/othneildrew/Best-README-Template.svg?style=for-the-badge
[contributors-shield]: https://img.shields.io/github/contributors/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[contributors-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/othneildrew/Best-README-Template.svg?style=for-the-badge
[forks-shield]: https://img.shields.io/github/forks/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[forks-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/network/members
[stars-shield]: https://img.shields.io/github/stars/othneildrew/Best-README-Template.svg?style=for-the-badge
[stars-shield]: https://img.shields.io/github/stars/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[stars-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/stargazers
[issues-shield]: https://img.shields.io/github/issues/othneildrew/Best-README-Template.svg?style=for-the-badge
[issues-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/issues
[license-shield]: https://img.shields.io/github/license/othneildrew/Best-README-Template.svg?style=for-the-badge
[license-shield]: https://img.shields.io/github/license/UTAustin-SwarmLab/temporal-logic-video-dataset.svg?style=for-the-badge
[license-url]: https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset/blob/master/LICENSE.txt
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/in/mchoi07/
[product-screenshot]: images/screenshot.png
6 changes: 0 additions & 6 deletions example.py

This file was deleted.

Binary file modified images/teaser.png
1 change: 0 additions & 1 deletion tests/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion tests/subpackage_name/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion tests/subpackage_name/test_subpackage_module_name.py

This file was deleted.

1 change: 0 additions & 1 deletion tests/test_module_name.py

This file was deleted.

23 changes: 14 additions & 9 deletions tlv_dataset/loader/coco.py
@@ -57,10 +57,12 @@ def load_data(self):
"""
img_ids = self._coco.getImgIds()
images = [
self._image_dir / self._coco.loadImgs(id)[0]["file_name"] for id in img_ids
self._image_dir / self._coco.loadImgs(id)[0]["file_name"]
for id in img_ids
]
annotations = [
self._coco.loadAnns(self._coco.getAnnIds(imgIds=id)) for id in img_ids
self._coco.loadAnns(self._coco.getAnnIds(imgIds=id))
for id in img_ids
]

return images, annotations
@@ -76,16 +78,19 @@ def process_data(self) -> TLVRawImage:
for id in img_ids:
images.append(
cv2.imread(
str(self._image_dir / self._coco.loadImgs(id)[0]["file_name"])
str(
self._image_dir
/ self._coco.loadImgs(id)[0]["file_name"]
)
)[:, :, ::-1]
) # Read it as RGB
annotation = self._coco.loadAnns(self._coco.getAnnIds(imgIds=id))
labels_per_image = []
for i in range(len(annotation)):
labels_per_image.append(
self._coco.cats[annotation[i]["category_id"]]["name"].replace(
" ", "_"
)
self._coco.cats[annotation[i]["category_id"]][
"name"
].replace(" ", "_")
)
unique_labels = list(set(labels_per_image))
if len(unique_labels) == 0:
@@ -133,10 +138,10 @@ def map_data(self, **kwargs) -> any:

# # Example usage:
# coco_loader = COCOImageLoader(
# coco_dir_path="/opt/Neuro-Symbolic-Video-Frame-Search/artifacts/data/benchmark_image_dataset/coco",
# annotation_file="annotations/instances_val2017.json",
# image_dir="val2017",
# coco_root_dir_path="/store/datasets/COCO/2017",
# coco_image_source="val",
# )
# breakpoint()

# # Display a sample image
# coco_loader.display_sample_image(0)
