- CUDA
- cuDNN
- Python 3.10
- PyTorch 2.0.0
- NNCore
- ViT-PyTorch
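Before setting up the project, it may help to confirm that the GPU stack is visible. The commands below are standard NVIDIA tools, not part of this repository:

```shell
# Print the driver version and visible GPUs (standard NVIDIA tool)
nvidia-smi

# Print the installed CUDA toolkit version (standard NVIDIA tool)
nvcc --version
```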
- Clone the repository from GitHub.

  ```shell
  git clone https://github.com/Skyline-9/Visionary-Vids.git
  cd Visionary-Vids
  ```
- Install dependencies using one of the following options.

  Using the shell script (conda required):

  ```shell
  sh environment/init_conda.sh
  ```

  Using conda:

  ```shell
  conda env create -f environment/environment.yml
  conda activate VisionaryVids
  ```

  Using pip:

  ```shell
  pip install -r environment/requirements.txt
  ```
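Whichever option you use, a quick import check can confirm the environment works. This is a generic sanity check (assuming NNCore installs under the `nncore` module name), not a script shipped with the repository:

```shell
# Confirm PyTorch and NNCore import, and that CUDA is visible to PyTorch
python -c "import torch, nncore; print(torch.__version__, torch.cuda.is_available())"
```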
- Set up automatic code styling.

  ```shell
  pre-commit install
  ```
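The installed hooks run on each commit; to apply the styling to the existing tree once up front, pre-commit can also be invoked manually:

```shell
# Run every configured hook against all files once
pre-commit run --all-files
```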
- Download and extract the datasets (QVHighlights, Charades-STA, YouTube Highlights, TVSum).
- Prepare the files in the following structure (a sketch for creating the directories appears after the tree).
  ```
  Visionary-Vids
  ├── environment
  ├── configs
  ├── datasets
  ├── models
  ├── data
  │   ├── qvhighlights
  │   │   ├── *features
  │   │   ├── highlight_{train,val,test}_release.jsonl
  │   │   └── subs_train.jsonl
  │   ├── charades
  │   │   ├── *features
  │   │   └── charades_sta_{train,test}.txt
  │   ├── youtube
  │   │   ├── *features
  │   │   └── youtube_anno.json
  │   └── tvsum
  │       ├── *features
  │       └── tvsum_anno.json
  ├── README.md
  ├── setup.cfg
  ├── launch.py
  └── ···
  ```
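As referenced above, a minimal sketch for creating the dataset directories (assuming a bash-compatible shell; the `*features` entries are placeholders for the extracted feature folders, so they are not created here):

```shell
# Create the dataset directories expected by the structure above
mkdir -p data/{qvhighlights,charades,youtube,tvsum}
```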
Run one of the following commands to train a model using a specified config.
```shell
# Single GPU
python launch.py ${path-to-config}

# Multiple GPUs
torchrun --nproc_per_node=${num-gpus} launch.py ${path-to-config}

# Train from checkpoint
python launch.py ${path-to-config} --checkpoint ${path-to-checkpoint}
```
Run the following command to test a model and evaluate results.
```shell
python launch.py ${path-to-config} --checkpoint ${path-to-checkpoint} --eval
```
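For concreteness, a hypothetical end-to-end invocation might look like the following; the config and checkpoint paths are illustrative placeholders, not files guaranteed to exist in the repository:

```shell
# Hypothetical config/checkpoint paths, for illustration only
python launch.py configs/qvhighlights.py
python launch.py configs/qvhighlights.py --checkpoint work_dirs/qvhighlights/epoch_5.pth --eval
```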