This tool extracts video features using an action recognition model from GluonCV, audio features using panns_inference, masked autoencoder features using VideoMAE, ASR features using Whisper for transcription and BERT for tokenization, and CLIP features using CLIP. These features will be used to train an audio descriptive captioning model.
The following step requires Conda to be installed. Run the following command to create the Conda environment with all dependencies:
conda env create -f ENV.yml
Run the command below to start the extractor, with --videos set to the directory containing the videos and --output set to the directory where the extracted features should be stored:
python full_extraction.py --videos=video_path --output=output_path
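For processing several dataset splits in one go, a small wrapper script can be handy. The sketch below is a hypothetical example, not part of this repo: the ./data/<split> and ./features/<split> directory layout is an assumption, and the actual extraction call is left commented out so the loop can be tried safely first.

```shell
#!/usr/bin/env sh
# Hypothetical batch wrapper (assumed layout: videos under ./data/<split>).
for split in train val test; do
  videos="./data/$split"
  out="./features/$split"
  # Create the output directory for this split's features.
  mkdir -p "$out"
  # Uncomment to run the real extraction for each split:
  # python full_extraction.py --videos="$videos" --output="$out"
  echo "would extract $videos -> $out"
done
```

Adjust the split names and directory layout to match your dataset before uncommenting the extraction line.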