Pinned
-
TRIAD
Tri-modal dense feature grounding between text, image, and audio, for multimodal prompted localization of features and embedding in a shared space.
Python
-
AV-Align
Unsupervised audio-visual feature alignment: repository for embedding audio and visual features into the same latent space for downstream tasks.
Python
-
LetMeSee
Learned content-aware patch truncation for ViTs, improving compute efficiency through soft masking during training.
Python
-
ICPR-RIP
[ICPR 2024 Competition on Rider Intention Prediction] [Top Submission] State-space-model-based sequence modelling of rider's-view videos for intent prediction tasks.
Jupyter Notebook