BioAcoustic Collection Pipeline
This repository aims to streamline the generation and testing of embeddings using a large variety of bioacoustic models.
Create a virtual environment using python3.11 or python3.10 and virtualenv
python3.11 -m virtualenv env_bacpipe
Activate the environment:
source env_bacpipe/bin/activate
- for fairseq to install you will need the Python headers: sudo apt-get install python3.11-dev
- pip version 24.0 (pip install pip==24.0). omegaconf 2.0.6 has a non-standard dependency specifier, PyYAML>=5.1.*. pip 24.1 and later reject such specifiers, so installation will fail with a newer pip.
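To check the active pip before installing, a minimal sketch follows. It assumes the only relevant cutoff is the 24.1 release named above. The helper name `pip_accepts_legacy_specifiers` is hypothetical and not part of this repository.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def pip_accepts_legacy_specifiers(pip_version: str) -> bool:
    """Return True if pip_version is older than 24.1 (hypothetical helper)."""
    numbers = []
    # Only the leading digits of the major and minor components matter here.
    for piece in pip_version.split(".")[:2]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers) < (24, 1)


if __name__ == "__main__":
    try:
        installed = version("pip")
    except PackageNotFoundError:
        print("pip is not installed in this environment")
    else:
        if pip_accepts_legacy_specifiers(installed):
            print(f"pip {installed} should accept the omegaconf specifier")
        else:
            print(f"pip {installed} is too new; run: pip install pip==24.0")
```

Run it with the environment activated so that it inspects the pip inside env_bacpipe rather than a system-wide one.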
Install the requirements:
pip install -r requirements.txt
On Windows, use the Windows requirements file:
pip install -r requirements_windows.txt
If you do not have admin rights and encounter a permission denied error when using pip install, use python -m pip install ... instead.
Run the tests to confirm that the installation works. Doing so also ensures that the directory structure for the model checkpoints is created:
pytest -v --disable-warnings test_embedding_creation.py
Again, on Windows with restricted permissions, use python -m pytest -v --disable-warnings test_embedding_creation.py instead.
Download the model checkpoints that are available from here, create directories named after the corresponding pipeline names, and place the checkpoints in them.
Modify the config.yaml file in the root directory to specify the path to your dataset. Define which models to run by listing their names as strings in the embedding_model list (copy and paste as needed). If you want to run a dimensionality reduction model (currently only UMAP is supported), specify its name in the dim_reduction_model variable.
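Because the strings in embedding_model select the models to run, a quick check before starting can catch typos. The sketch below is not code from this repository. It represents the configuration as a plain dict holding the two keys named above, the function name `unknown_models` is hypothetical, and the case-insensitive comparison is an assumption. The supported names are copied from the model table in this README.

```python
# Model names copied from the table in this README, lowercased for comparison.
SUPPORTED_MODELS = {
    name.lower()
    for name in [
        "Animal2vec_XC", "Animal2vec_MK", "AudioMAE", "AVES_ESpecies",
        "BioLingual", "BirdAVES_ESpecies", "BirdNET", "AvesEcho_PASST",
        "HumpbackNET", "Insect66NET", "Mix2", "Perch_Bird", "ProtoCLR",
        "RCL_FS_BSED", "SurfPerch", "Google_Whale", "VGGish",
    ]
}


def unknown_models(config: dict) -> list[str]:
    """Return requested model names that do not appear in the table.

    Hypothetical helper; matching ignores case, which is an assumption.
    """
    requested = config.get("embedding_model", [])
    return [name for name in requested if name.lower() not in SUPPORTED_MODELS]


if __name__ == "__main__":
    # Stand-in for the parsed config.yaml; the exact value spelling expected
    # for dim_reduction_model is an assumption.
    example_config = {
        "embedding_model": ["BirdNET", "Perch_Bird", "Birdnett"],
        "dim_reduction_model": "umap",
    }
    print(unknown_models(example_config))  # ['Birdnett']
```

In practice the dict would come from parsing config.yaml with a YAML library, and a non-empty result would be a reason to stop before running the pipeline.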
Once the configuration is complete, execute the run_pipeline.py file (make sure the environment is activated):
python run_pipeline.py
While the pipeline runs, directories are created in the bacpipe.evaluation directory. Embeddings are saved in bacpipe.evaluation.embeddings and, if a dimensionality reduction model is selected, the reduced embeddings are saved in bacpipe.evaluation.dim_reduced_embeddings.
Please raise an issue if you have questions or find bugs. Also, please cite the authors of the respective models; all models are referenced in the table below.
Models currently include:
Name | ref paper | ref code | sampling rate | input length | embedding dimension |
---|---|---|---|---|---|
Animal2vec_XC | paper | code | 24 kHz | 5 s | 768 |
Animal2vec_MK | paper | code | 8 kHz | 10 s | 1024 |
AudioMAE | paper | code | 16 kHz | 10 s | 768 |
AVES_ESpecies | paper | code | 16 kHz | 1 s | 768 |
BioLingual | paper | code | 48 kHz | 10 s | 512 |
BirdAVES_ESpecies | paper | code | 16 kHz | 1 s | 1024 |
BirdNET | paper | code | 48 kHz | 3 s | 1024 |
AvesEcho_PASST | paper | code | 32 kHz | 3 s | 768 |
HumpbackNET | paper | code | 2 kHz | 3.9124 s | 2048 |
Insect66NET | paper | code | 44.1 kHz | 5.5 s | 1280 |
Mix2 | paper | code | 16 kHz | 3 s | 960 |
Perch_Bird | paper | code | 32 kHz | 5 s | 1280 |
ProtoCLR | paper | code | 16 kHz | 6 s | 384 |
RCL_FS_BSED | paper | code | 22.05 kHz | 0.2 s | 2048 |
SurfPerch | paper | code | 32 kHz | 5 s | 1280 |
Google_Whale | paper | code | 24 kHz | 5 s | 1280 |
VGGish | paper | code | 16 kHz | 0.96 s | 128 |
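
The sampling rate and input length together determine how many audio samples each model consumes per embedding. As a worked illustration, the sketch below multiplies the two for a few rows of the table and estimates how many windows a recording yields. The non-overlapping windowing that drops the remainder is an assumption made for illustration, and the pipeline may pad or segment audio differently. The function names are hypothetical.

```python
# (sampling rate in Hz, input length in seconds), copied from the table above.
MODEL_SPECS = {
    "BirdNET": (48_000, 3.0),
    "Perch_Bird": (32_000, 5.0),
    "VGGish": (16_000, 0.96),
    "RCL_FS_BSED": (22_050, 0.2),
}


def samples_per_window(model: str) -> int:
    """Number of audio samples in one model input window."""
    sample_rate, seconds = MODEL_SPECS[model]
    # round() guards against floating-point noise such as 4410.000000000001.
    return round(sample_rate * seconds)


def full_windows(model: str, clip_seconds: float) -> int:
    """Complete, non-overlapping windows in a clip (assumed windowing)."""
    sample_rate, _ = MODEL_SPECS[model]
    clip_samples = round(clip_seconds * sample_rate)
    return clip_samples // samples_per_window(model)


if __name__ == "__main__":
    print(samples_per_window("BirdNET"))  # 144000
    print(full_windows("BirdNET", 60))    # 20
    print(full_windows("VGGish", 60))     # 62
```

Under this assumption, a 60 s recording resampled for BirdNET yields 20 embeddings of dimension 1024, while the same recording yields 62 VGGish embeddings of dimension 128.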