music-genre-recognition

Introduction

Music genre recognition algorithm using a convolutional neural network (CNN).

The algorithm splits each input song into 3-second slices. The mel spectrogram of every slice is then computed, which gives the input images for the CNN. The network uses 1D convolutions.
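The slicing and spectrogram steps described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the helper names, the 22050 Hz sample rate, `n_mels=128`, the choice to drop the trailing partial slice, and the use of librosa are all assumptions that the README does not confirm.

```python
import numpy as np

SLICE_SECONDS = 3  # slice length stated in the README


def split_into_slices(signal, sample_rate, slice_seconds=SLICE_SECONDS):
    """Cut a 1-D waveform into equal, non-overlapping slices.

    Trailing samples that do not fill a whole slice are dropped
    (an assumption; the project may pad or overlap instead).
    """
    samples_per_slice = int(slice_seconds * sample_rate)
    n_slices = len(signal) // samples_per_slice
    trimmed = signal[: n_slices * samples_per_slice]
    return trimmed.reshape(n_slices, samples_per_slice)


def slice_to_melspectrogram(audio_slice, sample_rate, n_mels=128):
    """Return the log-scaled mel spectrogram of one slice.

    librosa is imported lazily so the slicing helper works without it.
    """
    import librosa  # assumed dependency, not confirmed by the README

    mel = librosa.feature.melspectrogram(y=audio_slice, sr=sample_rate, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)


# A 10 s synthetic tone sampled at 22050 Hz contains three full 3 s slices.
sample_rate = 22050
t = np.arange(10 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)
slices = split_into_slices(tone, sample_rate)
```

Each row of `slices` could then be passed to `slice_to_melspectrogram` to obtain one input image per slice.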

Accuracy is around 80% at the moment.

Dataset

Download the GTZAN dataset here and extract it into a folder named musics.

Alternatively, you can download the pre-formatted data for the CNN and the trained model here, and simply extract the files.

Requirements

The following modules can be installed via pip or using an IDE like PyCharm:

How to run

Train

To train the model you can run the following command:

python main.py -m train

This will load the audio files, format them and train the model.

Note that training can take a long time, especially with TensorFlow running on the CPU, and that it requires a fair amount of RAM.

You can also add the following parameters to the command:

  • --no-save-data Once the data is loaded and formatted, it is saved as a .npy file so that you don't need to do that part again. With this flag, the data won't be saved.
  • --no-save-model The trained model is saved in a .h5 file. With this flag, the model is not saved.
  • --load-data If you already have .npy files with the formatted data, this flag will load them instead of loading data from audio files. Data won't be saved again afterwards.
  • --debug Enable debug mode (shows more information).
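The interplay between the data-related flags can be pictured with the small caching sketch below. It is a hypothetical illustration only: `load_or_build`, `fake_formatting` and the file name `formatted_data.npy` are invented for the example and are not taken from the project.

```python
import os
import tempfile

import numpy as np


def load_or_build(path, build_fn, load_data=False, save_data=True):
    """Return formatted data, reusing a .npy cache when asked to.

    load_data=True mirrors --load-data; save_data=False mirrors --no-save-data.
    """
    if load_data:
        # Reuse the cached array; nothing is written back afterwards.
        return np.load(path)
    data = build_fn()
    if save_data:
        np.save(path, data)
    return data


def fake_formatting():
    # Stand-in for the slow audio loading and formatting step.
    return np.arange(12, dtype=np.float32).reshape(3, 4)


# Hypothetical cache location inside a temporary directory.
cache_path = os.path.join(tempfile.mkdtemp(), "formatted_data.npy")
first = load_or_build(cache_path, fake_formatting)                   # builds and saves
second = load_or_build(cache_path, fake_formatting, load_data=True)  # reads the cache
```

The first call pays the formatting cost and writes the cache; the second call skips formatting entirely, which is the time saving that loading existing .npy files is meant to provide.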

Test

You will need a trained model to test.

Once you have your model and your audio file, put them in the root folder.

To test the model against your own audio file, run the following command:

python main.py -m test -song your_song.mp3

This will load the file, process it and run it through the model. To test multiple songs at once, put them all in a folder and run:

python main.py -m test -folder your_folder

All the files within the folder must be audio files, and this folder must be in the root folder.

You can add the following parameter:

  • --debug Enable debug mode (shows more information).
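Because each song is cut into 3-second slices, the model presumably produces one prediction per slice. The README does not say how these are combined into a single answer, so the sketch below simply assumes that the per-slice class probabilities are averaged. The function name and the order of the genre labels are hypothetical, and the probabilities are made up for the example.

```python
import numpy as np

# The ten GTZAN genre labels; the order used by the project's model is an assumption.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def predict_song_genre(slice_probabilities, genres=GENRES):
    """Average per-slice class probabilities and return the top genre."""
    probs = np.asarray(slice_probabilities, dtype=float)
    mean_probs = probs.mean(axis=0)
    best = int(np.argmax(mean_probs))
    return genres[best], float(mean_probs[best])


# Made-up probabilities for three slices of one song.
example = np.full((3, 10), 0.05)
example[:, 5] = 0.55  # every slice leans towards "jazz"
genre, confidence = predict_song_genre(example)
```

Majority voting over the per-slice labels would be an equally plausible alternative; averaging is used here only because it keeps a confidence value for the whole song.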

Results

The model reaches an accuracy of 82%. It works best on older songs (released before 2000), because the dataset used to train it was created around that time. Music has evolved since then, so the model makes more mistakes on modern songs.