
Scripts for creating spectrograms as well as detecting and classifying bird sounds in audio signals.


We closely follow the work of Stefan Kahl et al. You can find installation instructions on that page; the setup should work like this:

git clone <this-repository>
cd Birdsong_Classification
sudo pip install -r requirements.txt
sudo apt-get install python-opencv
sudo pip install Theano==1.0.4
sudo pip install

Some remarks about configuring Theano: The command

python -c 'import theano; print(theano.config)' | less

shows your current configuration. To change it, create a file .theanorc in your home directory ($HOME) and type


in it (to use gpu0). Check out the Theano documentation for more information. You can also override the configuration on the fly by setting THEANO_FLAGS when you run the script, e.g.:

THEANO_FLAGS='device=cuda1' python
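For the .theanorc file mentioned above, the following is a minimal sketch of how such a file could be generated. It assumes the usual Theano layout with a [global] section holding device and floatX, and it assumes the device name cuda0 (matching the cuda1 naming in the command above; older Theano backends used names like gpu0). The helper write_theanorc is hypothetical and not part of this repository, and the sketch is written for Python 3.

```python
import configparser
from pathlib import Path


def write_theanorc(path, device="cuda0", floatx="float32"):
    """Write a small Theano config file to `path` and return the path.

    Hypothetical helper for illustration; the default values are assumptions.
    """
    config = configparser.ConfigParser()
    # Keep option names case-sensitive so "floatX" is not lowercased.
    config.optionxform = str
    config["global"] = {"device": device, "floatX": floatx}
    path = Path(path)
    with open(path, "w") as handle:
        config.write(handle)
    return path
```

Calling write_theanorc(Path.home() / ".theanorc") would create the file in your home directory. Writing the three lines by hand in a text editor works just as well.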


Your .wav-files should be in a folder called dataset/train/src/ with a separate subfolder for each species.
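To check the folder layout described above before running anything expensive, a small sketch like the following may help. It only assumes one subfolder per species containing .wav files; the function name count_wavs_per_species is hypothetical and not part of this repository, and the sketch is written for Python 3.

```python
from pathlib import Path


def count_wavs_per_species(src_dir):
    """Return {species_folder_name: number_of_wav_files} for `src_dir`."""
    counts = {}
    species_dirs = sorted(p for p in Path(src_dir).iterdir() if p.is_dir())
    for species_dir in species_dirs:
        # Count only regular files with a .wav extension (case-insensitive).
        counts[species_dir.name] = sum(
            1
            for f in species_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".wav"
        )
    return counts
```

For example, count_wavs_per_species("dataset/train/src") returns a dictionary you can inspect for empty or unexpectedly small species folders.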


  • python Moves ten percent of the train data to a dataset for testing in dataset/test/.
  • python Creates spectrograms with the original method. Don't worry about the warnings.
  • python Creates spectrograms with our method. Check your available resources.
  • python Trains a neural net on the spectrograms created with our method. See below for further information.
  • python Trains a neural net using the original parameters. Important: Check the configurations in the script. Remarks:
    • You can download noise samples here and save them in the specified folder.
    • If you are merely testing your code, you should use a smaller subset of your dataset. In this case modify the values for MAX_CLASSES and MAX_SAMPLES_PER_CLASS.
    • MODEL_TYPE = 3 could be sufficient for datasets with fewer classes.
    • The current version does not display the confusion matrix because of errors. However, we "only" need this matrix to analyse the results.
    • The pretrained model uses MODEL_TYPE = 1 and is trained on 1500 classes.
    • To document our results, we should save the output of the script by calling python | tee experiments/sensible_filename.txt.
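The first step in the list, moving ten percent of the training data to a test set, can be sketched as follows. This is not the repository's script, only an illustration under assumptions: the split is done per species, the selection is random with a fixed seed, and the target directory mirrors the species subfolders. The function name move_test_split and its parameters are hypothetical, and the sketch is written for Python 3.

```python
import random
import shutil
from pathlib import Path


def move_test_split(train_src, test_dir, fraction=0.1, seed=0):
    """Move `fraction` of each species' .wav files from train_src to test_dir.

    Returns the list of destination paths of the moved files.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    moved = []
    species_dirs = sorted(p for p in Path(train_src).iterdir() if p.is_dir())
    for species_dir in species_dirs:
        wavs = sorted(
            f for f in species_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".wav"
        )
        n_test = int(len(wavs) * fraction)  # rounds down
        target = Path(test_dir) / species_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for wav in rng.sample(wavs, n_test):
            destination = target / wav.name
            shutil.move(str(wav), str(destination))
            moved.append(destination)
    return moved
```

Because the count is rounded down, species with fewer than ten recordings contribute no test files in this sketch; whether that is acceptable depends on your dataset.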

Testing and evaluation

In progress.