- Voice activity detection
This part contains several implementations of VAD algorithms: zero-crossing rate, energy, and magnitude, all written with the Theano library. The algorithms follow this article.
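As a rough illustration of the energy and zero-crossing-rate ideas, here is a minimal sketch in plain Python rather than Theano. The function names, frame sizes, thresholds, and the decision rule (loud frame with a low zero-crossing rate) are all assumptions made for the example, not the project's actual code.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_voice(samples, frame_len=400, hop=160,
                 energy_threshold=0.01, zcr_threshold=0.25):
    """Return one boolean per frame: True where the frame looks like speech.

    Hypothetical rule: the frame is loud enough and its zero-crossing
    rate is low. Both thresholds are illustrative and need tuning.
    """
    flags = []
    for frame in frame_signal(samples, frame_len, hop):
        is_loud = short_time_energy(frame) > energy_threshold
        is_low_zcr = zero_crossing_rate(frame) < zcr_threshold
        flags.append(is_loud and is_low_zcr)
    return flags


# Synthetic input: 0.1 s of silence, then 0.1 s of a 200 Hz tone at 16 kHz.
sample_rate = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]
flags = detect_voice(silence + tone)
```

With these defaults a frame covers 25 ms and the hop is 10 ms at 16 kHz. Real recordings usually need thresholds derived from the noise floor rather than fixed constants.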
- Vectorization
Voice vectorization using Lasagne. Voice vectors can then be used for interesting tasks, such as predicting gender or accent, or comparing voices.
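One common way to compare two voice vectors is cosine similarity. The sketch below assumes that measure and a hypothetical decision threshold; the source does not say which metric the project uses, and the toy vectors are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def same_speaker(u, v, threshold=0.7):
    """Hypothetical rule: treat two voices as the same if similarity reaches the threshold."""
    return cosine_similarity(u, v) >= threshold


# Toy 3-dimensional "voice vectors"; real embeddings would be much longer.
voice_a = [1.0, 0.0, 0.0]
voice_b = [0.0, 1.0, 0.0]
```

The threshold would normally be chosen on a validation set of labelled voice pairs.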
- Service
We want to build a service with an API that predicts the voice vector.
AUC score on our dataset for the 'same or not same voice' task: 0.8923. The dataset includes 51k voices with meta-information about the authors.
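For readers unfamiliar with the metric, the sketch below shows how an AUC can be computed for a 'same or not same' task from pair labels and similarity scores. It uses the pairwise definition (the probability that a same-voice pair scores higher than a different-voice pair); the function name and the toy data are assumptions, not the project's evaluation code or results.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives; ties count as half.

    labels: 1 for a same-voice pair, 0 for a different-voice pair.
    scores: similarity score for each pair (higher means more similar).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: four pairs, two same-voice and two different-voice.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 comparisons won -> 0.75
```

This quadratic loop is only practical for small samples. On a dataset of tens of thousands of voices, a rank-based implementation such as `sklearn.metrics.roc_auc_score` is the usual choice.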
- Voice authorization
- Personalization of voice services
- Clustering audio files