EAT

END-TO-END AUDIO STRIKES BACK: BOOSTING AUGMENTATIONS TOWARDS AN EFFICIENT AUDIO CLASSIFICATION NETWORK

Abstract

While efficient architectures and a plethora of augmentations for end-to-end image classification tasks have been suggested and heavily investigated, state-of-the-art techniques for audio classification still rely on numerous representations of the audio signal together with large architectures, fine-tuned from large datasets. By utilizing the inherent lightweight nature of audio and novel audio augmentations, we were able to present an efficient end-to-end (e2e) network with strong generalization ability. Experiments on a variety of sound classification sets demonstrate the effectiveness and robustness of our approach, achieving state-of-the-art results in various settings. Public code is available at: https://github.com/Alibaba-MIIL/AudioClassification.

Results and Models

speechcommand(35)

| Model | Scale | LR schedule | Params (K) | FLOPs (M) | Top-1 (%) | config | pth | onnx | ncnn |
|---|---|---|---|---|---|---|---|---|---|
| ETA-tiny | 8192 | Poly | 23.95 | 0.53 | 86.2 | config | model | onnx | ncnn |

speechcommand(4)

| Model | Scale | LR schedule | Params (K) | FLOPs (M) | Top-1 (%) | config | pth | onnx | ncnn |
|---|---|---|---|---|---|---|---|---|---|
| ETA-tiny | 8192 | Poly | 22.81 | 0.53 | 97 | config | model | onnx | ncnn |
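
The Scale column lists 8192 for both models. Assuming this value is the number of raw audio samples the network expects per clip (an interpretation not stated in this README), a waveform would need to be cropped or padded to that length before inference. The following is a minimal sketch of that preprocessing step; the function name `fit_length` and the center-crop and zero-pad strategy are illustrative assumptions, not the repository's actual pipeline.

```python
from typing import List, Sequence


def fit_length(samples: Sequence[float], target: int = 8192) -> List[float]:
    """Center-crop or zero-pad a mono waveform to exactly `target` samples.

    Hypothetical helper: `target` defaults to 8192 on the assumption that
    the "Scale" column is the input length in samples.
    """
    n = len(samples)
    if n >= target:
        # Too long (or exact): keep the middle `target` samples.
        start = (n - target) // 2
        return list(samples[start:start + target])
    # Too short: pad with silence, putting any odd extra sample on the right.
    pad_total = target - n
    left = pad_total // 2
    right = pad_total - left
    return [0.0] * left + list(samples) + [0.0] * right


if __name__ == "__main__":
    one_second_16k = [0.0] * 16000  # placeholder clip, not real audio
    print(len(fit_length(one_second_16k)))
```

Center-cropping is only one option; a training pipeline might instead pick a random crop offset as a simple augmentation.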

Citation

@article{Gazneli2022EAT,
   title={End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network},
   author={Avi Gazneli and Gadi Zimerman and Tal Ridnik and Gilad Sharir and Asaf Noy},
   journal={arXiv preprint arXiv:2204.11479},
   year={2022},
}