Starting Date: 06/2023
Prerequisites: Machine learning, deep learning, MATLAB or Python programming skills
Will results be assigned to University: No
Automatic sound classification is attracting increasing research attention owing to its wide range of applications, such as robot navigation, environmental sensing, musical instrument classification, medical diagnosis, and surveillance. Sound classification involves extracting acoustic characteristics from audio signals and subsequently identifying the different sound classes. Its deployments span several disciplines, including speech recognition, musical instrument identification, environmental sound classification, and abnormal sound identification for disease diagnosis. In comparison with speech and music, which possess well-defined high-level structures, diagnostic sounds (such as respiratory and heart sounds) and environmental audio signals tend to be unstructured and contain varied clinical and natural acoustic noise.
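As an illustration of the feature-extraction step described above, the sketch below computes a log-power spectrogram (a common time-frequency representation used as input to sound classifiers) from a raw waveform using only NumPy. The frame length, hop size, and the synthetic test tone are illustrative choices, not values from this project.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Slice a 1-D waveform into overlapping frames of shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_power_spectrogram(x, frame_len=400, hop=160, eps=1e-10):
    """Hann-windowed STFT magnitude -> log power, shape (n_frames, n_bins)."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(spec + eps)

# Example: one second of a synthetic 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)
feats = log_power_spectrogram(wave)
print(feats.shape)  # (98, 201): 98 time frames x 201 frequency bins
```

In practice, a mel filterbank is often applied on top of this representation (e.g., via librosa) before feeding a neural network, but the log-power spectrogram already captures the kind of time-frequency structure such classifiers learn from.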
It is thus important to generate effective audio representations that capture discriminative characteristics and environmental acoustic cues to inform classification. In this respect, deep neural networks have demonstrated superior performance on signal processing tasks owing to their strong feature learning capabilities. Moreover, the hybridization of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) has gained increasing popularity because of its strong capability for spatial-temporal feature extraction: convolutional layers capture local spectral patterns, while recurrent layers model how those patterns evolve over time.
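To make the CNN-RNN hybridization concrete, the following is a minimal NumPy sketch of a single forward pass: a 1-D convolution over the time axis of a spectrogram extracts local spectral patterns, an Elman-style recurrence summarizes their temporal evolution, and a linear layer produces class scores. All dimensions (frames, bins, filters, hidden units, classes) and the random inputs are hypothetical placeholders, not the project's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_relu(x, w, b):
    """Valid 1-D convolution over time, then ReLU.
    x: (T, F_in), w: (k, F_in, F_out), b: (F_out,)."""
    k = w.shape[0]
    T_out = x.shape[0] - k + 1
    out = np.stack([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) + b
                    for t in range(T_out)])
    return np.maximum(out, 0.0)

def simple_rnn(x, wx, wh, bh):
    """Elman-style recurrence over time; returns the final hidden state."""
    h = np.zeros(wh.shape[0])
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ wx + h @ wh + bh)
    return h

# Hypothetical sizes: 98 spectrogram frames, 201 frequency bins,
# 32 conv filters (kernel size 5), 16 recurrent units, 10 sound classes.
T, F, C, H, n_classes = 98, 201, 32, 16, 10
spec = rng.standard_normal((T, F))                     # stand-in spectrogram
w_conv = 0.01 * rng.standard_normal((5, F, C)); b_conv = np.zeros(C)
w_x = 0.1 * rng.standard_normal((C, H)); w_h = 0.1 * rng.standard_normal((H, H))
b_h = np.zeros(H)
w_out = 0.1 * rng.standard_normal((H, n_classes)); b_out = np.zeros(n_classes)

feat = conv1d_relu(spec, w_conv, b_conv)   # CNN stage: local spectral patterns
h = simple_rnn(feat, w_x, w_h, b_h)        # RNN stage: temporal context
logits = h @ w_out + b_out                 # class scores
print(logits.shape)  # (10,)
```

A practical implementation would use a deep-learning framework (e.g., stacked 2-D convolutions followed by a GRU or LSTM in PyTorch or Keras) with trained rather than random weights, but the data flow — spectrogram in, convolutional features, recurrent summary, class scores out — is the same.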
In this study, we will explore diverse hybrid CNN-RNN networks for sound classification, in view of their enhanced audio representation capabilities. The proposed networks will be evaluated on diverse environmental sound datasets.