
Audio Feature Extraction And Acoustic Scene Recognition Research

Posted on: 2024-02-12 | Degree: Master | Type: Thesis
Country: China | Candidate: W S Wang | Full Text: PDF
GTID: 2568307178971319 | Subject: Information and Communication Engineering
Abstract/Summary:
Audio features are important indicators of the properties of an audio signal. A high-quality audio feature captures the key information in audio data and can play a crucial role in audio recognition tasks. Audio data is now ubiquitous on the Internet, and several important application areas urgently need better audio processing. Whereas earlier work analyzed audio signals in the time and frequency domains, this research focuses on extracting features from the cepstral domain and combines them with frequency-domain features to obtain a multidimensional feature vector that captures the characteristics of the audio. Based on this feature vector, a new network architecture for audio classification is constructed, which ultimately improves audio recognition accuracy under multi-scene conditions.

This thesis carries out a detailed comparative study of several spectrogram features. Borrowing techniques from computer vision, the five extracted features are combined by early fusion using a merge criterion, producing a cascaded mixed-type fusion feature that yields better classification performance. In the experiments, the Mel spectrogram, chroma, spectral contrast, and tonal centroid features are concatenated with the Mel-frequency cepstral coefficients (MFCCs), and the resulting fusion feature is then reduced in dimension using Principal Component Analysis (PCA) to make it easier for the classification network to learn. For the recognition models, two novel fusion methods are designed based on the model fusion criterion: cascaded (serial) and parallel DNN-Tree Model fusion.

Finally, the designed feature extraction scheme and audio recognition algorithm are verified experimentally. To demonstrate the benefit of feature fusion, audio features are extracted from the datasets with a progressively increasing number of fused features; recognition performance improves as the number of fused features increases. To verify the audio classification algorithm, the following recognition models are evaluated: a DNN (Deep Neural Network), three ensemble algorithms based on gradient boosting decision trees, and the proposed serial and parallel DNN-Tree Models. All schemes are tested on two datasets. On the UrbanSound8K dataset, the three parallel DNN-Tree Model fusion models improve accuracy by 3.44%, 4.49%, and 1.05%, respectively, over the three original gradient boosting tree models. On the Alibaba Tianchi food voice dataset, the corresponding improvements are 4.91%, 4.97%, and 1.2%.
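
The early-fusion step described in the abstract (concatenating the five feature types, then reducing the result with PCA) can be illustrated with a minimal sketch. This is not the thesis's implementation: the per-feature dimensions, the number of clips, the number of retained components, and the helper names `early_fusion` and `pca_reduce` are all assumptions chosen for illustration, and random arrays stand in for features that would in practice be extracted from audio (for example with a library such as librosa).

```python
import numpy as np

# Hypothetical per-clip feature sizes (time-averaged); not taken from the thesis.
FEATURE_DIMS = {
    "mel_spectrogram": 128,
    "chroma": 12,
    "spectral_contrast": 7,
    "tonal_centroid": 6,
    "mfcc": 40,
}


def early_fusion(blocks):
    """Concatenate per-clip feature blocks along the feature axis."""
    return np.concatenate(list(blocks.values()), axis=1)


def pca_reduce(x, n_components):
    """Project the rows of x onto the top principal components (via SVD)."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
n_clips = 200  # assumed dataset size for the demo

# Random stand-ins for real extracted features, one row per audio clip.
blocks = {
    name: rng.normal(size=(n_clips, dim)) for name, dim in FEATURE_DIMS.items()
}

fused = early_fusion(blocks)                  # shape (200, 193)
reduced = pca_reduce(fused, n_components=64)  # shape (200, 64)
print(fused.shape, reduced.shape)
```

In practice the feature blocks have very different scales, so standardizing each column before PCA is a common choice; whether the thesis does so is not stated in the abstract.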
Keywords/Search Tags: Audio feature extraction, Feature fusion, Gradient boosting decision tree, Parallel DNN-Tree Model, Model fusion