Algorithm Research Based On Neural Network For Audio Scene Recognition

Posted on:2020-12-13

Degree:Master

Type:Thesis

Country:China

Candidate:G J Liu

Full Text:PDF

GTID:2518306131964239

Subject:Electronics and Communications Engineering

Abstract/Summary:

Audio scene recognition is a relatively new research field. It aims to categorize the scene in which a recording was made from its background sound. Intelligent devices can use the background information extracted from the current audio to adjust system or application parameters to meet users' personalized needs. Audio scenes usually show a high degree of variability, which exists not only between different scenes but also within the same scene. As a result, audio scene recognition is arguably one of the most challenging tasks in statistical pattern recognition at present. Compared with more established fields of audio processing, such as speech recognition, its accuracy still lags far behind.

To address the low classification accuracy of current audio scene recognition, this thesis proposes a neural-network-based audio scene recognition method that covers audio processing, signal representation, feature extraction, and classifier design. The goal is an effective and feasible audio scene recognition system, which is evaluated on suitable audio data sets in a laboratory environment. The detailed work is as follows:

(1) Signal processing module. Three data augmentation methods are used. The center (mid) and side channels are separated from the binaural stereo audio. The harmonic and impulsive (percussive) components are separated from the mono audio. Background subtraction with median filters of different sizes is applied to the generated spectrograms. The resulting data is used to train the classifier models.

(2) Feature extraction module. Mel-frequency cepstral coefficients (MFCCs) are adopted. The frame length, frame shift, and number of filters are chosen to preserve feature quality while greatly reducing the feature dimension and the computational complexity.

(3) Classification system. After a review of the principles and methods of neural network classifiers, the convolutional neural network (CNN) is selected as the most appropriate. Two CNN structures are designed according to the number of input channels: one for single-channel input and one for binaural input. The experimental results show that both structures have stronger learning ability than a simple CNN.

(4) Ensemble learning module. After the sub-models complete their respective tasks, their results are combined by an ensemble method with appropriately chosen weights to obtain the final classification result. The accuracy of the ensemble is greatly improved compared with that of the individual sub-models.

Theoretical analysis and experiments show that data augmentation increases the volume of audio data and provides more samples for feature extraction and classifier training. Compared with the traditional pattern recognition method, the Gaussian mixture model (GMM), the two proposed network structures improve performance by up to 5.4%. Compared with a single classifier network, the ensemble-based classifier achieves better classification performance.
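
Three of the steps named in the abstract can be sketched in a few lines of NumPy: the center/side split of a stereo signal, background subtraction with a median filter, and the weighted combination of sub-model outputs. This is a minimal illustration only, not the thesis implementation. The function names, the 0.5 scaling of the channels, filtering along the time axis, and the normalization of the weights are all assumptions; the abstract does not give these details.

```python
import numpy as np


def mid_side(stereo):
    """Split a (2, n_samples) stereo signal into mid and side channels.

    The 0.5 scaling is an assumption; other conventions exist.
    """
    left, right = stereo[0], stereo[1]
    mid = 0.5 * (left + right)   # content shared by both channels
    side = 0.5 * (left - right)  # content that differs between channels
    return mid, side


def subtract_background(spec, kernel):
    """Subtract a running median over time from a (n_bins, n_frames) spectrogram.

    `kernel` is the median filter length in frames and must be odd.
    Filtering along the time axis is an assumption for this sketch.
    """
    if kernel % 2 == 0:
        raise ValueError("kernel must be odd")
    pad = kernel // 2
    # Repeat the edge frames so every frame has a full window.
    padded = np.pad(spec, ((0, 0), (pad, pad)), mode="edge")
    background = np.stack(
        [np.median(padded[:, t:t + kernel], axis=1) for t in range(spec.shape[1])],
        axis=1,
    )
    return spec - background


def weighted_ensemble(probabilities, weights):
    """Combine class probabilities of shape (n_models, n_classes).

    Returns the predicted class index and the combined probabilities.
    Weights are normalized to sum to 1 (an assumption).
    """
    probabilities = np.asarray(probabilities, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    combined = weights @ probabilities
    return int(np.argmax(combined)), combined
```

Calling `subtract_background` with several kernel sizes on the same spectrogram yields several variants of one recording, which is one way filters of different sizes could serve as data augmentation. In `weighted_ensemble`, a sub-model with a larger weight has proportionally more influence on the final class.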

Keywords/Search Tags:

Audio Scene Recognition, Data Augmentation, Convolutional Neural Network, Ensemble Learning

Related items

1	Research On Acoustic Scene Recognition Algorithm Based On Convolutional Neural Network
2	Research Of Indoor Scene Recognition Method Based On Convolutional Neural Network
3	Research On Audio Scene Recognition Based On Deep Learning
4	Audio Scene Recognition Based On Deep Neural Network Of Multiple Optimization Mechanisms
5	An Effective Audio Classification Method Based On Data Augmentation Strategy
6	Feature Augmentation And Model Build For Acoustic Scene Classification With Multiple Devices
7	Convolutional Neural Network Based Research On Image Understanding
8	Research And Application Of Automatic Augmentation Of Text Data Based On Neural Network Architecture Search Ideology
9	A Study On The Methods Of Handwritten Numeral Recognition Based On Ensemble Convolutional Neural Network
10	Research On Application Of Data Augmentation Based On Different Speech Habits In Speech Recognition In Telephone Scene