Research On Indoor Sound Source Localization Algorithm Based On Deep Learning

Posted on: 2022-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: W H Zhang
Full Text: PDF
GTID: 2568307040466174
Subject: Engineering
Abstract/Summary:
With the rapid development of artificial intelligence, sound source localization has become a basic function of many smart devices, such as intelligent robots and video conference systems. Improving the robustness and adaptability of sound source localization algorithms in harsh environments, such as strong reverberation and strong noise, remains an urgent problem. This thesis extracts features that carry position information from microphone array signals and combines them with deep learning to obtain robust and adaptable sound source localization algorithms. It proposes four indoor sound source localization algorithms based on deep learning:

(1) A convolutional neural network sound source localization algorithm (Phase-CNN) based on the phase component of the Short Time Fourier Transform (STFT). The algorithm uses the STFT phase component, which contains time delay information and has low computational complexity, as the network input feature, and builds a convolutional neural network that performs azimuth classification and elevation classification at the same time. Simulation experiments show that the algorithm is robust and adaptable in Gaussian noise and reverberation environments. Compared with the Multiple Signal Classification (MUSIC) algorithm, its positioning accuracy is higher in strong reverberation environments.

(2) A convolutional neural network sound source localization algorithm (SGCC-PHAT-CNN) based on the Smoothed Generalized Cross Correlation Phase Transform (SGCC-PHAT). To reduce the influence of noise and reverberation on the feature, the algorithm uses the SGCC-PHAT feature, obtained by averaging and smoothing the phase-weighted generalized cross-correlation (GCC-PHAT) feature over frames. Experimental analysis shows that increasing the number of smoothing frames improves the positioning performance of the algorithm, and that the algorithm adapts to unknown reverberation and noise environments. Compared with the Phase-CNN algorithm and the MUSIC algorithm, it has higher positioning accuracy.

(3) A residual neural network sound source localization algorithm (SGCC-PHAT-CRN) based on the SGCC-PHAT feature. The algorithm uses a residual network structure so that a deeper network model can fit more complex nonlinear mapping functions and improve positioning performance, while avoiding the network degradation caused by too many layers. Simulation results show that the algorithm has strong robustness and adaptability in Gaussian noise and reverberation environments. Compared with the SGCC-PHAT-CNN algorithm, it has better localization performance.

(4) A convolutional neural network sound source localization algorithm (FLO-SGCC-PHAT-CNN) based on the Smoothed Fractional Low-Order Generalized Cross Correlation Phase Transform (FLO-SGCC-PHAT). The algorithm adopts the smoothed FLO-SGCC-PHAT feature, which takes the impulsive characteristics of the noise into account. Test results show that a network model trained under impulse noise with a high generalized signal-to-noise ratio retains good positioning performance under impulse noise with a low generalized signal-to-noise ratio and strong reverberation, and that the algorithm outperforms the SGCC-PHAT-CNN algorithm. The algorithm adapts well to impulse noise of different intensities, and it is also suitable for sound source localization under Gaussian noise.
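The smoothed GCC-PHAT feature described in (2) can be illustrated with a minimal NumPy sketch for one microphone pair. This is not the thesis's implementation: the function names (`gcc_phat_frames`, `smoothed_gcc_phat`), the frame length, hop size, lag range, and the choice of a plain mean over non-overlapping groups of frames are all assumptions made for illustration, and the abstract does not specify the actual smoothing scheme.

```python
import numpy as np

def gcc_phat_frames(x1, x2, frame_len=512, hop=256, max_lag=16, eps=1e-12):
    """Per-frame GCC-PHAT for one microphone pair.

    Returns an array of shape (n_frames, 2 * max_lag + 1) whose columns
    correspond to lags -max_lag .. +max_lag (in samples).
    All parameter values are illustrative assumptions.
    """
    window = np.hanning(frame_len)
    n_fft = 2 * frame_len  # zero-pad to reduce circular wrap-around
    n_frames = 1 + (len(x1) - frame_len) // hop
    out = np.empty((n_frames, 2 * max_lag + 1))
    for i in range(n_frames):
        start = i * hop
        X1 = np.fft.rfft(x1[start:start + frame_len] * window, n_fft)
        X2 = np.fft.rfft(x2[start:start + frame_len] * window, n_fft)
        cross = X1 * np.conj(X2)
        # PHAT weighting: keep only the phase of the cross-spectrum.
        cross = cross / (np.abs(cross) + eps)
        cc = np.fft.irfft(cross, n_fft)
        # Reorder so that negative lags come first, then lag 0, then positive.
        out[i] = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return out

def smoothed_gcc_phat(x1, x2, n_smooth=8, **kwargs):
    """Average GCC-PHAT over non-overlapping groups of n_smooth frames.

    Assumed smoothing scheme: a plain mean over consecutive frames.
    """
    frames = gcc_phat_frames(x1, x2, **kwargs)
    n_groups = frames.shape[0] // n_smooth
    grouped = frames[:n_groups * n_smooth].reshape(n_groups, n_smooth, -1)
    return grouped.mean(axis=1)
```

With this sign convention (cross-spectrum of channel 1 times the conjugate of channel 2), a signal that reaches the second microphone later produces a peak at a negative lag. Each row of the result could serve as one input feature vector for a classifier; stacking rows from several microphone pairs would give a two-dimensional input, though the exact input layout used in the thesis is not stated in the abstract.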
Keywords/Search Tags:Sound Source Localization, Deep Learning, Convolutional Neural Networks, Residual Neural Network, Alpha-Stable Distribution