
Research On Single-channel Speech Separation Technology Based On Dictionary Learning And Deep Neural Network

Posted on: 2023-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Bu
GTID: 2568306836472484
Subject: Electronic and communication engineering
Abstract/Summary:
Real-world acoustic scenes often contain background noise or interfering speech. The purpose of speech separation is to reconstruct the target speech from the corrupted speech and to suppress or eliminate the interference, thereby improving perceptual quality and intelligibility. This thesis studies single-channel speech separation, that is, how to separate the target speech from a single observed mixture signal, or how to minimize the influence of the interfering signal. The study is based on dictionary learning and deep neural networks (DNNs). Dictionary learning represents speech signal features as linear combinations of dictionary atoms, while a deep neural network has a strong capacity for learning nonlinear mappings. Single-channel speech enhancement based on dictionary learning removes noise effectively, but it has limitations when separating two similar signals. Therefore, a two-stage speech separation system is proposed, which uses dictionary learning to separate the mixed speech signals and a deep neural network to enhance the estimated signals. The main research contents and innovations of this thesis are as follows:

(1) The background, significance and research status of speech separation at home and abroad are reviewed. Several common speech signal mixing models are introduced. The importance of sparse representation and the training of dictionary learning are explained, and the classical deep neural network model is presented. Common training targets and acoustic features are introduced.

(2) Because the sub-dictionaries of a joint dictionary obtained by traditional dictionary learning are not sufficiently distinct from one another, "cross projection" appears when a speech signal is represented over the joint dictionary, which degrades separation performance. To solve this problem, a single-channel speech separation method based on enhanced constrained dictionary learning is proposed. The algorithm uses a constrained optimization function to strengthen the constraints on dictionary learning. The constraint function has three parts. The first part suppresses the error between the reconstructed signal and the target signal and constrains the projection of the clean signal onto its corresponding sub-dictionary. The second part limits the representation error of the clean signal over the joint dictionary. The third part suppresses the projection of the clean signal onto the other sub-dictionaries and limits the correlation between atoms of different dictionaries. Experimental results show that, compared with the traditional speech separation method based on a joint dictionary, the proposed algorithm improves the performance of the speech separation system.

(3) Because dictionary learning separates poorly when the signals are relatively similar, a single-channel speech separation method that combines dictionary learning with a deep neural network is proposed. In this method, a preliminary estimated signal is obtained through dictionary learning and is then enhanced by a DNN so that it approaches the target signal. First, the dictionary-learning method is used to separate the two speech signals and reconstruct the preliminary estimated signals. Then, a DNN with strong mapping ability is used to separate the speech from the cross-projection residue and obtain the fine estimated signal. Compared with the speech separation method based on a joint dictionary, this method further improves the accuracy of separating similar signals.

(4) To further improve the speech separation performance of a dual-output deep neural network, a single-channel speech separation method based on a two-stage deep neural network is proposed. First, a DNN is used to obtain the preliminary estimated signal. Then, a second DNN stage fine-tunes this signal to reduce the error between the preliminary estimated signal and the target signal, yielding the fine estimated signal. In addition, different loss functions are selected for the different task objectives of the DNN in the two stages. In the first stage, in addition to constraining the training target, a constraint on the signal amplitude spectrum is added and the joint relationship between the signals is exploited. In the second stage, both the training target and the signal amplitude spectrum are constrained. Experiments show that, compared with one-stage single-channel speech separation, the algorithm effectively improves speech separation performance and the accuracy of signal separation.
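
The joint-dictionary separation step and the "cross projection" effect described in (2) and (3) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: random unit-norm dictionaries stand in for learned sub-dictionaries, orthogonal matching pursuit is an assumed choice of sparse coder, and the names `unit_norm_dictionary`, `omp`, `separate` and `cross_projection_ratio` are hypothetical. The enhanced constraint function and the DNN refinement stage are not implemented here.

```python
import numpy as np


def unit_norm_dictionary(rng, dim, n_atoms):
    """Random dictionary with unit-norm columns (a stand-in for a learned one)."""
    D = rng.standard_normal((dim, n_atoms))
    return D / np.linalg.norm(D, axis=0, keepdims=True)


def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit: approximate x with at most n_nonzero atoms."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Re-fit all selected atoms by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef


def separate(D1, D2, mix, n_nonzero):
    """Code the mixture over the joint dictionary [D1 D2] and split the reconstruction."""
    k1 = D1.shape[1]
    c = omp(np.hstack([D1, D2]), mix, n_nonzero)
    return D1 @ c[:k1], D2 @ c[k1:]


def cross_projection_ratio(D1, D2, clean1, n_nonzero):
    """Fraction of coefficient energy that a clean source-1 signal places on D2."""
    k1 = D1.shape[1]
    c = omp(np.hstack([D1, D2]), clean1, n_nonzero)
    total = float(np.sum(c ** 2))
    return float(np.sum(c[k1:] ** 2)) / total if total > 0 else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D1 = unit_norm_dictionary(rng, 64, 40)
    D2 = unit_norm_dictionary(rng, 64, 40)
    # Synthetic sources: each is a sparse combination of atoms from its own sub-dictionary.
    s1 = D1[:, :3] @ rng.standard_normal(3)
    s2 = D2[:, :3] @ rng.standard_normal(3)
    est1, est2 = separate(D1, D2, s1 + s2, n_nonzero=6)
    print("relative error, source 1:", np.linalg.norm(s1 - est1) / np.linalg.norm(s1))
    print("relative error, source 2:", np.linalg.norm(s2 - est2) / np.linalg.norm(s2))
    print("cross-projection ratio:", cross_projection_ratio(D1, D2, s1, n_nonzero=6))
```

A larger cross-projection ratio means more of a clean signal's energy is explained by the other source's sub-dictionary, which is the failure mode that the enhanced constraints in (2) and the DNN refinement in (3) are meant to reduce.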
Keywords/Search Tags: Single-channel Speech Separation, Dictionary Learning, Enhanced Constraint, Optimization Function, Deep Neural Network, Loss Function, Speech Enhancement