Research Of Speech Separation Technology Of Multiple Speakers Based On Deep Learning

Posted on:2020-04-07

Degree:Master

Type:Thesis

Country:China

Candidate:Z F Li

Full Text:PDF

GTID:2428330578450931

Subject:Computer application technology

Abstract/Summary:

Voice communication is an indispensable part of human life. Speech conveys information more directly and richly than text, and it also carries the speaker's emotions. Noisy communication environments are common in real life, yet when the ear receives a mixture of sounds, the brain can easily pick out the one it is attending to. This is the well-known "cocktail party" effect: the brain readily picks out the sounds that interest it from a mixture of several. As society moves toward convenience and intelligence, clean speech allows smart appliances to carry out their user's instructions more efficiently. Solving the "cocktail party" problem has therefore become the holy grail of speech separation. How to correctly distinguish each speaker's speech in a mixture, reduce the noise in the separated speech, and improve the accuracy and quality of separation has become a hot topic in recent speech separation research. Mixed speech signals are complex and variable, and it is difficult to derive the regularities of speech from them. This thesis builds on a deep learning framework, using the ability of deep learning to adapt to different speech scenarios, to provide a solution to the speech separation problem. Single-channel speech separation is a common approach, but it cannot make full use of multi-channel speech signals, which results in low resource utilization.

For the case in which the interference is another speaker, this thesis carries out the following two studies:

(1) The traditional k-means clustering algorithm selects its initial center points at random, and the partition into clusters depends on those initial points. Poorly chosen centers easily lead to incorrect cluster assignments, and the clustering result is unstable from run to run. The improved k-means algorithm proposed in this thesis is called the HK-means clustering algorithm. It uses a histogram as a density estimate to guide the choice of initial center points, so that they fall in dense regions wherever possible. In addition, to reduce the interference of global data on local data, a local data substitution method is used to estimate the initial center points.

(2) The speech separation model uses a Bi-directional Long Short-Term Memory (BLSTM) network to separate the speech signals, with the number of hidden layers selected from several candidate configurations. The separation model is trained on single-channel mixed speech and is then used to separate dual-channel mixed speech. A similarity judgment and the adaptive minimum variance distortionless response (MVDR) beamforming algorithm are used to enhance the separated speech signals, further improving the quality of the separated speech.
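The histogram-based initialization described in (1) can be illustrated with a minimal sketch. This is not the thesis's HK-means implementation, whose details the abstract does not give. It assumes low-dimensional features, a fixed number of bins per dimension, and a simple rule that skips cells adjacent to an already chosen cell. The names `histogram_seeds` and `kmeans` are hypothetical. Each seed is the mean of the points inside a dense cell, which is one possible reading of "local data substitution".

```python
import numpy as np

def histogram_seeds(X, k, bins=8):
    """Pick k initial centers from the densest histogram cells (hypothetical helper)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    # Histogram cell index of every point along each dimension.
    idx = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)
    counts = np.zeros((bins,) * X.shape[1], dtype=int)
    np.add.at(counts, tuple(idx.T), 1)

    order = np.argsort(counts, axis=None)[::-1]  # densest cells first
    chosen, seeds = [], []
    for flat in order:
        cell = np.array(np.unravel_index(flat, counts.shape))
        if counts[tuple(cell)] == 0:
            break
        # Assumption: skip cells touching a chosen cell so seeds spread out.
        if any(np.max(np.abs(cell - c)) <= 1 for c in chosen):
            continue
        members = X[np.all(idx == cell, axis=1)]
        seeds.append(members.mean(axis=0))  # local mean, not the cell midpoint
        chosen.append(cell)
        if len(seeds) == k:
            return np.array(seeds)
    raise ValueError("fewer than k separated dense cells; try more bins")

def kmeans(X, centers, n_iter=50):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# Toy data: two well-separated 2-D blobs standing in for feature vectors.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([0.0, 0.0], 0.3, (200, 2)),
    rng.normal([5.0, 5.0], 0.3, (200, 2)),
])
seeds = histogram_seeds(X, k=2)
centers, labels = kmeans(X, seeds)
```

Because the seeds are derived deterministically from the data, repeated runs on the same input start from the same centers, which addresses the run-to-run instability of random initialization noted above.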

Keywords/Search Tags:

Speech separation, deep learning, k-means algorithm, MVDR

Related items

1	Research On Speech Separation Algorithm Based On Fuzzy Clustering And Deep Learning
2	Speech Separation Method And Implementation Based On Deep Learning
3	Speech Separation Based On Deep Learning
4	Research And Design Of Speech Separation Algorithm Based On Deep Learning
5	Multi-speaker Speech Separation Based On Deep Learning
6	Research On Speech Separation And Recognition Based On Deep Learning
7	Research And Implementation Of Single Channel Speech Separation With Unknown Number Of Speakers
8	Speech Separation Technology Based On Deep Learning
9	Research On Multi-Speaker Speech Separation And Speech Recognition In Noisy Environment
10	Binaural Speech Separation Research Based On Deep Learning