Research On Separation Of Singing Voice And Accompaniment In Music Signal

Posted on: 2019-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Xiong
GTID: 2415330590965691
Subject: Electronic and communication engineering
Abstract/Summary:
With the arrival of the information age, demand is growing for music signal processing technologies built on massive digital music collections, such as music labeling, retrieval, recognition, and singing pitch tracking. The separation of singing voice and accompaniment in music signals, as a preprocessing stage for these technologies, has attracted increasing attention. A better separation system simplifies the later processing and improves its performance, so the problem has significant research value. However, separating singing voice from accompaniment differs from de-noising an audio signal, and the mutual interference between the two sources poses many challenges for research. This thesis studies the separation of singing voice and accompaniment in music signals and covers the following aspects:

(1) To address the poor adaptability of non-negative matrix factorization (NMF) in separation and its over-reliance on training samples, an NMF method combined with Harmonic Percussive Source Separation (HPSS) is proposed. First, the HPSS algorithm is used to separate the music signal at a high resolution. Second, the harmonic source is preserved and the percussive source is separated a second time using a flexible NMF. Finally, the spectra of the accompaniment and the singing voice separated by the ideal binary mask (IBM) are converted back to the time domain by the inverse Fourier transform. The study shows that combining the two separation algorithms, so that the strengths of each offset the weaknesses of the other, effectively improves separation performance.

(2) To address the difficulty of separating singing voice and accompaniment in music signals, a music separation method based on a deep neural network (DNN) is proposed. First, building on the DNN model and taking into account both the reconstruction error and the discriminative information between singing voice and accompaniment, an improved objective function is proposed for discriminative training. Second, an additional layer is added to the DNN model, and time-frequency masking is introduced to jointly optimize the estimated signals. The corresponding time-domain signals are obtained by the inverse Fourier transform. The research shows that the DNN model can learn the characteristics of music signals and that separation performance is greatly improved.

(3) The deep recurrent neural network (DRNN) has strong dynamic modeling capability: it uses the dependence of the signal on past time steps to analyze relationships in the data and to make reasonable predictions of current or future signals. On this basis, a music separation method based on the DRNN is proposed. Following the DNN-based method, the DRNN parameters are trained with the discriminative objective function and the time-frequency masking model is introduced to produce the DRNN model. The research shows that the DRNN-based separation model can reflect the information in the music itself and greatly improves separation performance.
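
The masking and discriminative-training ideas in (1) to (3) can be illustrated with a minimal NumPy sketch. The abstract does not give the thesis's exact formulas, so everything here is an assumption: the function names are hypothetical, the soft mask is the commonly used magnitude-ratio form, and the loss is one common way to combine a reconstruction error with a discriminative term weighted by an assumed coefficient `gamma`. All inputs are magnitude spectrograms of the same shape.

```python
import numpy as np


def ideal_binary_mask(voice_mag, accomp_mag):
    """Binary mask: 1 where the voice dominates a time-frequency bin, else 0.

    Assumes the true source magnitudes are known (hence "ideal").
    """
    return (voice_mag >= accomp_mag).astype(float)


def soft_mask(est_voice, est_accomp, eps=1e-8):
    """Ratio mask in [0, 1] built from the two estimated magnitudes."""
    a = np.abs(est_voice)
    b = np.abs(est_accomp)
    return a / (a + b + eps)  # eps avoids division by zero in silent bins


def apply_masks(mixture_mag, est_voice, est_accomp):
    """Split the mixture so that the two masked outputs sum to the mixture."""
    m = soft_mask(est_voice, est_accomp)
    return m * mixture_mag, (1.0 - m) * mixture_mag


def discriminative_loss(pred_voice, pred_accomp, true_voice, true_accomp,
                        gamma=0.05):
    """Reconstruction error minus gamma times the cross-source error.

    The second term rewards predictions that are far from the *other*
    source. gamma is an assumed hyperparameter, not a value from the thesis.
    """
    recon = (np.sum((pred_voice - true_voice) ** 2)
             + np.sum((pred_accomp - true_accomp) ** 2))
    cross = (np.sum((pred_voice - true_accomp) ** 2)
             + np.sum((pred_accomp - true_voice) ** 2))
    return recon - gamma * cross


if __name__ == "__main__":
    mixture = np.array([[2.0, 4.0]])
    est_voice = np.array([[1.0, 3.0]])
    est_accomp = np.array([[1.0, 1.0]])
    voice, accomp = apply_masks(mixture, est_voice, est_accomp)
    print(voice, accomp)
    print(discriminative_loss(voice, accomp, est_voice, est_accomp))
```

In a full system the masked magnitudes would be combined with the mixture phase and passed through an inverse short-time Fourier transform to recover time-domain signals; that step is omitted here to keep the sketch self-contained.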
Keywords/Search Tags:the separation of singing voice and accompaniment, non-negative matrix factorization, deep neural network, deep recurrent neural network