Research On Data Mining For Voiceprint Based On Deep Learning

Posted on: 2022-01-16 | Degree: Master | Type: Thesis
Country: China | Candidate: X D Li | Full Text: PDF
GTID: 2518306545486334 | Subject: Mathematics
Abstract/Summary:
In the information age, intelligent data mining methods represented by deep learning play an important role in many fields, and efficiently extracting valuable information from massive data has become an unavoidable research problem. Using intelligent data mining to realize open-set voiceprint recognition, and thereby identify speakers quickly and accurately, is of practical significance. Traditional methods distinguish poorly between in-set and out-of-set speakers, which leads to a high misidentification rate. Two problems are therefore the bottleneck of open-set voiceprint recognition: choosing parameters that accurately reflect a speaker's individual characteristics, and calculating the decision threshold.

This thesis uses a deep belief network (DBN) built by stacking three restricted Boltzmann machines (RBMs) as a deep acoustic feature extractor. The 24-dimensional Mel-Frequency Cepstral Coefficients (MFCC), used as the basic acoustic features, are mapped into a 256-dimensional feature space to obtain deep acoustic features that carry more speaker-specific information. On top of these features, two adaptive threshold calculation algorithms for the open-set case are developed.

The first method trains a Gaussian mixture model (GMM) to compute similarity values for the deep voiceprint features and then applies the OTSU algorithm to those values: the similarity value at which the inter-class variance reaches its maximum is taken as the optimal threshold.

The second method modifies the optimization objective of traditional support vector data description (SVDD) and proposes the MSM-SVDD model to calculate the threshold. A soft-margin maximization regularization term between the target-speaker samples and the non-target-speaker samples is added to adjust the decision boundary. While the volume of the hypersphere is minimized, the decision boundary is shifted toward the non-target-speaker samples, which improves the model's ability to correctly accept test samples from the target speaker. The influence of the regularization coefficient and the Gaussian kernel parameter on recognition performance is also analyzed.

The experimental results show that the proposed deep-learning-based threshold calculation algorithms achieve a lower false rejection rate and a lower false recognition rate, and that they reject all out-of-set speakers in the experiments.
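The first threshold method can be illustrated with a minimal sketch. It assumes the similarity values are plain floats (for example, GMM log-likelihood scores) and searches a histogram of them for the split that maximizes the inter-class variance. The function name `otsu_threshold` and the default of 256 bins are illustrative assumptions, not details taken from the thesis.

```python
def otsu_threshold(scores, bins=256):
    """Return the score that maximizes inter-class variance (Otsu's method).

    `scores` is any non-empty sequence of floats; `bins` is an assumed
    histogram resolution, not a value specified in the thesis.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return lo  # all scores identical: no meaningful split exists

    # Build a histogram of the similarity scores.
    width = (hi - lo) / bins
    hist = [0] * bins
    for s in scores:
        idx = min(int((s - lo) / width), bins - 1)
        hist[idx] += 1

    total = len(scores)
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total_sum = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    count0, sum0 = 0, 0.0
    # Try every bin edge as a candidate threshold.
    for i in range(bins - 1):
        count0 += hist[i]
        sum0 += centers[i] * hist[i]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / count0
        mu1 = (total_sum - sum0) / count1
        # Inter-class variance: w0 * w1 * (mu0 - mu1)^2
        var_between = (count0 / total) * (count1 / total) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = lo + (i + 1) * width  # upper edge of bin i
    return best_t


# Hypothetical similarity scores: out-of-set (low) and in-set (high) speakers.
low = [0.10, 0.12, 0.15, 0.20]
high = [0.80, 0.85, 0.90, 0.95]
threshold = otsu_threshold(low + high)
accepted = [s for s in low + high if s >= threshold]
```

In this toy example the threshold falls in the gap between the two score clusters, so only the high-similarity scores are accepted. Real similarity distributions overlap, and the threshold then trades false rejections against false acceptances.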
Keywords/Search Tags: Intelligent data mining, Voiceprint recognition, Deep neural network, OTSU algorithm, Support vector data description, Adaptive thresholds