Research On Speaker Verification Based On Target Adaptation Learning

Posted on: 2024-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Ji
Full Text: PDF
GTID: 2558306920455424
Subject: Computer technology
Abstract/Summary:
In an era of rapid progress in artificial intelligence and big data, biometrics is used in a wide range of applications. Among biometric recognition technologies, speaker verification has attracted wide attention from researchers. It accords with the safety principle of biometric recognition and draws on both the physiological and the behavioral characteristics of biometric features. This dissertation addresses open problems in speaker verification and analyzes the task from three aspects: adaptive adjustment of training features, deep network feature representation learning, and adaptive measurement target learning. The specific research methods are as follows:

(1) To address the mismatch between the training stage and the testing stage of speaker verification, a speaker verification method based on target sample mining is proposed. The method also addresses three related problems: gradient imbalance between positive and negative samples, overlapping similarity between positive and negative samples, and divergence in their spatial distribution. The objective function it introduces is adaptive: it mines information according to the characteristics of each sample. It also guides the feature network toward stable convergence and improves the generalization ability of the model. The experimental results show that the proposed method effectively addresses the positive and negative sample problems in speaker verification.

(2) To address vanishing or exploding gradients as the number of layers in a deep neural network grows, a speaker verification method based on residual network adaptive learning is proposed. The method introduces the deep residual network (ResNet), which has direct connections between different layers of the network. This retains the advantage of deep neural networks in feature representation while resolving the problems caused by deepening the network. In addition, a learnable activation function, PReLU, is introduced so that the network can express features more flexibly; the resulting network is named ResNet-P. Furthermore, the angular margin is extended on the basis of the adaptive sample mining objective function, and three different angular margin objective functions are used to guide the updates of ResNet-P. The experimental results show that the proposed method performs better than other deep learning methods.

(3) To measure the relationship between features more accurately, an objective function representation method based on adaptive mutual information estimation is proposed. This objective function introduces an adaptive metric learning method whose optimization objective is to maximize intra-class similarity and minimize inter-class similarity. The objective function can also dynamically adjust the similarity according to the real distribution of the deep features. With this dynamic adjustment, the deep neural network is optimized toward stronger discrimination. In addition, the adaptive metric method is used for feature sampling and updates its parameters according to the characteristics of the features. The sampled features are therefore more typical, which strengthens the supervision of the optimization direction of the deep neural network. Experimental results show that, compared with other deep neural networks, the proposed method performs better and the performance of the speaker verification system improves significantly.
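The abstract does not give formulas for PReLU or for the three angular margin objective functions, so the following is only an illustrative sketch and not the thesis's implementation. It assumes the commonly used combined-margin form cos(m1·θ + m2) − m3 on the target class, in which multiplicative, additive angular, and additive cosine margins are special cases. The function names, the scale `s`, and the margin values are hypothetical choices made for this example.

```python
import math


def prelu(x, a=0.25):
    """PReLU: identity for positive inputs, slope `a` for negative inputs.

    In a real network `a` is a learned parameter; 0.25 is only a placeholder.
    """
    return x if x > 0 else a * x


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return dot / (norm_u * norm_v)


def margin_softmax_loss(embedding, class_weights, target,
                        s=30.0, m1=1.0, m2=0.0, m3=0.0):
    """Cross-entropy over scaled cosine logits with a margin on the target class.

    Assumed margin form (not taken from the thesis): cos(m1*theta + m2) - m3.
      m1 > 1 -> multiplicative angular margin
      m2 > 0 -> additive angular margin
      m3 > 0 -> additive cosine margin
    Simplification: no special handling when m1*theta + m2 exceeds pi.
    """
    logits = []
    for k, w in enumerate(class_weights):
        cos_theta = max(-1.0, min(1.0, cosine(embedding, w)))
        if k == target:
            theta = math.acos(cos_theta)
            cos_theta = math.cos(m1 * theta + m2) - m3
        logits.append(s * cos_theta)
    # Numerically stable log-sum-exp, then negative log-likelihood of the target.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]


# Toy example: one 2-D speaker embedding and two class (speaker) centers.
embedding = [1.0, 0.2]
centers = [[1.0, 0.0], [0.0, 1.0]]
plain = margin_softmax_loss(embedding, centers, target=0)
additive_angular = margin_softmax_loss(embedding, centers, target=0, m2=0.5)
additive_cosine = margin_softmax_loss(embedding, centers, target=0, m3=0.35)
```

With a margin applied, the target logit is reduced, so the loss for the same embedding is larger than without a margin. This pushes training toward embeddings that sit closer to their own class center and farther from the others, which matches the stated goal of maximizing intra-class similarity and minimizing inter-class similarity.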
Keywords/Search Tags: Speaker verification, Objective function, Adaptive learning, Representation learning