| With the growing popularity of intelligent speech and human-machine interaction technology, speaker recognition faces a series of application challenges. The complexity of practical application environments, especially noise and cross-channel effects, greatly reduces the accuracy of speaker recognition. To address these problems, this paper studies the key technologies of speaker recognition based on deep networks, analyzing both deep features and deep models, in order to improve the performance of speaker recognition in practical applications. The work consists of the following three parts: 1. To obtain a more representative speaker feature, this paper presents a deep feature named the Tandem Feature. The Tandem Feature is formed by concatenating the bottleneck feature extracted by a Deep Neural Network (DNN) with the spectral features of the corresponding frames. An i-vector is generated from the Tandem Feature, and a text-independent speaker recognition system is implemented with a PLDA (Probabilistic Linear Discriminant Analysis) model. Experiments on NIST SRE 2010 show that, under the same experimental configuration, the i-vector based on the Tandem Feature achieves a lower EER (Equal Error Rate) than the traditional i-vector. 2. To verify the feasibility of sequence embedding, a text-dependent speaker recognition system based on the d-vector is built. A DNN is trained with the speaker ID as the label of its output layer; in the extraction stage, the bottleneck feature is extracted from this network. The frame-level bottleneck features are accumulated and averaged to obtain the d-vector of each utterance. Experiments on the King-ASR-L-057 text-dependent test set show that the d-vector model discriminates well between speakers. 3. To reduce the impact of environmental noise and cross-channel effects, a text-dependent speaker recognition system based on the DNN i-vector model is also discussed. Building on a DNN-HMM speech recognition model, the posterior probabilities output by the DNN are used in place of those of the GMM to estimate the statistics and extract the DNN i-vector, and a noisy training strategy for the deep network is designed. Experiments on King-ASR-L-057, the noisy database, and RSR2015 show that the DNN i-vector achieves a lower EER and is more robust to noise than the traditional text-dependent recognition system. |
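
The two frame-level operations described in parts 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the feature dimensions (64-dimensional bottleneck, 39-dimensional spectral features), the length normalization, and the cosine scoring are all assumptions made for illustration, and random arrays stand in for real DNN and spectral features.

```python
import numpy as np


def tandem_features(bottleneck, spectral):
    """Concatenate per-frame bottleneck and spectral features (frames x dims)."""
    if bottleneck.shape[0] != spectral.shape[0]:
        raise ValueError("bottleneck and spectral features must have the same frame count")
    return np.concatenate([bottleneck, spectral], axis=1)


def d_vector(bottleneck):
    """Average frame-level bottleneck features into one utterance-level vector.

    Length normalization is an assumption here, not stated in the abstract.
    """
    v = bottleneck.mean(axis=0)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def cosine_score(a, b):
    """Cosine similarity between two utterance vectors (assumed scoring rule)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Random stand-ins for real features: 200 frames per utterance.
rng = np.random.default_rng(0)
bn = rng.standard_normal((200, 64))    # hypothetical bottleneck features
mfcc = rng.standard_normal((200, 39))  # hypothetical spectral features

tandem = tandem_features(bn, mfcc)
print(tandem.shape)  # (200, 103)

enroll = d_vector(bn)
test = d_vector(rng.standard_normal((150, 64)))
print(cosine_score(enroll, test))
```

In a full system the tandem features would feed i-vector extraction and PLDA scoring, which this sketch does not cover.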