Research On Speaker Recognition Technology Based On Deep Learning

Posted on:2019-03-12

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Guo

GTID:2428330545997943

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the growing popularity of intelligent speech and human-machine interaction technology, speaker recognition faces a series of application challenges. The complexity of practical application environments, especially noise and cross-channel effects, greatly reduces the accuracy of speaker recognition. To address these problems, this thesis studies the key technologies of speaker recognition based on deep networks, analyzing both deep features and deep models, in order to improve the performance of speaker recognition in practical applications. The work consists of three parts:

1. To obtain a more representative speaker feature, this thesis presents a deep feature named the Tandem Feature. The Tandem Feature is formed by splicing the Bottleneck Feature extracted by a Deep Neural Network (DNN) with the spectral features of the corresponding frames. An i-vector is generated from the Tandem Feature, and a text-independent speaker recognition system is implemented with a PLDA (Probabilistic Linear Discriminant Analysis) model. Experiments on NIST SRE 2010 show that, under the same experimental configuration, the i-vector based on the Tandem Feature achieves a lower EER (Equal Error Rate) than the traditional i-vector.

2. To verify the feasibility of sequence embedding, a text-dependent speaker recognition system based on the d-vector is built. A DNN is trained with the speaker ID as the label of its output layer; in the extraction stage, the bottleneck feature is extracted from this network. The bottleneck features are accumulated and averaged to obtain the d-vector of each utterance. Experiments on the King-ASR-L-057 text-dependent test set show that the d-vector model discriminates well between speakers.

3. To reduce the impact of environmental noise and cross-channel effects, a text-dependent speaker recognition system based on the DNN i-vector model is also discussed. Building on a DNN-HMM speech recognition model, the posterior probabilities output by the DNN are used in place of those of a GMM to estimate the statistics and extract the DNN i-vector, and a noisy training strategy is designed for the deep network. Experiments on King-ASR-L-057, the noisy database, and RSR2015 show that the DNN i-vector achieves a lower EER and is more robust to noise than the traditional text-dependent recognition system.
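
The feature-level operations named in the abstract (splicing bottleneck and spectral features frame by frame, averaging bottleneck features into a d-vector, and reporting an EER) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy feature dimensions, and the use of cosine similarity for scoring d-vectors are assumptions made for illustration, and the EER routine is a simple threshold sweep over the observed scores rather than an interpolated estimate.

```python
import math
from typing import List, Sequence

Vector = List[float]


def tandem_features(bottleneck: Sequence[Sequence[float]],
                    spectral: Sequence[Sequence[float]]) -> List[Vector]:
    """Splice bottleneck and spectral features of corresponding frames."""
    if len(bottleneck) != len(spectral):
        raise ValueError("both streams must have the same number of frames")
    return [list(b) + list(s) for b, s in zip(bottleneck, spectral)]


def d_vector(frames: Sequence[Sequence[float]]) -> Vector:
    """Accumulate and average frame-level features into one utterance vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; an assumed scoring rule, not stated in the abstract."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate the EER by sweeping thresholds over the observed scores."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: target trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: non-target trials scoring at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy data: 2 frames, 2-dim bottleneck and 3-dim spectral features.
    bn = [[1.0, 2.0], [3.0, 4.0]]
    spec = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    print(tandem_features(bn, spec))
    print(d_vector(bn))
    print(equal_error_rate([0.9, 0.8], [0.1, 0.2]))
```

In this sketch the tandem feature is one concatenated vector per frame and would feed i-vector extraction, whereas the d-vector collapses all frames of an utterance into a single vector that can be compared directly between enrollment and test utterances.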

Keywords/Search Tags:

Speaker Recognition, Deep Learning, Tandem Features, d-vector, DNN i-vector

Related items

1	Research On The Performance Of Speech Features In Gender-based Speaker Recognition
2	Text Independent Speaker Recognition Based On Deep Learning Framework
3	Research On Key Algorithms Of Speaker Recognition Based On Deep Learning
4	Research On Deep Learning Models And Algorithms For Speaker Recognition
5	Research On Speaker Recognition Based On Deep Belief Network And Vector Quantization
6	Speaker Recognition System Based On Deep Learning
7	Research On Key Technologies Of Speaker Recognition Based On Deep Learning
8	Research On Speaker Recognition Technology Based On Deep Learning
9	Research On Three-dimensional Features Recognition Based On Deep Learning Speaker
10	The Application Of Speaker Recognition Technology Based On Deep Learning