Research On Speaker Verification And Its Lightweight Method Based On Deep Neural Network

Posted on: 2023-11-09

Degree: Master

Type: Thesis

Country: China

Candidate: Z J Jiang

GTID: 2568306830486244

Subject: Information and Communication Engineering

Abstract/Summary:

With the development of intelligent speech technology, speaker verification has gradually entered daily life and work. Two research hotspots in intelligent speech processing are how to further reduce the error rate of speaker verification and how to deploy speaker verification models on terminals with limited computing resources. This thesis addresses speaker verification and its lightweight implementation based on deep neural networks. The main contributions are as follows.

This thesis proposes a speaker verification method based on the Attentive Dilated Res2Net Recurrent Network (ADRRN). The ADRRN consists of a convolutional initial layer, a one-dimensional (1-D) dilated Res2Net block, a residual bidirectional long short-term memory (RBLSTM) block, a channel attentive statistical pooling layer, and an additive angular margin Softmax (AAM-Softmax) classifier. First, the logarithm Mel spectrum (LMS) is extracted from the input speech sample and used as the input of the ADRRN. Then, the ADRRN is trained to learn from the LMS a speaker embedding (SE) that effectively characterizes local spatial information and global temporal information. Finally, the speaker embeddings are passed to the backend, which scores their similarity with the cosine similarity metric (CSM) or probabilistic linear discriminant analysis (PLDA). Equal error rate (EER) and minimum detection cost function (minDCF) are used to evaluate speaker verification performance. Three speech datasets selected from VoxCeleb1 and VoxCeleb2 are used for evaluation. The experimental results show that the proposed method is superior to state-of-the-art speaker verification methods. It also outperforms most baseline methods in terms of computational complexity and storage space. When evaluated on experimental data of different lengths, it shows strong generalization ability.

Nevertheless, the proposed ADRRN has high computational complexity and takes up a large amount of storage space, so it cannot be readily deployed on terminals with limited computing resources. To overcome these shortcomings, this thesis proposes a lightweight method for speaker verification based on deep representations grouping and interaction. The proposed grouping and interaction module consists of an initial layer, a group mean pooling layer, an interaction layer, a fusion layer, and a group normalization layer. The module is embedded in the convolutional initial layer, the 1-D dilated Res2Net block, and the RBLSTM block to reduce model complexity. Multiply-accumulate operations (MACs) and model parameters (MP) are used to evaluate model complexity. Three speech datasets selected from VoxCeleb1 and VoxCeleb2 are used for evaluation. The experimental results show that the proposed method greatly reduces computational complexity and the number of model parameters at the cost of a slight degradation in EER and minDCF. The proposed method is superior to other state-of-the-art lightweight methods in both model compactness and speaker verification performance. In addition, it can be applied to lighten speaker embedding extraction networks with different structures.

In conclusion, this thesis addresses speaker verification and its lightweight implementation based on deep neural networks. It proposes a speaker verification method based on the ADRRN and a lightweight method based on deep representations grouping and interaction. Multiple experiments compare the proposed methods with other state-of-the-art methods to demonstrate their effectiveness.
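As an illustration of the evaluation described in the abstract, the sketch below scores pairs of speaker embeddings with cosine similarity and estimates the equal error rate (EER) by sweeping a decision threshold. It is not the thesis's evaluation code: the function names `cosine_score` and `equal_error_rate` are hypothetical, the toy embeddings and trial labels are invented for the example, and the EER is approximated at the observed score values rather than interpolated.

```python
import numpy as np


def cosine_score(a, b):
    """Cosine similarity between two speaker embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def equal_error_rate(scores, labels):
    """Approximate the EER from trial scores.

    labels: 1 for a same-speaker (target) trial, 0 for a
    different-speaker (non-target) trial.
    The threshold is swept over the observed scores, and the EER is
    taken where false acceptance and false rejection rates are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    targets = scores[labels == 1]
    nontargets = scores[labels == 0]

    best_gap, eer = np.inf, 1.0
    for threshold in np.unique(scores):
        far = np.mean(nontargets >= threshold)  # false acceptance rate
        frr = np.mean(targets < threshold)      # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings" (invented for illustration only).
    enrolled = [1.0, 0.2, 0.0]
    trials = [
        ([0.9, 0.3, 0.1], 1),   # same speaker
        ([1.0, 0.1, 0.0], 1),   # same speaker
        ([0.0, 1.0, 0.2], 0),   # different speaker
        ([0.1, 0.0, 1.0], 0),   # different speaker
    ]
    scores = [cosine_score(enrolled, emb) for emb, _ in trials]
    labels = [label for _, label in trials]
    print("scores:", np.round(scores, 3))
    print("EER:", equal_error_rate(scores, labels))
```

A lower EER means the target and non-target score distributions overlap less. minDCF follows the same threshold sweep but weights the two error rates by application-specific costs and a target prior, which are not given in the abstract and are therefore omitted here.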

Keywords/Search Tags:

Deep neural network, Speaker representation, Speaker verification, Lightweight

Related items

1	Speaker Extraction And Verification Based On Deep Learning
2	Text-Dependent Speaker Verification System
3	A Study On The Generative Modelling For Speaker Verification Based On Deep Neural Network
4	Research On Voiceprint Verification Technology In Multi-speaker Scenarios Based On Deep Learning
5	Pulse Coupled Neural Network (PCNN) In The Spectrogram-based Speaker Recognition
6	Research On Text-independent Speaker Verification Based On Deep Learning
7	Content-independent Speaker Verification Model and Its Application
8	Research On Speaker Recognition Technology Based On Voiceprint Information Space
9	Study On The Deception Detection Method Identified By The Automatic Speaker Verification System
10	Research On Speaker Recognition Based On Deep Neural Network