Research On Speaker Recognition Over Noisy Utterance

Posted on: 2020-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: L F Ba
GTID: 2428330590471823
Subject: Control engineering
Abstract/Summary:
With the rapid development of computer and mobile internet technology, speaker recognition has become a distinctive biometric technology for identifying speakers in specific application scenarios. It is a research hotspot in pattern recognition and artificial intelligence, and is widely used in judicial identification, identity verification, military defense, information security, remote control and other areas. When the training and test environments match, a speaker recognition system achieves high recognition accuracy. In practical applications, however, environmental noise causes a mismatch between the training and test environments, and system performance drops significantly. How to effectively improve the performance of speaker recognition systems in noisy environments has therefore become a key research question.

This thesis studies speech enhancement and feature extraction. It proposes a non-negative matrix factorization optimization algorithm and applies deep learning to feature extraction, in order to counter the adverse effects of noise and improve the performance of the speaker recognition system. Finally, a graphical user interface is designed to record the voice signal and display the recognition results. The thesis consists of the following three parts:

1. A non-negative matrix factorization optimization algorithm is proposed. This thesis analyzes the advantages and disadvantages of spectral subtraction and non-negative matrix factorization, and proposes a non-negative matrix factorization optimization algorithm that reconstructs speech from noisy speech. The reconstructed speech has good speech quality. To combine the advantages of each algorithm, spectral subtraction, the non-negative matrix factorization algorithm and the non-negative matrix factorization optimization algorithm are fused to further enhance generalization ability. Experimental results indicate that, under the same conditions, the non-negative matrix factorization optimization algorithm enhances speech better than the traditional speech enhancement algorithms. The fusion algorithm gives a better enhancement effect than any single speech enhancement algorithm in most noise environments.

2. A feature extraction method that fuses deep and shallow features is proposed. This thesis uses a deep autoencoder network to extract features from noisy speech. The autoencoder, based on a deep belief network, can effectively filter out the noise components in the speech and mine deep representations of hidden speaker-specific information from the shallow features. The deep features and shallow features are then each input into the i-vector model and fused at the score level. Experimental results show that, compared with a single feature type in a noisy environment, feature fusion describes the speaker information more comprehensively and improves the recognition rate of the system.

3. A graphical user interface is designed based on MATLAB. The speaker recognition platform interface is built with MATLAB's own toolbox and built-in functions, and the system is tested by recording and recognizing speech signals. The results show that the system has good interactivity.
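As a rough illustration of the baseline that part 1 builds on, the sketch below shows plain non-negative matrix factorization applied to magnitude spectrograms, followed by a simple supervised enhancement step. It is not the optimization algorithm proposed in the thesis, whose details the abstract does not give. The function names `nmf` and `enhance`, the Euclidean cost, the soft mask and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Factor a non-negative matrix V (frequency x frames) as W @ H.

    Uses the standard multiplicative updates for the Euclidean cost.
    This is plain NMF, not the thesis's optimized variant.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def enhance(V_noisy, W_speech, W_noise, n_iter=200, eps=1e-10, seed=0):
    """Estimate the clean magnitude spectrogram from a noisy one.

    Assumes speech and noise bases were learned beforehand (hypothetical
    training step). The bases stay fixed; only activations are updated.
    """
    W = np.hstack([W_speech, W_noise])
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_noisy.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V_noisy) / (W.T @ W @ H + eps)
    k = W_speech.shape[1]
    speech_part = W_speech @ H[:k]
    noise_part = W_noise @ H[k:]
    # Soft mask: fraction of each time-frequency bin attributed to speech.
    mask = speech_part / (speech_part + noise_part + eps)
    return mask * V_noisy

# Synthetic stand-ins for magnitude spectrograms (assumed data, not real speech).
rng = np.random.default_rng(1)
speech = rng.random((64, 4)) @ rng.random((4, 100))
noise = rng.random((64, 2)) @ rng.random((2, 100))
W_s, _ = nmf(speech, rank=4)
W_n, _ = nmf(noise, rank=2)
estimate = enhance(speech + noise, W_s, W_n)
print(estimate.shape)
```

In a real pipeline the spectrograms would come from a short-time Fourier transform of recorded audio, and the enhanced magnitude would be combined with the noisy phase to resynthesize a waveform; both steps are omitted here to keep the sketch short.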
Keywords/Search Tags:speaker recognition, speech enhancement, deep feature, i-vector