Speech Recognition Technology Based On Hybrid Model Of HMM And DNN

Posted on: 2021-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Feng
GTID: 2428330602994090
Subject: Electrical engineering
Abstract/Summary:
With the rapid development of big data and artificial intelligence, speech recognition has become increasingly widespread. Many electronic products are now operated through voice interaction, which makes modern intelligent services more convenient to use. This thesis focuses on how to make speech interaction more efficient, reduce the impact of noise on recognition performance, and improve recognition accuracy.

The thesis first studies the preprocessing of speech signals and the extraction of feature parameters. MFCC reflects only the static characteristics of a sound signal, whereas EMD can describe the non-stationary characteristics of a signal in finer detail, so EMD is integrated into MFCC feature extraction. Experimental results show that the improved feature extraction method effectively improves recognition, with the recognition rate increasing by 3.15% at different SNRs.

In traditional acoustic modeling, the hybrid of the Gaussian mixture model (GMM) and the hidden Markov model (HMM) has long been dominant. A small-vocabulary recognition system is built in MATLAB for experiments. Compared with a single HMM, GMM-HMM requires less training data and gives better recognition performance.

To address GMM's limited ability to model complex data, a deep neural network (DNN) with stronger modeling ability replaces the GMM, giving a new model structure. Based on the THCHS-30 speech database, a large-vocabulary continuous speech recognition system is implemented. Experimental results show that the recognition error rate of the DNN-HMM model is significantly lower than that of the GMM-HMM model, and that Fbank features are better suited than MFCC to training deep neural network models. Under added noise, pre-training the deep model with a denoising autoencoder (DAE) can recover the signal corrupted by noise and effectively improve recognition accuracy.
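
The abstract reports results "at different SNR" and under added noise, but does not say how the noisy speech was produced. As a rough illustration only, the following Python sketch shows one common way to mix a noise recording into clean speech at a chosen signal-to-noise ratio. The function name `mix_at_snr` and its parameters are hypothetical and are not taken from the thesis; the thesis's own experiments may have used a different procedure.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not the thesis's procedure.
    SNR is defined here as 10 * log10(P_clean / P_noise), where P is
    the mean squared amplitude over the whole utterance.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]  # trim noise to the utterance length

    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    if p_noise == 0.0:
        raise ValueError("noise signal has zero power")

    # Scale the noise so that p_clean / (scale**2 * p_noise) == 10**(snr_db / 10).
    scale = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


# Example: a 440 Hz tone sampled at 16 kHz, corrupted by white noise at 10 dB SNR.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
noisy = mix_at_snr(clean, noise, snr_db=10.0)
```

Pairs of `noisy` and `clean` signals built this way are the kind of input and target that a denoising autoencoder is typically trained on, and sweeping `snr_db` over several values gives test conditions at different SNRs.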
Keywords/Search Tags: Speech recognition, Feature extraction, HMM, Deep neural network, Back-propagation algorithm