| To address the low recognition rate of speech recognition based on traditional acoustic models in complex environments, this dissertation draws on deep learning theory to study speech feature extraction and recognition algorithms from two aspects: speech feature parameters and the acoustic model. The dissertation studies human speech features, analyzes speech feature parameters and their extraction principles, and examines the basic algorithms for extracting LPCC and MFCC feature parameters. Analysis of the MFCC extraction algorithm shows that it suffers from obvious distortion of high-pitch features. Therefore, an improved algorithm based on EMD and fractal theory is used to modify the high-frequency region of speech, so as to reduce the distortion of the high-frequency spectrum through nonlinear processing of the characteristic quantities of that region. The dissertation also studies an MFCC extraction algorithm combined with the EMD-FD extraction algorithm. Compared with MFCC extraction alone, the combined algorithm extracts the high-frequency features of speech signals more completely and thereby improves the speech recognition rate. From the perspective of acoustic modeling, the dissertation also studies deep learning theory and the ANN-HMM model in speech recognition. First, it studies the theoretical basis of the traditional HMM and GMM-HMM models and compares the performance of the HMM and GMM models in speech training by simulation. The tests verify that the GMM-HMM model recognizes speech better than the single HMM model and is almost unaffected by the amount of training speech, so the recognition rate of the acoustic model is improved. Next, the dissertation studies the feasibility and general methods of applying deep learning theory to speech signal processing. It then studies the DNN and LSTM neural network models and their training methods, with emphasis on the structure and principle of the DNN model, the training algorithm of the restricted Boltzmann machine, and the optimization of Dropout strategy parameters. It goes on to study the structure and principle of the LSTM model, the BPTT training algorithm and its improved variant CSC-BPTT, and the decoding problem. Simulation results show that, for the LSTM model, the CSC-BPTT algorithm trains much better and faster than the traditional BPTT algorithm. Finally, the traditional GMM-HMM model, the DNN-HMM model, and the LSTM-HMM model are compared by simulation in terms of performance and the feasibility of intelligent recognition of long-phrase speech sequences. The simulation results show that an ANN-HMM model based on deep learning theory, used as the acoustic model, achieves better recognition performance and efficiency than the traditional GMM-HMM model. |