Research On Tibetan Speech Recognition Based On Deep Convolutional Neural Network

Posted on: 2021-04-28

Degree: Master

Type: Thesis

Country: China

Candidate: Z D Huang

Full Text: PDF

GTID: 2435330620475887

Subject: Computer application technology

Abstract/Summary:

Automatic speech recognition has become a core technology in call centers, medical services, mobile applications, and other fields. Languages with rich corpora, such as English and Chinese, have already reached satisfactory recognition accuracy. Tibetan speech recognition, however, lags behind because of the lack of a rich corpus, the particularities of the language, and the slow development of speech recognition technology for it. Improving the performance of Tibetan speech recognition systems is therefore an important research topic in the field. This thesis studies the application of convolutional neural networks (CNNs) to Tibetan speech recognition. The main work is as follows:

1. Feature extraction. The speech signal is converted into a spectrogram, which retains the information in the speech signal and serves as the feature input of the deep convolutional neural network.
2. Acoustic modeling. The convolutional neural network, which performs well in image recognition, is introduced into Tibetan speech recognition to better capture local information in the spectrogram.
3. End-to-end speech recognition. An end-to-end Tibetan speech recognition system is designed by combining the convolutional neural network with connectionist temporal classification (CTC).
4. Classifier structure optimization. The number of layers in the convolutional neural network is further increased, and the feature extraction ability of the network is improved by stacking convolutional layers.

Comparative experiments with the above models were conducted on a Tibetan corpus built in the laboratory, and the following conclusions were drawn:

1. Transforming speech into a spectrogram as the feature extraction method better retains the information in the speech signal that is useful for recognition.
2. Using a convolutional neural network to extract speech features from spectrograms improves the performance of Tibetan speech recognition.
3. The end-to-end Tibetan speech recognition system is verified to be feasible, and its recognition results are better than those of the model that uses cross entropy as the loss function.
4. Increasing the number of layers in the convolutional neural network and selecting an appropriate activation function further improve speech recognition performance.
5. Adding batch normalization and Dropout after the convolutional layers, with Dropout "discarding" a fixed proportion of neurons during training, improves recognition performance while reducing training time.
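The two ends of the pipeline described above, spectrogram features going in and CTC output coming out, can be illustrated with a minimal sketch. This is not the thesis's implementation: the frame length, hop size, Hann window, naive DFT, and greedy (best-path) decoding are assumptions chosen for illustration, and the function names `log_spectrogram` and `ctc_greedy_decode` are hypothetical. The CNN itself is omitted.

```python
import cmath
import math


def log_spectrogram(signal, frame_len=64, hop=32):
    """Return one list of log-magnitudes (bins 0..frame_len // 2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for clarity; a practical system would use an FFT.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            # Small constant avoids log(0) on silent bins.
            spectrum.append(math.log(abs(coeff) + 1e-10))
        frames.append(spectrum)
    return frames


def ctc_greedy_decode(frame_scores, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for label in best_path:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

For example, with label 0 as the blank, the per-frame best path 0 1 1 0 1 2 2 0 decodes to [1, 1, 2]: the repeated 1 and 2 are merged, while the blank between the two 1s keeps them as separate output symbols.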

Keywords/Search Tags:

Tibetan, Speech Recognition, CNN, Dropout, Spectrogram

Related items

1	Research On Tibetan Speech Recognition Technology
2	Research On Language Recognition For Russian Military Speech
3	The Research On Tibetan Speech Recognition Technology
4	Research On Tibetan Speech Recognition Technology Based On Recurrent Neural Network
5	Tibetan Segmentation And POS Tagging Study
6	Research On Tibetan Speech Emotion Recognition Method Based On Multi-feature Fusio
7	Tibetan Multi-task And Multi-dialect Speech Recognition
8	Research On Emotion Recognition Technology Of Tibetan Speech By Fusion Of Multiple Features
9	Application Of Tibetan Speech Recognition Based On Active Learning In Online Education
10	Research On Speech Recognition Of Tibetan Amdo Dialect Based On Deep Learning