
Research On Chinese Speech Transcription Punctuation Prediction Based On Deep Learning

Posted on: 2020-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
Full Text: PDF
GTID: 2428330575465326
Subject: Engineering
Abstract/Summary:
A standard automatic speech recognition (ASR) system typically produces transcribed text without any punctuation, which makes the text hard to read and can cause problems in downstream processing. Automatic punctuation prediction addresses this by inserting appropriate punctuation into the text. Research on punctuation prediction follows two main directions, lexical features and prosodic features, and in both cases it is important to train the model with supervised learning techniques. A variety of methods have been proposed to solve the problem of automatic punctuation prediction for speech transcripts. According to the features they use, these methods can be roughly divided into three categories: lexical features, prosodic features, and a combination of the two. This thesis combines text features and prosodic features with currently popular deep learning methods to predict punctuation in Chinese speech transcripts. The main contributions of this thesis are as follows:

1. Pre-trained word vectors are trained on a large-scale corpus and, compared with randomly initialized word vectors, also speed up model training. In addition, Chinese and English differ in written format: Chinese text is a continuous string of characters without gaps, whereas English words are separated by spaces and need no segmentation. Word segmentation is therefore also very important for Chinese punctuation prediction. Based on these two points, we propose a word-vector-based Chinese punctuation prediction method that uses an attention-based bidirectional recurrent neural network (BRNN) and takes into account the influence of word segmentation and pre-trained word vectors on training. The proposed algorithm retains the characteristics of the base algorithm and preserves the contextual information preceding, surrounding, and following the punctuation in the text, so that the type and position of punctuation can be identified more effectively.

2. Chinese speech has its own particularities: the contribution of pitch is not very obvious in English, but the situation is different in Chinese. The second research topic of this thesis is therefore the use of speech features and word vectors in punctuation prediction for Chinese speech transcripts, mainly to investigate the effect of pitch features on punctuation prediction. The speech data is passed through the ASR system to obtain unpunctuated, unsegmented text together with pitch and pause durations. A three-stage recurrent neural network model based on long short-term memory (LSTM) cells is then used to restore punctuation marks in the speech transcripts. In the first stage, text features are learned from a large text corpus. The second stage combines text features with pause duration. The third stage combines text features, pause duration, and pitch features to adapt the model to the speech domain. Experiments show that our model achieves better prediction performance.
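As an illustration only, the problem setup in the abstract can be sketched as a per-word labeling task, with the input features growing across the three training stages. The abstract does not give the label set, the tokenization, or how features are merged, so the names `PUNCT_LABELS`, `to_labeled_sequence`, and `stage_features`, the three punctuation classes, and the use of simple concatenation are all assumptions made for this sketch, not the thesis's implementation.

```python
# Hypothetical label set: the abstract does not list its punctuation classes.
PUNCT_LABELS = {",": "COMMA", "。": "PERIOD", "?": "QUESTION"}
NO_PUNCT = "O"


def to_labeled_sequence(tokens):
    """Turn a segmented, punctuated token list into (word, label) pairs.

    Each word is labeled with the punctuation mark that follows it,
    or NO_PUNCT when the next token is another word. The input is
    assumed to be already word-segmented.
    """
    pairs = []
    for tok in tokens:
        if tok in PUNCT_LABELS:
            if pairs:
                # Attach the mark to the preceding word; if several marks
                # follow one word, the last one wins.
                word, _ = pairs[-1]
                pairs[-1] = (word, PUNCT_LABELS[tok])
            # Punctuation with no preceding word is dropped.
        else:
            pairs.append((tok, NO_PUNCT))
    return pairs


def stage_features(stage, word_vec, pause, pitch):
    """Assemble the per-word input for one of the three training stages.

    Concatenation is an assumption; the abstract only says the features
    are "combined".
    """
    if stage == 1:  # text features only
        return list(word_vec)
    if stage == 2:  # text features + pause duration
        return list(word_vec) + [pause]
    if stage == 3:  # text features + pause duration + pitch
        return list(word_vec) + [pause, pitch]
    raise ValueError("stage must be 1, 2, or 3")


if __name__ == "__main__":
    tokens = ["今天", "天气", "很", "好", ",", "我们", "出去", "吧", "。"]
    for word, label in to_labeled_sequence(tokens):
        print(word, label)
    print(stage_features(3, [0.1, 0.2], 0.3, 180.0))
```

In this framing, a sequence model such as a BRNN or LSTM would read the words (plus any prosodic values) and predict one label per word, which recovers both the type and the position of each punctuation mark.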
Keywords/Search Tags:punctuation prediction, word segmentation, word embedding, BRNN, attention mechanism, pitch, LSTM