Research On Processing Text After Speech Recognition Based On Railway Train Operation Context

Posted on: 2021-05-30 | Degree: Master | Type: Thesis
Country: China | Candidate: Q Wang | Full Text: PDF
GTID: 2392330605958086 | Subject: Traffic Information Engineering & Control
Abstract/Summary:
Speech recognition technology brings great convenience to daily life and plays an irreplaceable role in human-computer interaction. It is nevertheless difficult to apply in specialized fields because of factors such as environmental noise, the speaker's tone of voice, and the recognition engine's lack of domain knowledge. At present, speech recognition technology is not widely used in the context of railway train operation. Railway train operation terminology is standardized and specialized, and some letters and numbers have special pronunciation requirements, so the recognition accuracy for this terminology is low. This thesis addresses that problem by applying natural language processing to the text produced by speech recognition: error-detecting and error-correcting methods are used to optimize the recognized results, reducing word errors in the recognized text in the railway train operation context and enabling speech recognition technology to be applied in this field. The main research contents and results are as follows.

(1) An n-gram model and a professional terminology base of key words and collocations are built for the railway train operation context to implement error detection. First, a corpus is used to train a bi-gram model and a tri-gram model, and extraction rules are formulated to build the professional terminology base of key words and collocations. Second, a weight distribution method based on the n-gram language model is proposed to calculate the contextual harmony degree of words for the first error-detecting pass over the local context, and the terminology base is used to calculate the collocation aggregation degree of words for the second error-detecting pass over the long-distance semantic layer of the text. Finally, this double-layer progressive error-detecting method is used to accurately locate the error points in the text after speech recognition.

(2) An error-correcting method based on confusion sets is studied. Approximate and exact pinyin matching, combined with a scattered-string reorganization strategy, are used to construct a real-word confusion set and a pinyin confusion set in order to correct real-word errors and scattered-string errors in the text. The fusion probability of the contextual harmony degree and the semantic similarity degree is used as the support degree of each confusion word, and the word with the highest support is output as the optimal correction suggestion.

(3) An alphanumeric error-correcting method based on a keywords rule table is studied. To address the scattered-string errors caused by the special pronunciation of letters and numbers in railway train operation terminology, a keywords rule table is constructed, and rule matching against this table is used to correct the errors.

(4) The error-detecting and error-correcting methods are applied and the results are analyzed. The methods for text after speech recognition in the railway train operation context are applied to the railway train operation training system. Experimental verification shows that the method effectively improves the correct rate of speech recognition by 12.77%, which is of great significance for applying speech recognition technology in the field of railway train operation.
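As a rough illustration of the first error-detecting layer described in (1), the following Python sketch scores each word by how well it fits its immediate neighbours under a bi-gram model and flags low-scoring words. It is a minimal sketch under stated assumptions, not the thesis's method: the abstract does not give the weight distribution formula, so the equal left/right weights, the add-one smoothing, the threshold value, the toy English corpus, and all function names here are hypothetical.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count unigrams and bigrams over tokenized sentences, with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev) (assumed smoothing scheme)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def context_scores(tokens, unigrams, bigrams, w_left=0.5, w_right=0.5):
    """Weighted sum of left-context and right-context bigram probabilities per word.

    The equal weights are an assumption standing in for the thesis's
    weight distribution method, which the abstract does not specify.
    """
    padded = ["<s>"] + tokens + ["</s>"]
    scores = []
    for i in range(1, len(padded) - 1):
        left = bigram_prob(padded[i - 1], padded[i], unigrams, bigrams)
        right = bigram_prob(padded[i], padded[i + 1], unigrams, bigrams)
        scores.append(w_left * left + w_right * right)
    return scores

def flag_suspects(tokens, unigrams, bigrams, threshold):
    """Return the words whose context score falls below the threshold."""
    scores = context_scores(tokens, unigrams, bigrams)
    return [tok for tok, score in zip(tokens, scores) if score < threshold]

# Hypothetical toy corpus standing in for a railway train operation corpus.
corpus = [
    ["train", "departs", "from", "platform", "two"],
    ["train", "arrives", "at", "platform", "two"],
    ["train", "departs", "from", "platform", "one"],
]
unigrams, bigrams = train_bigrams(corpus)

# "plateau" imitates a misrecognized word; 0.1 is an arbitrary threshold.
recognized = ["train", "departs", "from", "plateau", "two"]
suspects = flag_suspects(recognized, unigrams, bigrams, threshold=0.1)
```

On this toy data only "plateau" falls below the threshold. A misrecognized word also lowers the scores of its neighbours, which is one reason a second check against a terminology base, as the abstract describes, could help locate the actual error point.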
Keywords/Search Tags:Speech Recognition, Railway Train Operation, Weight Distribution Method, Confusion Sets, Error-detecting and Error-correcting