Research on Subword Modeling Method for Uyghur Speech Recognition

Posted on: 2022-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Ding
GTID: 2505306542955569
Subject: Master of Engineering
Abstract/Summary:
Automatic speech recognition plays an important role in artificial intelligence. It is a bridge between people and smart devices and is widely used in fields such as automatic question answering and barrier-free automatic speech translation. Unlike recognition technology for the world's mainstream languages, which has developed rapidly, Uyghur speech recognition has developed slowly because resources are scarce. In the 5G era, demand for speech recognition in low-resource languages is increasing. Uyghur is a typical agglutinative language and suffers from vocabulary explosion. Taking the linguistic characteristics of Uyghur into account, this thesis does the following work to improve recognition accuracy.
First, because of the characteristics of Uyghur, it is difficult for a pronunciation dictionary to cover all words, and the number of out-of-vocabulary (OOV) words is very large. To alleviate this problem, the speech recognition system models subword units obtained with the BPE algorithm, and the thesis studies how the number of subword modeling units affects the recognition system.
Second, the BPE algorithm has certain defects that follow from its own design. In machine translation, the decoding process of the BPE algorithm has been modified to obtain the BPE-Dropout algorithm, which improves the robustness of the segmentation process, but BPE-Dropout is not suitable for Uyghur speech recognition. This thesis therefore proposes the Improved_BPE-Dropout algorithm to build a language model better suited to subword modeling. It combines the advantages of the BPE and BPE-Dropout algorithms and significantly improves recognition performance.
Third, because the available Uyghur training data is not sufficient to build a strong acoustic model, the experiments augment the original data with volume perturbation and speed perturbation and adopt the Chain model that is widely used in industry. Combined with subword unit modeling, this significantly improves recognition compared with traditional DNN modeling.
Finally, the Kaldi toolkit and the GStreamer toolkit are used to build an online, web-based real-time speech recognition system with a server-client structure, and HTML, CSS, and JavaScript are combined to build a human-computer interaction platform for speech recognition. The system provides online speech recognition in Uyghur, English, and Chinese. The page is simple and the functions are practical.
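As a rough illustration of the two subword segmentation schemes the abstract mentions, the sketch below learns BPE merges from a word-frequency table and then segments words either deterministically (standard BPE) or with randomly skipped merges (BPE-Dropout). It is not the thesis's Improved_BPE-Dropout algorithm, whose details the abstract does not give. The function names `learn_bpe`, `merge_pair`, and `segment`, the `dropout` parameter, and the toy words are all assumptions made for this example, and the dropout step is simplified: once a merge survives, it is applied to every occurrence of that pair in the word.

```python
import random
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn an ordered list of merges from a {word: count} mapping (hypothetical helper)."""
    # Every word starts as a sequence of single characters.
    vocab = {tuple(word): count for word, count in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break
        # Pick the most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_vocab = {}
        for symbols, count in vocab.items():
            merged = merge_pair(symbols, best)
            new_vocab[merged] = new_vocab.get(merged, 0) + count
        vocab = new_vocab
    return merges


def segment(word, merges, dropout=0.0, rng=None):
    """Segment `word` with learned merges.

    dropout=0.0 gives standard BPE. With dropout>0, each candidate merge is
    skipped with that probability at every step, so the same word can receive
    different segmentations (the idea behind BPE-Dropout).
    """
    rng = rng or random
    rank = {pair: i for i, pair in enumerate(merges)}
    symbols = tuple(word)
    while len(symbols) > 1:
        candidates = [
            (rank[pair], pair)
            for pair in zip(symbols, symbols[1:])
            if pair in rank and rng.random() >= dropout
        ]
        if not candidates:
            break
        _, best = min(candidates)  # apply the earliest-learned surviving merge
        symbols = merge_pair(symbols, best)
    return list(symbols)


# Toy data, not taken from the thesis.
toy_merges = learn_bpe({"abab": 4, "abc": 2}, num_merges=2)
print(segment("abc", toy_merges))                # standard BPE
print(segment("abab", toy_merges, dropout=0.5))  # may vary from run to run
```

The `num_merges` argument controls how many subword units are created, which is the quantity the abstract says was varied to study its effect on recognition.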
Keywords: Uyghur language, speech recognition, subword modeling, BPE algorithm