Research On Subword Modeling Method For Uyghur Speech Recognition

Posted on:2022-07-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Ding

GTID:2505306542955569

Subject:Master of Engineering

Abstract/Summary:

Automatic speech recognition plays an important role in artificial intelligence. It serves as a bridge between people and smart devices and is widely used in applications such as automatic question answering and barrier-free speech translation. While speech recognition for the world's mainstream languages has advanced rapidly, progress on Uyghur has been slow because resources are scarce. In the 5G era, the demand for better speech recognition in low-resource languages is increasing. Uyghur is a typical agglutinative language and suffers from vocabulary explosion. Taking the characteristics of Uyghur into account, this thesis carries out the following work to improve recognition accuracy.

Because of the characteristics of Uyghur, it is difficult for a pronunciation dictionary to cover all words, and the number of out-of-vocabulary words is very large. To alleviate this problem, the speech recognition system is modeled with subword units obtained by the BPE algorithm, and the thesis studies how the number of subword modeling units affects recognition performance.

The BPE algorithm has inherent defects. In machine translation, the segmentation process of BPE has been modified to obtain the BPE-Dropout algorithm, which makes segmentation more robust but is not suitable for Uyghur speech recognition. On this basis, the thesis proposes the Improved_BPE-Dropout algorithm to build a language model better suited to subword modeling. It combines the advantages of the BPE and BPE-Dropout algorithms and significantly improves recognition performance.

Because the available Uyghur training data is insufficient to build a strong acoustic model, the experiments augment the original data with volume perturbation and speed perturbation and adopt the Chain model, which is widely used in industry. Combined with subword unit modeling, this significantly improves recognition over traditional DNN modeling.

Finally, the Kaldi and GStreamer toolkits are used to build an online, web-based real-time speech recognition system with a server-client structure, with HTML, CSS and JavaScript providing the human-computer interaction front end. The system offers online speech recognition in Uyghur, English and Chinese. The page is simple and its functions are practical.
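As a rough illustration of the two baseline algorithms the abstract refers to, the following is a minimal Python sketch of standard BPE merge learning and of BPE-Dropout segmentation, in which each candidate merge is skipped with some probability. It is not the thesis's Improved_BPE-Dropout algorithm, whose details the abstract does not give. The function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the Latin-transliterated toy word counts are all assumptions made for this example.

```python
import random
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the joined symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn an ordered list of BPE merges from a {word: count} dictionary."""
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():
            merged = merge_pair(symbols, best)
            new_vocab[merged] = new_vocab.get(merged, 0) + freq
        vocab = new_vocab
    return merges


def segment(word, merges, dropout=0.0, rng=None):
    """Segment `word` with the learned merges.

    dropout=0.0 gives ordinary BPE; dropout>0 skips each candidate merge
    with that probability at every step (BPE-Dropout).
    """
    rng = rng or random
    rank = {pair: i for i, pair in enumerate(merges)}
    symbols = tuple(word) + ("</w>",)
    while True:
        candidates = [
            (rank[pair], pair)
            for pair in zip(symbols, symbols[1:])
            if pair in rank and rng.random() >= dropout
        ]
        if not candidates:
            return list(symbols)
        _, best = min(candidates)  # apply the earliest-learned surviving merge
        symbols = merge_pair(symbols, best)


# Toy, hypothetical counts for a stem with agglutinated suffixes.
word_freqs = {"kitab": 10, "kitablar": 6, "kitablarni": 3, "mektep": 5, "mekteplar": 2}
merges = learn_bpe(word_freqs, num_merges=12)

print(segment("kitablarni", merges))                                    # deterministic BPE
print(segment("kitablarni", merges, dropout=0.3, rng=random.Random(0)))  # sampled segmentation
```

With `dropout=0.0` the same word always receives the same segmentation, while a non-zero `dropout` exposes the model to several segmentations of the same word. The number of merges passed to `learn_bpe` controls the size of the subword inventory, which is the quantity the abstract says is studied.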

Keywords/Search Tags:

Uyghur language, Speech Recognition, Subword modeling, BPE algorithm

Related items

1	Research On Uyghur Speech Recognition Based On End-to-End Modeling
2	Assessment Of Children’s Chinese Language Ability Based On Speech Recognition
3	Research On Uyghur Speech Recognition Based On Deep Learning And Data Augmentation
4	Research On Tibetan Word Segmentation And Part-of-speech Tagging Based On Pre-trained Language Models
5	Design And Implementation Of Open Psychological Assessment System With Facial Expression And Speech Recognition Functions
6	Study On The Speech Act Of Refusal In The Uyghur TV Series The Story Of Balati
7	Research On Algorithm Discrimination Of Data Mining In The Age Of Big Data
8	Research On Tibetan Speech Recognition Technology Based On Recurrent Neural Network
9	Research On Internal Language Model Elimination For End-to-end Automatic Speech Recognition System
10	Research And Implementation Of Key Technologies For Simultaneous Speech Translation Based On Low Latency Robust Modeling