Research On Chinese Speech Recognition Algorithm Based On Syllable Modeling

Posted on:2024-07-12

Degree:Master

Type:Thesis

Country:China

Candidate:Q J Wang

Full Text:PDF

GTID:2568307079973459

Subject:Electronic information

Abstract/Summary:


Speech recognition typically refers to the process of converting human voice signals into the corresponding text, and it is part of perceptual intelligence within artificial intelligence. In recent years, with the rapid development of artificial intelligence, speech recognition technology has been widely used in vehicles, smart homes, and other scenarios. This large market demand has made improving recognition accuracy a research hotspot. Previous work on Chinese speech recognition has mainly relied on end-to-end word modeling. This thesis instead investigates pinyin (syllable) modeling: the speech input is first recognized as Chinese syllables, which serve as an intermediate result, and the syllables are then converted into the corresponding text. On the basis of syllable modeling, this thesis makes the following three contributions: (1) Combining the connectionist temporal classification (CTC) and attention algorithms, a CTC-Attention model is built as the baseline. On top of this baseline, the CTC spike distribution problem and the LayerNorm parameter oscillation problem are addressed, yielding the CTC-Attention-TESB model. Compared with the baseline model, the CTC-Attention-TESB model reduces the syllable character error rate (CER) by 1.08%. After language model decoding, the CER is 6.04% lower than that of the baseline model trained with word modeling. (2) Based on the CTC-Attention-TESB model, this thesis designs a multi-task learning algorithm with text modeling as the auxiliary task and syllable modeling as the main task. Experiments show that the syllable-based multi-task model reduces the CER by 1.25% compared with the single-task model and outperforms other mainstream algorithms in low-resource scenarios. (3) To address the problem of language mixing in Chinese speech recognition, this thesis studies the selection of monolingual data and the optimization of dictionaries, and conducts experiments that verify the effectiveness of data selection.
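The results above are all reported as character error rate (CER). The abstract does not say how the metric is computed, so the sketch below assumes the standard definition: the Levenshtein (edit) distance between the reference and hypothesis token sequences, divided by the reference length. The function names and the toned-pinyin example are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # At the start of iteration i, prev[j] is the distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length (assumed CER definition)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: toned pinyin syllables for a four-syllable utterance,
# with one tone error in the hypothesis.
reference = ["yu3", "yin1", "shi2", "bie2"]
hypothesis = ["yu3", "yin1", "shi4", "bie2"]
rate = error_rate(reference, hypothesis)
print(f"{rate:.2%}")  # prints 25.00%
```

The same function applies to syllable sequences and to Chinese character sequences, because it only compares tokens for equality. The abstract does not state whether the reported reductions (1.08%, 6.04%, 1.25%) are absolute or relative differences between two such rates.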

Keywords/Search Tags:

Chinese speech recognition, Syllable modeling, CTC spike distribution, Multi-task learning, Language mixing


Related items

1	Syllable-based Method Of Tone Recognition For Chinese Continuous Speech
2	Research On Isolated Speech Recognition In Noise Environment
3	Research On Chinese Syllable Evaluation Approach After Automatic Speech Recognition
4	SVM And HMM Combination Of Design And Implementation Of Chinese Speech Syllable Recognition Algorithm
5	Research And Implementation Of Mongolian-Chinese Mixed Language Speech Recognition System Based On Deep Learning
6	Based On The Characteristics Of Cv Syllable Minority Language Recognition Research
7	Research And Application Of Speech Recognition Based On Syllable Modeling
8	Syllables and concepts in large vocabulary speech recognition
9	Mandarin Syllable Recognition System Based On Asat Frame
10	Research On Speech Emotion Recognition Based On Multi-Attention Mechanism And Multi-Task Learning