Research On Easy-confused Digit Speech Recognition

Posted on: 2018-11-06 | Degree: Master | Type: Thesis
Country: China | Candidate: F Zhou | Full Text: PDF
GTID: 2348330542967144 | Subject: Electronic and communication engineering
Abstract/Summary:
Mandarin connected-digit speech recognition is an important branch of speech recognition research and is widely used in industrial control, smart appliances, and other fields. However, the performance of current Mandarin connected-digit recognition systems still falls short of the needs of practical applications. Because Mandarin digits are highly confusable with one another, ordinary recognition systems have difficulty distinguishing the easily confused digits, which lowers the recognition rate of the whole system. This thesis studies the confusion between Mandarin digits in depth and presents a multi-parameter, multi-level recognition strategy. First, the digits are recognized with an HMM using Mel spectral parameters; the easily confused digits then undergo a secondary classification using other parameters and an SVM. For the secondary classification, a new parameter based on the group delay spectrum, called RRCGD-CC, is introduced. It is derived entirely from the phase spectrum of the speech signal and is therefore essentially different from traditional amplitude-spectrum parameters. The experimental results show that RRCGD-CC offers an advantage in classifying easily confused digits. Combined with the multi-parameter, multi-level recognition strategy, it increases the recognition rate of the Mandarin digit recognition system by 2.38%. In addition, based on the differences in tone and vowel among easily confused digits, pitch frequency and formants are each used as the feature parameter in the secondary classification, increasing the final recognition rate by 2.01% and 0.73%, respectively.
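The multi-level strategy described in the abstract can be illustrated with a minimal routing sketch. It is not the thesis's implementation: the function name `two_stage_recognize`, the callables standing in for the HMM and SVM stages, and the way confusable groups are passed in are all assumptions made for illustration. The abstract does not list which digits form the confusable groups, so the sketch takes them as a parameter.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def two_stage_recognize(
    primary_features: Features,
    secondary_features: Features,
    first_stage: Classifier,
    second_stage_by_group: Dict[FrozenSet[str], Classifier],
) -> str:
    """Return a final label using a first pass plus an optional second pass.

    first_stage stands in for the HMM decoder on Mel spectral parameters.
    second_stage_by_group maps each (hypothetical) group of confusable
    labels to a classifier standing in for the SVM on other parameters.
    """
    # First level: label every token with the general-purpose recognizer.
    label = first_stage(primary_features)

    # Second level: only labels that fall in a confusable group are
    # re-classified with the secondary feature set.
    for group, classifier in second_stage_by_group.items():
        if label in group:
            return classifier(secondary_features)

    # Labels outside every confusable group keep the first-pass result.
    return label
```

In this sketch the second-stage features are computed by the caller, so the same routing function could be reused with a group-delay-based parameter, pitch frequency, or formants as the secondary feature set.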
Keywords/Search Tags:Digit speech recognition, Group delay spectrum, Multi-level recognition, Formant
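For readers unfamiliar with the group delay spectrum named in the keywords, the sketch below computes the plain group delay function of one frame, using the standard identity tau(k) = (X_R Y_R + X_I Y_I) / |X|^2, where X is the DFT of x[n] and Y is the DFT of n*x[n]. This is only the textbook quantity: the abstract does not define RRCGD-CC, so nothing here should be read as that parameter. The function names `dft` and `group_delay` and the `eps` floor are assumptions made for illustration.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Direct O(N^2) discrete Fourier transform (fine for short frames)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def group_delay(frame: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Group delay (in samples) at each DFT bin of a single frame.

    Uses the phase-derivative identity, which avoids explicit phase
    unwrapping: tau = (Re X * Re Y + Im X * Im Y) / |X|^2,
    where Y is the DFT of n * x[n].
    """
    spectrum = dft(frame)
    weighted = dft([n * sample for n, sample in enumerate(frame)])
    delays = []
    for x_k, y_k in zip(spectrum, weighted):
        power = abs(x_k) ** 2
        # eps is an illustrative floor that guards against division by
        # zero at spectral nulls; it is not taken from the source.
        numerator = x_k.real * y_k.real + x_k.imag * y_k.imag
        delays.append(numerator / max(power, eps))
    return delays
```

A quick sanity check is a unit impulse delayed by three samples: its phase is linear, so the group delay should equal 3 at every bin.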