Text-Speech Alignment Based On General Speech Recognition

Posted on: 2015-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Li
GTID: 2298330431964398
Subject: Electronic and communication engineering

Abstract/Summary:
With the development of the Internet, free speech and text resources have become easy to obtain online. Building a speech corpus by hand, in contrast, consumes a great deal of manpower, material, and time. We propose a method that exploits these freely available resources, which saves considerable time and money. The data in this thesis come from CCTV broadcasts, chosen because the hosts' pronunciation follows the national standard, the accompanying texts are of good quality, and the hosts generally record for specific contexts. The data therefore have good prosodic characteristics and context information, and they are easy to obtain. The work of this thesis includes the following points.

(1) We propose a text-speech alignment technique based on general-purpose speech recognizers (the universal recognizers used are the speech recognition system of Windows 7 and Google Voice Recognition). The technique builds on HMM-based forced alignment: we compare the initial recognition results with the original text and use forced alignment (FA) together with pattern matching to extract the aligned parts, which greatly reduces the time needed to build a corpus. Applying the recognition iteratively maximizes the aligned portion. Finally, the aligned segments are cut out and combined, and the resulting data are used to build a recognizer for single Chinese sentences.

(2) To evaluate the universal-recognition text-speech alignment technique, we build a word recognition system based on a triphone model. The corpus obtained by the alignment serves as the training database, and the parts that were not aligned serve as the test data. First, using the HTK recognition system, we obtain a triphone model that can recognize arbitrary Chinese words. Second, applying the CMN algorithm improves the recognition rate. Finally, because the recognition results are in Pinyin format, which is not easy to read, we use Perl scripts to convert them into Chinese characters that people can understand.
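The comparison of recognizer output against the original text in point (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes both texts are already tokenized into lists, swaps in Python's standard `difflib.SequenceMatcher` as the pattern matcher, and uses a hypothetical helper name `aligned_spans` with a hypothetical `min_len` threshold for discarding short, unreliable matches.

```python
from difflib import SequenceMatcher


def aligned_spans(reference, hypothesis, min_len=2):
    """Return (ref_start, hyp_start, length) for runs where both token lists agree.

    reference:  tokens of the original text
    hypothesis: tokens produced by the general-purpose recognizer
    min_len:    hypothetical threshold; shorter matches are dropped as unreliable
    """
    # autojunk=False keeps frequent tokens from being ignored on long inputs.
    matcher = SequenceMatcher(None, reference, hypothesis, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_len
    ]


# Illustrative English tokens; for Chinese, characters or Pinyin syllables
# would play the role of tokens.
reference = "we propose a method to align text and speech".split()
hypothesis = "we propose the method to align text in speech".split()

for ref_start, hyp_start, length in aligned_spans(reference, hypothesis):
    print(ref_start, hyp_start, reference[ref_start:ref_start + length])
```

In an iterative scheme such as the one the abstract describes, the unmatched gaps between the returned spans would be the candidates for another recognition pass.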
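The CMN step in point (2) can be sketched as follows, assuming that CMN denotes cepstral mean normalization, as it usually does in HTK-based systems. The function name and the toy feature values are hypothetical. The sketch shows only the arithmetic: for each cepstral coefficient, the mean over all frames of one utterance is subtracted from every frame.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    frames: list of feature vectors (one list of floats per frame),
            all of the same length.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient across the whole utterance.
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coeffs)]
    # Remove that mean from every frame.
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


# Toy utterance: two frames with two coefficients each (made-up values).
frames = [[1.0, 4.0], [3.0, 8.0]]
normalized = cepstral_mean_normalize(frames)
print(normalized)
```

Removing the utterance-level mean cancels a constant offset in the cepstral domain, such as the effect of a fixed recording channel, which is one reason the technique tends to help on broadcast material.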
Keywords/Search Tags: Text-Speech Alignment, Tied-Triphone, Universal Recognition, HMM Model