Text-Speech Alignment Based On General Speech Recognition

Posted on:2015-07-06

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Li

GTID:2298330431964398

Subject:Electronic and communication engineering

Abstract/Summary:

With the development of the Internet, free resources have become easy to obtain online. Building a speech corpus by conventional means consumes a great deal of manpower, material and time, much of which is wasted. Making use of freely available Internet resources can save much of that time and money, and this thesis proposes a method for doing so. The data in this thesis come from CCTV, because the hosts' pronunciation follows the national standard, the accompanying texts are of good quality, and the recordings are generally made by the host for a specific context. The data therefore have good rhythmic characteristics and context information, and are easy to obtain. The detailed work of this thesis includes the following points.

(1) We propose a text-speech alignment technique based on general-purpose speech recognizers (the recognizers used are the Windows 7 speech recognition system and Google Voice Recognition). The technique builds on HMM forced alignment: the initial recognition result is compared with the original text, and forced alignment (FA) together with pattern matching is used to extract the aligned parts, which greatly reduces the time needed to build a corpus. Recognition is applied iteratively so that the aligned portion is maximized. Finally, the aligned segments are cut out and combined, and the resulting data are used to build a single-sentence Chinese recognizer.

(2) To evaluate the general-recognition text-speech alignment technique, the thesis builds a word recognition system based on a triphone model. The corpus obtained by text-speech alignment serves as the training database for this recognizer, and the test data are the parts that were not aligned. First, an HTK-based recognition system is used to obtain a triphone model for recognizing arbitrary Chinese words. Second, the cepstral mean normalization (CMN) algorithm is applied, which improves the recognition rate. Finally, because the recognition results are in Pinyin, which is hard to read, Perl scripts are used to convert them into Chinese characters.
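
The text-comparison step in point (1), where the recognizer output is matched against the original text to find the aligned parts, can be illustrated with a minimal sketch. This is not the thesis's implementation: it uses Python's standard `difflib.SequenceMatcher` as a stand-in for the pattern-matching step and does not perform HMM forced alignment. The function name `aligned_spans`, the `min_len` threshold and the Pinyin sample are all hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def aligned_spans(reference, hypothesis, min_len=3):
    """Return (ref_start, hyp_start, length) for each run of at least
    min_len tokens that appears in both token lists in the same order.

    reference:  tokens of the original text (assumed already segmented)
    hypothesis: tokens produced by a general-purpose recognizer
    min_len:    hypothetical threshold that discards short chance matches
    """
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_len
    ]


# Invented sample: the recognizer misheard one syllable ("qing" -> "qin").
reference = "jin tian bei jing tian qi qing lang qi wen shi du".split()
hypothesis = "jin tian bei jing tian qi qin lang qi wen shi du".split()

for ref_start, hyp_start, length in aligned_spans(reference, hypothesis):
    print(ref_start, hyp_start, " ".join(reference[ref_start:ref_start + length]))
```

In a pipeline like the one the abstract describes, the matched spans would mark the audio regions worth cutting out for the training corpus, and the unmatched remainder could be passed through recognition again in the next iteration.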

Keywords/Search Tags:

Text-Speech Alignment, Tied-Triphone, Universal Recognition, HMM Model

Related items

1	Research On Automatic Speech-Text Alignment For Mongolian Long Audio
2	Research Of Mandarin Text-Speech Alignment Based On SailAlign
3	Research On Unannotated Long Chinese Speech Text-speech Alignment
4	Study And Improve On The Mongolian Speech Recognition System
5	Researching Of The Mongolian Acoustic Model Based On Speech Recognition
6	Study And Design Of Specific Character Speech Recognition Based On Embedded System
7	Research And Application Of Speech And Text Automatic Alignment Technology Based On Text Similarity Algorithm
8	Speech-Text Soft-Alignment With Semantic And Monotonic Constraints For End-to-End Speech Recognition
9	Research Of Long Speech And Text Alignment
10	Research On Dbn-based Continuous Speech Recognition