
Research On And Application Of HMM-based Distributed Speech Recognition System

Posted on: 2011-04-19; Degree: Master; Type: Thesis
Country: China; Candidate: G X Jiang; Full Text: PDF
GTID: 2178360302974586; Subject: Computer application technology
Abstract/Summary:
Traditional embedded speech recognition systems have many drawbacks, such as complex architecture, high hardware requirements, an inability to update the vocabulary, a lack of flexibility, and poor robustness. From the perspective of network applications, this thesis investigates methods for classifying, updating, and training the Mandarin thesaurus on the server of an HMM-based distributed speech recognition (DSR) system. It also studies the optimization of endpoint detection and feature extraction at the terminal. Phrases are recognized by multi-word identification with the language model. Furthermore, most of the speech recognition computation is transferred from the terminal to the server, and the recognition result is sent back over the network. The main innovations and efforts in this work are as follows:

(1) Based on the idea of distributed information processing, speech training and recognition are placed on the server, while feature extraction is performed at the terminal. The extracted features are formatted into data packets and sent to the server for processing.

(2) Label technology is adopted to classify and update the thesaurus on the server side. New words collected from the Web are tagged with labels and stored in the database. As soon as the server has enough corpus data for these newly added words, a training process is run to create the acoustic and language models. Furthermore, this thesis proposes the concept of personalized speech recognition, which adapts the general speech model to a specific user by collecting that user's corpus explicitly or implicitly. The method can improve recognition accuracy for specific users without losing the universality of a general-purpose speech recognition system.

(3) At the terminal, the MFB endpoint detection method, optimized with a look-up table and fixed-point conversion, allows voice activity detection and feature extraction to be performed concurrently and efficiently. The features are then sent to the server for speech recognition computation.

(4) The network server and the embedded terminals are connected through a network, on which a news recommendation system (an intelligent network radio) is built as an example. The example system contains a prototype of distributed speech recognition.

Experimental results show that the system is feasible, requires little computation, and allows the thesaurus to be updated dynamically.
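
As an illustration of how features extracted at the terminal might be converted to fixed point and formatted into data packets for the server, the following is a minimal sketch and not the thesis's actual format. The header layout (sequence number, frame count, coefficients per frame), the 8 fractional bits, and the function names `to_fixed`, `pack_features`, and `unpack_features` are all assumptions made for this example.

```python
import struct

# Hypothetical packet header: sequence number, frame count, coefficients per
# frame, each an unsigned 16-bit integer in network byte order.
HEADER = struct.Struct("!HHH")

# Assumed fixed-point format: signed 16-bit with 8 fractional bits.
FRACTIONAL_BITS = 8
SCALE = 1 << FRACTIONAL_BITS


def to_fixed(value):
    """Convert a float to a saturated signed 16-bit fixed-point integer."""
    fixed = int(round(value * SCALE))
    return max(-32768, min(32767, fixed))


def pack_features(sequence_number, frames):
    """Terminal side: serialize a list of feature frames into one packet."""
    frame_count = len(frames)
    dimension = len(frames[0]) if frame_count else 0
    payload = [to_fixed(c) for frame in frames for c in frame]
    body = struct.pack("!%dh" % len(payload), *payload)
    return HEADER.pack(sequence_number, frame_count, dimension) + body


def unpack_features(packet):
    """Server side: recover the sequence number and the feature frames."""
    sequence_number, frame_count, dimension = HEADER.unpack_from(packet)
    values = struct.unpack_from(
        "!%dh" % (frame_count * dimension), packet, HEADER.size
    )
    frames = [
        [v / SCALE for v in values[i * dimension:(i + 1) * dimension]]
        for i in range(frame_count)
    ]
    return sequence_number, frames
```

Under these assumptions each coefficient costs two bytes on the wire, and values outside the representable range are clipped rather than wrapped, which is the usual trade-off when floating-point features are replaced by fixed-point ones on an embedded terminal.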
Keywords/Search Tags: distributed speech recognition, embedded systems, hidden Markov models, vector quantization