Case-based Reasoning Solution For Chinese Keyword Detection

Posted on: 2014-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: D L Zhou
Full Text: PDF
GTID: 2268330422450638
Subject: Computer Science and Technology
Abstract/Summary:
Keyword spotting (KWS) detects specific words in an unconstrained speech stream. It is a technology within the field of Automatic Speech Recognition (ASR). Compared with continuous speech recognition (CSR), a KWS system is easier to build because it does not need to recognize the entire content of the speech. Moreover, CSR is not well suited to some applications, and KWS plays an important role in them, for example dialogue systems, spoken document retrieval and speech content surveillance.

A new KWS method based on a sustained learning framework is presented here to overcome some shortcomings of traditional HMM-based methods. In HMM-based approaches, detection relies mainly on the acoustic model, which can be seen as a compact representation of the acoustic knowledge of human pronunciation contained in the training data set. However, detection performance is often seriously degraded by a mismatch between the acoustic model and the test speech. The main reason is that the training data does not contain complete knowledge, so performance declines sharply when the test speech contains acoustic phenomena that are absent from the training data. Because human pronunciation and its acoustic representation are easily affected by many factors, it is impossible to construct a training data set with complete knowledge, and the mismatch is therefore inevitable during recognition. In this work, a sustained learning strategy is adopted to address the problem: service providers and users are also involved in the task of accumulating acoustic knowledge for the KWS system. This requires a KWS technology with sustained learning ability. We report a new method based on the Case-Based Reasoning (CBR) framework.

First, an HMM-based keyword detection system is described, and its performance is measured as a baseline. Then our CBR-based method is proposed, and the reasons for adopting the CBR framework are discussed. All components of the framework are described in detail, including: keyword case representation based on clustered acoustic symbols, a tree index structure for the case base, an elastic matching strategy for case search, the case search algorithm and the estimation of posterior probabilities of keyword hypotheses, and feedback processing. Finally, an improved algorithm is proposed in Chapter 4. A composite criterion is given that requires the clustering to be discriminative in both the acoustic feature space and the linguistic semantic space, and that takes into account how the number of clusters affects case-base search. An agglomerative hierarchical clustering algorithm is adopted to cluster the acoustic symbols. Experiments were conducted to measure detection performance and to demonstrate the ability of sustained learning.
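
To illustrate the general idea of clustering acoustic feature vectors into a small set of symbols, a minimal sketch of agglomerative hierarchical clustering is given below. It is not the algorithm or the criterion of the thesis: the function names, the Euclidean distance, the average-linkage rule and the toy 2-D "frames" are all assumptions made for this example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters (assumed linkage)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_cluster(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns a list of clusters, each a list of indices into `points`.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        # combinations() guarantees a < b, so deleting b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def to_symbols(points, clusters):
    """Map each point to the index of its cluster, i.e. its hypothetical acoustic symbol."""
    labels = [None] * len(points)
    for symbol, members in enumerate(clusters):
        for i in members:
            labels[i] = symbol
    return labels


if __name__ == "__main__":
    # Toy 2-D vectors standing in for acoustic feature frames (made-up values).
    frames = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (9.9, 0.0)]
    clusters = agglomerative_cluster(frames, n_clusters=3)
    print(to_symbols(frames, clusters))
```

In this sketch the number of clusters is fixed in advance. The abstract indicates that the thesis instead uses a criterion that also weighs how the number of clusters affects case-base search, which is not modeled here.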
Keywords/Search Tags: Keyword detection, Hidden Markov model, CBR, Clustering algorithm, Sustained learning