A study of meta-linguistic features in spontaneous speech processing

Posted on: 2007-03-22
Degree: Ph.D.
Type: Thesis
University: University of Southern California
Candidate: Wang, Dagen
GTID: 2458390005486425
Subject: Engineering
Abstract/Summary:
Speech is a crucial component of human-computer interaction. While tremendous progress has been made in automatic speech recognition, the transcription it produces is far from providing all the information that one could retrieve from speech. For example, prominence and rate of speech both carry important information and are crucial to speech perception. Including such meta-linguistic features can facilitate better machine recognition and understanding of speech. In this thesis, we present research progress in the study of meta-linguistic features.

Firstly, we propose an acoustic measure of speech rate that does not require speech recognition. It combines various spectral, temporal and smoothing algorithms to identify syllable nuclei; the nucleus count is then converted to a speech rate measure by dividing by the length of the speech. The parameters of the algorithm are optimized by Monte Carlo simulation, followed by a sensitivity analysis.

Secondly, we extend the syllable nucleus estimation algorithm to provide an acoustic measure of prominence. It fuses features such as syllable duration, spectral intensity and pitch patterns to score prominence on a continuous scale. We propose parametric shape-based features to identify the pitch shape and demonstrate their usefulness in prominence detection. Beyond direct evaluation against manually transcribed prominence, the linguistic correlation between prominence and part of speech is also used for evaluation.

Thirdly, we use the rate and prominence features, together with many other acoustic features, to automatically detect utterance boundaries and disfluencies. We extract features from pitch breaks and their surrounding regions. This process includes a piecewise linear stylization of the pitch contour. A rule-based approach and a machine learning approach are proposed and compared on various boundary and disfluency events.

Lastly, we use all of the above features to study speech act detection, specifically distinguishing between wh-questions and statements. The prominence of the first two syllables and their timing are found to be useful for this task.
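
The first step above (find syllable nuclei, then divide their count by the length of the speech) can be illustrated with a minimal sketch. This is not the algorithm from the thesis: it assumes a precomputed energy envelope with one value per frame, and it stands in a moving average and simple thresholded peak picking for the spectral, temporal and smoothing stages. The function names and the parameter values (`smooth_width`, `threshold`, `min_gap`) are hypothetical and hand-picked, not the Monte Carlo optimized settings.

```python
def smooth(signal, width):
    """Centered moving average; `width` is the window size in frames."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half):min(len(signal), i + half + 1)]
        out.append(sum(window) / len(window))
    return out


def count_nuclei(envelope, threshold, min_gap):
    """Return indices of local maxima that reach `threshold` and lie
    at least `min_gap` frames after the previously accepted peak."""
    peaks = []
    for i in range(1, len(envelope) - 1):
        is_peak = envelope[i] > envelope[i - 1] and envelope[i] >= envelope[i + 1]
        if is_peak and envelope[i] >= threshold:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks


def speech_rate(envelope, frame_rate_hz, smooth_width=3, threshold=0.3, min_gap=5):
    """Estimated syllables per second: nucleus count / duration in seconds.
    All default parameter values are illustrative assumptions."""
    nuclei = count_nuclei(smooth(envelope, smooth_width), threshold, min_gap)
    duration_s = len(envelope) / frame_rate_hz
    return len(nuclei) / duration_s


def bump_envelope(n_frames, centers, half_width=5):
    """Synthetic test envelope: one triangular bump per 'syllable'."""
    env = [0.0] * n_frames
    for c in centers:
        for i in range(max(0, c - half_width), min(n_frames, c + half_width + 1)):
            env[i] = max(env[i], 1.0 - abs(i - c) / (half_width + 1))
    return env


if __name__ == "__main__":
    # One second of synthetic "speech" at 100 frames/s with three bumps.
    env = bump_envelope(100, [20, 50, 80])
    print(speech_rate(env, frame_rate_hz=100.0))
```

On the synthetic envelope the sketch finds three nuclei in one second, giving 3.0 syllables per second. Real speech would need a more careful envelope and parameter choice, which is the role the abstract assigns to the optimization and sensitivity analysis.
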
Keywords/Search Tags: Speech, Features, Prominence