Hierarchical methods in automatic pronunciation evaluation

Posted on: 2010-01-23
Degree: Ph.D.
Type: Dissertation
University: University of Southern California
Candidate: Tepperman, Joseph
Full Text: PDF
GTID: 1445390002490182
Subject: Electrical engineering
Abstract/Summary:
Technology that can automatically categorize pronunciations and estimate scores of pronunciation quality has many potential applications, most notably for second-language learners interested in practicing their pronunciation along with a machine tutor, or for automating the standard assessments elementary school teachers use to measure a child's emerging reading skills. The many sources of variability in speech and the subjective perception of pronunciation make this a complex problem. Linguistic hierarchies (in speech production, perception, and prosodic structure) help to conceive of the variability as existing on multiple simultaneous scales of representation, and offer an explanatory order of precedence to those scales. These theories are beginning to gain widespread attention and use in traditional speech recognition, but experimenters in pronunciation evaluation have been slow to embrace them. This work proposes using theories of hierarchical structure in speech to inform the choice of computational framework and scale of analysis when performing automatic pronunciation evaluation, on the assumption that they will offer improvements over non-hierarchical methods and can be used to rate pronunciation with performance comparable to inter-human agreement.

Three example applications here illustrate novel hierarchical approaches over three different standard scales of analysis: the phoneme, the word, and the phrase. Each one makes use of hierarchical knowledge in at least two ways. First, the acoustic models for evaluation are defined on time-scales below the one of interest (similar to the common practice of using strings of phoneme models to represent words in speech recognition), based on a nested conception of parallel linguistic scales. Second, recognition results obtained from these models are aggregated in an ordered structure appropriate to the task, using a computational framework best suited to instantiate the hierarchy. Results show statistically significant improvements over baseline methods that neither use these novel time-scales for modeling variability nor make use of a structured hierarchy in combining the cues derived from those models.
Keywords/Search Tags: Pronunciation, Hierarchical, Methods, Evaluation, Scales, Models