A segmental approach to automatic language identification

Posted on: 1994-05-25
Degree: Ph.D.
Type: Dissertation
University: Oregon Graduate Institute of Science and Technology
Candidate: Muthusamy, Yeshwant Kumar
Full Text: PDF
GTID: 1475390014492248
Subject: Computer Science
Abstract/Summary:
Automatic language identification is the problem of identifying the language being spoken from a sample of speech by an unknown speaker. A segmental approach to automatic language identification is based on the assumption that the acoustic structure of languages can be estimated by segmenting speech into phonetic categories. Language identification can then be achieved by computing features within and across segments that describe the phonetic and prosodic characteristics of individual languages, and using these feature measurements to train a classifier to distinguish between the languages. Recognizing the difficulties involved in the development of a phonetically labeled corpus of speech, we have applied this approach using broad phonetic categories.

This dissertation addresses the following questions: What acoustic, broad phonetic and prosodic information is needed to achieve automatic identification of languages? What is the best way to present this information to neural network classifiers? What is the level of language identification possible given only this information?

In preliminary research, this broad phonetic approach was applied to a four-language (English, Japanese, Mandarin and Tamil) corpus of high quality speech. The results of this research were sufficiently promising to merit further investigation of the approach with a ten-language corpus of telephone speech consisting of mostly fluent speech from 90 speakers each of English, Farsi, French, German, Japanese, Korean, Mandarin, Spanish, Tamil and Vietnamese.

Several features based on pairs and triples of broad phonetic categories were evaluated. Pitch-based features were found to perform the worst, while features based on pairs of broad phonetic categories performed the best.

Perceptual experiments were also conducted, in which trained listeners identified excerpts of speech of one-, two-, four-, and six-second durations as one of the ten languages. The results revealed that for some languages, such as Korean, Farsi and Vietnamese, identification performance was poor regardless of the duration of the excerpts.

The automatic identification results indicate that while broad phonetic categories do possess language discriminatory information, the level of identification performance possible with broad phonetic information alone leaves much to be desired. Information at the phonemic or phonetic level might be required to distinguish between languages with greater accuracy.
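
As a rough illustration of what a feature based on pairs of broad phonetic categories could look like, the following minimal sketch counts adjacent category pairs in a segmented utterance and normalizes them into a fixed-length vector suitable as classifier input. The category labels (`VOC`, `FRIC`, `STOP`, `SIL`), the function name, and the normalization are assumptions made for illustration only; the abstract does not specify the dissertation's actual category set or feature definitions.

```python
from collections import Counter
from itertools import product

# Hypothetical broad phonetic category labels (an assumption for illustration;
# the abstract does not list the categories actually used).
CATEGORIES = ("VOC", "FRIC", "STOP", "SIL")


def pair_frequency_features(segments, categories=CATEGORIES):
    """Return relative frequencies of adjacent category pairs.

    `segments` is the sequence of broad category labels produced by a
    segmenter. The output has one entry per ordered pair of categories,
    in the fixed order given by itertools.product. Pairs containing a
    label outside `categories` still count toward the total but have no
    slot in the output vector.
    """
    pairs = Counter(zip(segments, segments[1:]))
    total = sum(pairs.values())
    return [
        pairs[p] / total if total else 0.0
        for p in product(categories, repeat=2)
    ]


# Example: a short utterance already segmented into broad categories.
segments = ["SIL", "STOP", "VOC", "FRIC", "VOC", "SIL"]
features = pair_frequency_features(segments)
```

A vector like `features` could be computed per utterance and passed to any classifier; features over triples would follow the same pattern with `repeat=3` and a three-way `zip`.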
Keywords/Search Tags:Language, Automatic, Phonetic, Approach, Speech, Information