The Voice Source in Speech Production: From Models to Applications

Posted on:2015-03-16

Degree:Ph.D

Type:Thesis

University:University of California, Los Angeles

Candidate:Chen, Gang

GTID:2478390017990849

Subject:Engineering

Abstract/Summary:

The voice source contains important lexical and non-lexical information. The non-lexical information can convey, for example, prosodic events, emotional status, as well as cues pertaining to the uniqueness of the speaker's voice. A better understanding, and eventually a better model, of the voice source would benefit various speech applications, such as speech recognition, speech synthesis, speaker identification, age/gender classification, and clinical assessment.

This dissertation has three main goals. The first is to better understand the voice source by analyzing images of the vocal folds from laryngeal high-speed videoendoscopy (HSV) recordings. A new automatic method is proposed to compactly summarize the overall spatial synchronization pattern of vocal fold vibration for the entire laryngeal area from HSV data. Additionally, a new measure is proposed to capture perceptually important variations in glottal area pulse shapes, which are extracted from HSV data.

The second goal is to study the acoustic consequence of a physiological vocal-fold vibration pattern (the glottal gap effect) and to apply the findings to a gender classification task on children's voices. Voice source related measures are found to improve classification accuracy, especially for younger (10-15 year old) speakers.

The third goal is to propose new voice source models and evaluate them in different applications. In the first application, a new source model and a noise-robust automatic source estimation algorithm are proposed to estimate the voice source from speech signals. Results in both clean and noisy conditions show that the proposed model and algorithm estimate the voice source signal accurately and robustly. The second application uses the proposed source model for vowel synthesis. Perceptual listening experiments show that the proposed model provides a better perceptual match to the target voice than traditional models do.

Keywords/Search Tags:

Voice, Model, Speech, Proposed

Related items

1	Distributed Speech Recognition And Voice XML Standard Language In Vivid-Ring Application
2	Research On Embedded Speech Synthesis Technology
3	Design And Realization Of One-Shot Vehicular Voice-User Interface System
4	The Voice Source in Speech Production: Data, Analysis and Models
5	Parametric model based speech enhancement
6	Minimizing state error propagation in low-bit rate speech codec for voice over IP
7	Acoustic analysis of voice/speech characteristics in nonsymptomatic gene carriers of Huntington's disease: Does the speech/voice differ from normal controls
8	Research On Speech Recognition Based On The Degree Of Fatigue
9	The Research And Application Practice Of The Evaluation Model Of Speech Quality Based On E-model
10	Research On Speech Recognition Using Voice Conversion Approach