
Generating Gestures from Speech for Virtual Humans Using Machine Learning Approaches

Posted on: 2015-05-25
Degree: Ph.D.
Type: Thesis
University: University of Southern California
Candidate: Chiu, Chung-Cheng
Full Text: PDF
GTID: 2478390017993503
Subject: Computer Science
Abstract/Summary:
There is a growing demand for animated characters capable of simulating face-to-face interaction using the same verbal and nonverbal behavior that people use. For example, research in virtual human technology seeks to create autonomous characters capable of interacting with humans through spoken dialog. Further, as video games have moved beyond first-person shooters, gameplay increasingly centers on social interaction, with virtual characters interacting with each other and with the player's avatar. Common to these applications is the expectation that autonomous characters exhibit behavior resembling that of a real human.

The focus of this work is generating realistic gestures for virtual characters, specifically the coverbal gestures performed in close relation to the content and timing of speech. A conventional approach to animating gestures is to construct gesture animations for each utterance the character speaks, either by handcrafting animations or by using motion capture. This approach is costly in time and money, and it is not feasible at all for characters designed to generate novel utterances on the fly.

This thesis applies machine learning to learn a data-driven gesture generator from human conversational data, one that can generate behavior for novel utterances and thereby save development effort. The work assumes that learning to generate gestures from speech is a feasible task. The framework exploits a gesture classification scheme to provide domain knowledge about gestures and to help the machine learning models realize gesture generation from speech. It decomposes the overall learning problem into two tasks: one learns the relation between speech and gesture classes, and the other generates gesture motion conditioned on those classes (see the sketch below). To facilitate training, this research used real-world conversation data from dyadic interviews together with motion capture data of humans gesturing while speaking. The evaluation experiments assess the effectiveness of each component by comparison with state-of-the-art approaches, and evaluate overall performance through studies involving human subjective ratings. An alternative machine learning framework was also proposed for comparison with the framework presented in this thesis. Overall, the experiments show that the framework outperforms state-of-the-art approaches.

The central contribution of this research is a machine learning framework capable of learning to generate gestures from conversation data that can be collected from different individuals while preserving the motion style of specific speakers. In addition, the framework allows the incorporation of data recorded through other media, thereby significantly enriching the training data. The resulting model provides an automatic approach for deriving a gesture generator that realizes the relation between speech and gestures. A secondary contribution is a novel time-series prediction algorithm that predicts gestures from the utterance. This algorithm can address time-series problems with complex input and can be applied to other applications that require classifying time-series data.
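The abstract describes the two-stage decomposition only at a high level. The following is a minimal Python sketch of that idea, not the thesis's actual algorithms: the gesture class set, the linear scorer in stage 1, and the trajectory generator in stage 2 are all hypothetical placeholders, assuming only numpy.

```python
# Hypothetical sketch of the two-stage decomposition: stage 1 maps speech
# features to a gesture class; stage 2 generates motion conditioned on that
# class. All names and models here are illustrative stand-ins.
import numpy as np

GESTURE_CLASSES = ["beat", "deictic", "iconic", "rest"]  # assumed label set

def classify_gesture(speech_features: np.ndarray, weights: np.ndarray) -> str:
    """Stage 1: predict a gesture class from per-frame speech features
    (e.g. prosodic descriptors), using a linear scorer pooled over time
    as a placeholder for the thesis's time-series prediction model."""
    scores = speech_features @ weights        # (frames, classes)
    mean_scores = scores.mean(axis=0)         # pool over the utterance
    return GESTURE_CLASSES[int(np.argmax(mean_scores))]

def generate_motion(gesture_class: str, n_frames: int, rng) -> np.ndarray:
    """Stage 2: produce a joint-angle trajectory for the predicted class.
    A real system would generate from motion-capture-trained models; here
    a smooth sinusoidal trajectory stands in."""
    phase = rng.standard_normal(10)           # 10 hypothetical joints
    t = np.linspace(0.0, 1.0, n_frames)[:, None]
    amplitude = 0.1 if gesture_class == "rest" else 1.0
    return amplitude * np.sin(2.0 * np.pi * t + phase)

rng = np.random.default_rng(0)
speech = rng.standard_normal((50, 8))         # 50 frames, 8 speech features
weights = rng.standard_normal((8, len(GESTURE_CLASSES)))
cls = classify_gesture(speech, weights)
motion = generate_motion(cls, n_frames=50, rng=rng)
print(cls, motion.shape)                      # e.g. "iconic (50, 10)"
```

Splitting the problem this way means the speech-to-class mapping and the class-to-motion mapping can be trained and evaluated separately, which is the point of the decomposition described above.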
Keywords/Search Tags: Gestures, Machine learning, Speech, Human, Data, Virtual, Characters