
Towards social virtual listeners: Computational models of human nonverbal behaviors

Posted on: 2015-10-08
Degree: Ph.D
Type: Thesis
University: University of Southern California
Candidate: Ozkan, Derya
Full Text: PDF
GTID: 2475390020952070
Subject: Computer Science
Abstract/Summary:
Human nonverbal communication is a highly interactive process in which the participants dynamically send and respond to nonverbal signals. These signals play a significant role in determining the nature of a social exchange. Although humans can naturally recognize, interpret, and produce these nonverbal signals in social contexts, computers are not equipped with such abilities. Therefore, creating computational models for holding fluid interactions with human participants has become an important topic for many research fields, including human-computer interaction, robotics, artificial intelligence, and cognitive sciences. Central to the problem of modeling social behaviors is the challenge of understanding the dynamics involved in listener backchannel feedback (i.e., the nods and paraverbals such as "uh-huh" and "mm-hmm" that listeners produce as someone is speaking). In this thesis, I present a framework for modeling visual backchannels of a listener during a dyadic conversation. I address the four major challenges involved in modeling nonverbal human behaviors, more specifically listener backchannels:
(1) High Dimensionality: Human communication is a complicated phenomenon that involves many behaviors (i.e., dimensions) such as smiles, nods, hand movements, and voice pitch. A better understanding and analysis of social behaviors can be obtained by discovering the subset of features relevant to a specific social signal (e.g., backchannel feedback). In this thesis, I present a new feature ranking scheme that exploits the sparsity of probabilistic models when trained on human behavior problems. This technique gives researchers a new tool to analyze individual differences in social nonverbal communication. Furthermore, I present a feature selection approach that first looks at the important behaviors for each individual, called self-features, before building a consensus.
(2) Multimodal Processing: This high-dimensional data comes from different communicative channels (modalities) that contain complementary information essential to the interpretation and understanding of human behaviors. Effective and efficient fusion of these modalities is therefore a challenging task. If integrated carefully, different modalities have the potential to provide complementary information that will improve model performance. In this thesis, I introduce a new model called Latent Mixture of Discriminative Experts (LMDE), which can automatically learn the temporal relationship between different modalities. Since I train separate experts for each modality, LMDE is capable of improving prediction performance even with a limited amount of data.
(3) Visual Influence: Human communication is dynamic in the sense that people affect each other's nonverbal behaviors (e.g., gesture mirroring). Therefore, when predicting the nonverbal behaviors of a person of interest, the visual gestures of the second interlocutor should also be taken into account. In this thesis, I propose a context-based prediction framework that models the visual influence of an interlocutor in a dyadic conversation, even if the visual modality from the second interlocutor is absent.
(4) Variability in Human Behaviors: It is known that age, gender, and culture affect people's social behaviors. Therefore, there are differences in the way people display and interpret nonverbal behaviors. A good model of human nonverbal behaviors should take these differences into account. Furthermore, gathering labeled data sets is time-consuming and often expensive in many real-life scenarios. In this thesis, I use "wisdom of crowds," which enables parallel acquisition of opinions from multiple annotators (labelers). I propose a new approach for modeling wisdom of crowds called wisdom-LMDE, which is able to learn the variations and commonalities among different crowd members (i.e., labelers).
Keywords/Search Tags:Nonverbal, Human, Behaviors, Social, Models, Listener, Communication, Different