Using natural language processing for child language analysis

Posted on: 2014-08-26
Degree: Ph.D
Type: Dissertation
University: The University of Texas at Dallas
Candidate: Hassanali, Khairun-nisa
Full Text: PDF
GTID: 1455390005492278
Subject: Computer Science
Abstract/Summary:
Language is an important part of a human being's life. Identifying language deficiencies, such as in the use of certain syntactic constructs and words, in children with Language Impairment (LI) will allow clinicians to create intervention programs that focus on these deficiencies. Most measures of language development and LI are computed manually, which is time-consuming. Automatic computation of language development measures will allow for a quicker and more extensive study of language samples. This dissertation presents the use of Natural Language Processing (NLP) techniques for child language analysis. We first explore the use of NLP in child language analysis from a syntactic perspective. We present AC-IPSyn, a system that automates the computation of the Index of Productive Syntax (IPSyn) (Scarborough, 1990), a metric that measures syntactic complexity in child language. AC-IPSyn performs at levels comparable to human scoring. We use syntactic features based on IPSyn in the automatic prediction of LI. We then conduct a study of grammatical errors in child language transcripts. We annotate child language transcripts for verb-related grammatical errors and present an automatic grammar error detection system. We then explore the use of NLP from the semantic aspect of language. The ability to produce coherent language is a hallmark of language development. We conduct a study of coherence using child language narratives. We annotate storytelling narratives for coherence, narrative structure, and narrative quality elements. We then use coherence and narrative-related features in the automatic prediction of coherence and LI. To avoid the labor-intensive process of manually annotating narrative structure, we explore automatic identification of topics from narratives. For this purpose, we use Latent Dirichlet Allocation (LDA), a topic modeling method. Our results show that LDA is useful for detecting the topics that correspond to the narrative structure. Coherence, narrative, and topic-related features were useful in the automatic prediction of coherence and LI. Finally, we generalize our work to second language assessment. We present an automatic scoring system for child speech on an assessment test for children studying English as a foreign language. We explore the use of speech, grammar, and coherence features. Our results show that automatic scoring of child language is promising.
Keywords/Search Tags: Language, Automatic, Coherence, Features, Explore