
Lexical Research Based on a Spoken Corpus of 3-6-Year-Old Mandarin-Speaking Children

Posted on: 2011-08-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T X Zhang
GTID: 1115360302999784
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
This corpus-based study analyzes the language use of children aged 3 to 6, using word segmentation, annotation and word-frequency statistics. By analyzing the errors in children's language use, which reveal their language competence and language knowledge, we put forward our own views on Chomsky's hypothesis. The dissertation consists of the following five chapters.

Chapter one: introduction. We review earlier research and present the content, emphasis, difficulties and guiding theories of the study. Drawing on developmental child linguistics and the new topic proposed by Chomsky, we point out that the language errors children make are very useful for linguistic study.

Chapter two: construction of the corpus of Mandarin-speaking children aged three to six. This chapter describes the construction process in detail, including the principles of data collection. Following sampling principles, we selected forty Mandarin-speaking children aged three to six and video-recorded a one-hour conversation between each child and adults, yielding 40 hours of data. We then transcribed the recordings manually and revised the transcripts, obtaining about 550,000 characters, of which more than 100,000 are the children's own speech. These texts were segmented with the ICTCLAS software and proofread manually, and several databases were built from them, covering vocabulary, word frequencies, the distribution of word classes, word length and mean length of utterance.

Chapter three: quantitative analysis of notional words, including their word classes and the distribution of high-frequency words. The results show that the number of notional words increases with age, especially nouns, verbs, adjectives, quantifiers and adverbs, and that monosyllabic words outnumber disyllabic words.

Chapter four: statistics on prepositions, conjunctions, auxiliaries and modal particles. We find that function words are few in number but high in frequency, and that high-frequency function words are strongly concentrated. Some function words are used across word classes, such as "BA" (preposition, quantifier, noun and verb), "GEI" (verb, preposition and auxiliary) and "ZAI" (preposition, verb and adverb).

Chapter five: analysis of errors in children's language use. The errors show that children's word-formation ability and the development of their language competence are part of cognition, which is related to genetic factors. This analysis provides factual support for Chomsky's hypothesis. The analysis of word-use errors also shows that children's basic knowledge of morphology, syntax and collocation can serve as a prototype for acquiring and using other morphological or syntactic patterns. The errors reflect the children's grasp of language knowledge.

The innovation of this study lies mainly in three aspects.

First, we used sampled video recordings of 40 children aged 3 to 6 and transcribed them into 550,000 characters, including more than 100,000 characters of children's language. From these we built a corpus annotated with properties such as speaker, gender, age, word class and frequency.

Second, the description and analysis are based on the coded corpus, and the statistics cover word class, frequency of use, cumulative frequency and mean length of utterance.

Third, the lexical errors children make reflect the relationship between their cognitive and language competence. Errors caused by prototype generalization and simple analogy show that language acquisition is part of cognition. This helps in understanding Chomsky's hypothesis from a cognitive perspective.
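The statistics named in the abstract (word frequency, cumulative frequency, word-class distribution and mean length of utterance) can be illustrated with a minimal Python sketch. It is not the dissertation's actual procedure: it assumes the corpus has already been segmented and tagged, with each utterance stored as a list of (word, tag) pairs, and it counts utterance length in words, which may differ from the unit the author used. The function names, the toy utterances and the tag letters are hypothetical and are not taken from the corpus.

```python
from collections import Counter

# Toy, invented utterances (NOT from the dissertation's corpus).
# Each utterance is a list of (word, tag) pairs; the tag letters
# ("r" pronoun, "v" verb, "n" noun) are illustrative assumptions.
sample = [
    [("我", "r"), ("要", "v"), ("苹果", "n")],
    [("我", "r"), ("要", "v"), ("吃", "v"), ("苹果", "n")],
    [("我", "r"), ("吃", "v")],
]


def word_frequencies(utterances):
    """Return rows of (word, count, relative freq, cumulative freq),
    ordered from most to least frequent."""
    counts = Counter(word for utt in utterances for word, _tag in utt)
    total = sum(counts.values())
    rows = []
    running = 0
    for word, n in counts.most_common():
        running += n
        rows.append((word, n, n / total, running / total))
    return rows


def word_class_distribution(utterances):
    """Count tokens per word-class tag."""
    return Counter(tag for utt in utterances for _word, tag in utt)


def mean_length_of_utterance(utterances):
    """Mean utterance length in words (an assumption about the unit)."""
    if not utterances:
        return 0.0
    return sum(len(utt) for utt in utterances) / len(utterances)


if __name__ == "__main__":
    for row in word_frequencies(sample):
        print(row)
    print(word_class_distribution(sample))
    print(mean_length_of_utterance(sample))
```

Grouping the utterances by speaker, age or gender before calling these functions would give per-group tables of the kind the abstract describes.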
Keywords/Search Tags: child language, errors, language competence, language knowledge, corpus