An Empirical Research On Gender Differences In Language Based On The Network Media Monitoring Corpus (Chinese)

Posted on: 2012-05-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Wang
GTID: 1225330335467561
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
Gender differences in language have attracted considerable attention in psychology, sociology, anthropology and communication studies. With the development of sociolinguistics and the rise of the feminist movement since the middle of the last century, research on gender differences in language has flourished. In China, specialized study did not begin until the late 1970s and early 1980s, and there are still few comprehensive, in-depth studies or books on the regularities of gender differences in word use. Most of the views and materials in the available studies are cited from Western scholars, whose findings were obtained in the social and cultural context of Western countries and therefore reflect language phenomena in mainstream Western society. They may not fit China's actual conditions, a point recognized by scholars of both Chinese linguistics and foreign linguistics.

This thesis is based on the Network Media Monitoring Corpus (Chinese). It concentrates on Chinese materials and on localized study, in line with the diverse, dynamic, micro-level and localized trend of contemporary research on gender differences in language. As an empirical study based on a large-scale corpus, it is also methodologically significant.

The study consists of nine chapters, as follows.

Chapter One first reviews the current state of research on gender differences in language in China and abroad, comments on its problems and shortcomings, and then points out the theoretical and practical significance of the present study. With the support of the National Language Resource Monitoring & Research Center (Network Media Language Branch Center), the thesis is the first study of Chinese ontology linguistics based on the largest-scale corpus.
Taking the study of gender and language in network language as a breakthrough point and drawing on a range of research tools and theoretical methods, the study is methodologically significant in showing how research on Chinese ontology linguistics can be carried out with data from the Network Media Monitoring Corpus. The chapter then introduces the construction of the Network Media Monitoring Corpus, the technical methods of data collection, especially the classification of blog texts, and the relevant statistical methods and software.

Chapter Two focuses on gender differences in the Chinese characters used in the Network Media Monitoring Corpus. The data are web pages from seven well-known Chinese blog websites in 2006, compiled by the Network Media Branch Center, comprising 4,938,041 texts and 1,937,732,982 characters. We carry out a statistical analysis of Chinese character use, focusing on what the genders share and where they differ. Analysis of data such as frequency and character types (distinct characters) leads to several conclusions. First, males use noticeably more character types than females. Second, the total number of character occurrences is higher for females than for males; females concentrate their frequent usage on fewer character types, whereas males' usage is more dispersed. The study is extended by comparing frequency ratios, high-frequency characters, low-frequency characters, shared characters, characters used exclusively by one gender, word-building ability and so on.

Chapter Three studies gender differences in the words used in the Network Media Monitoring Corpus. Building on the analysis of gender differences in character use, this chapter carries out a statistical survey and analysis of word use in order to identify gender differences and commonalities in overall word use.
The chapter not only examines commonalities and differences in overall word use from a macroscopic perspective, but also carefully contrasts, from a microscopic perspective, the internal differences in several main parts of speech used by males and females. It further examines frequency ratios, high-frequency words, low-frequency words, shared words, words used exclusively by one gender, and word length.

Chapter Four focuses on nouns, the part of speech with the most word types, the largest frequency ratio and the highest frequency. The contrastive study covers six subtypes: general nouns, time nouns, other proper nouns, personal names, location nouns and organization nouns. For nouns overall, males' number of word types, total frequency and average frequency are all lower than females'. The use of the different noun subtypes shows both commonalities and differences. Within the same subtype, males and females differ in word types, total frequency and average frequency.

Chapter Five carries out a contrastive study of gender differences in the use of single-word sentences. First we investigate the overall usage of single-word sentences and analyze it statistically. Then a case study is made of the "de" single-word sentence. Males and females show a series of commonalities and differences in their use of single-word sentences. The commonalities are that the difference in the total number of such sentences is not marked, and that single-word sentences consisting of a general noun or a verb are clearly more numerous than those of other parts of speech. Except for single-word sentences consisting of a location noun, the distribution of sentence counts across the other 14 types of single-word sentence is similar.
Beyond the commonalities mentioned above, single-word sentences consisting of different parts of speech show clear gender differences: for the same part of speech, both the number of single-word sentences and their proportion differ between males and females. The gaps in single-word sentence counts differ greatly across parts of speech, and the size of a gap does not correspond to the total number of sentences consisting of that part of speech. From the discrepancy ratio obtained for each part of speech, a curve of the degree of discrepancy can be drawn. The degree of gender discrepancy is largest for organization-noun single-word sentences and smallest for verb single-word sentences.

Chapter Six conducts a contrastive study of gender differences in the use of exclamatory sentences in the Network Media Monitoring Corpus. First, the overall difference in exclamatory sentence usage is investigated; then the gender differences in the mood particles of exclamatory sentences are analyzed in terms of distribution, frequency and sentence counts. We also verify the gender differences in exclamatory words with statistical methods and carry out a case study of "deshuo (的说)", a sentence-final mood particle characteristic of female internet users. Several important conclusions are reached. First, exclamatory sentences are used at a low rate in the blogs of both males and females, making up a very small part of all sentences in the blog corpus, and exclamatory sentences with mood particles are an important component of them. Second, females have more exclamatory sentences, and more exclamatory sentences with mood particles, than males, which shows that females tend to use more of both.
Third, the commonalities in the use of mood particles include similar word types and the same high-frequency words. Fourth, the number of exclamatory sentences containing the top 10 high-frequency mood particles shows a clear gender difference: the frequency ratio of females is much higher than that of males. The differences in sentence counts are largest for mood particles such as "le, a, ba (了、啊、吧)". When the degree of discrepancy in sentence counts is taken as the standard, the largest gender discrepancy is found in rarely used mood particles such as "yu (欤)" and "er (耳)", which occur in few sentences.

Chapter Seven contrasts the use of interrogative sentences in the Network Media Monitoring Corpus. The frequency ratio of interrogative sentences is low in the blogs of both males and females. Among interrogative sentences, wh-questions are the most numerous, "A-not-A" questions the second most, and alternative questions the fewest. In terms of proportion, males have a higher proportion of wh-questions than females, while females have a higher proportion of "A-not-A" questions and alternative questions than males. In questions with interrogative markers, females have more sentences than males; the gap is largest for "ma" and smallest for "la", and females use more interrogative pronouns in questions. Alternative questions show clear gender differences: in all three types of alternative question examined, females have more sentences than males. By contrast, the use of full-form "A-not-A" questions, abbreviated "A-not-A" questions and the three typical types of "A-not-A" question shows some commonalities.
The commonalities are that the three typical types of "A-not-A" question and the full-form "A-not-A" questions account for the largest proportion while abbreviated "A-not-A" questions account for a small proportion, and that females use more "A-not-A" questions of all three types than males.

Chapter Eight studies gender differences in topic selection in the blog corpus. Studies of gender differences in topic selection in China usually rely on the perspectives and data of Western scholars and lack close observation of Chinese corpora. This chapter first classifies 500,000 blog texts by males and females into 27 topics and then analyzes the selection tendencies of each gender. The five topics most discussed by males are tittle-tattle, family life, emotion and marriage, IT, and sports, which together account for 83% of the observed texts. This finding differs markedly from the view in previous foreign studies that males tend to attend to politics, law, sports and economics in casual chat. Topic selection is therefore related to occasion, and males tend to select different topics in different situations. The topics most discussed by females are family life, tittle-tattle, emotion and marriage, entertainment, star-chasing and so on, which account for 83.1% of the females' texts. Using several statistical methods, such as the Kolmogorov-Smirnov test, scatter diagrams and the nonparametric chi-square (χ²) test, we also verify that there is a significant difference between males' and females' topic selection.

Chapter Nine focuses on gender differences in verbosity in the blog corpus. Based on the classification of blog texts, we observe the verbosity of each of the 27 topics in the blogs of males and of females.
In males' blogs the fashion and consumption topic has the highest verbosity, and in females' blogs the IT topic does, while the job and employment topic has the lowest verbosity in the blogs of both. Except for the constellation and divination topic and the hairdressing and skin care topic, the verbosity of the other 25 topics is higher in females' blogs than in males'. The proportional discrepancy between males and females is most marked for the sex and physiology topic, while the difference in character counts is most marked for the military and national defense topic. To test whether the verbosity of the different topics differs significantly, in the statistical sense, between the blogs of males and females, we examine the gender difference in verbosity with several tests and parameters, such as a normality test, the paired t-test, the correlation coefficient and standard verbosity. The following conclusions are reached. The gender difference in verbosity is significant. First, the dispersion of males' verbosity is larger than that of females'. Second, males' verbosity fluctuates relatively more between topics, while females' verbosity is more stable. Third, there are clear gender differences in the distribution of standard verbosity across topics.
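
Chapter Nine names the paired t-test among its checks. The following is only a minimal sketch of that kind of test, not the thesis's own procedure. It assumes that verbosity is a single number per topic (for example, mean characters per post, which the abstract does not specify) and that each topic contributes one female/male pair. The function name `paired_t_statistic` and the sample values are hypothetical and are not data from the thesis.

```python
import math
import statistics


def paired_t_statistic(xs, ys):
    """Return (t, degrees_of_freedom) for a paired t-test on two equal-length samples."""
    if len(xs) != len(ys):
        raise ValueError("paired samples must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two pairs")
    # Work on the per-pair differences (here: one difference per topic).
    diffs = [x - y for x, y in zip(xs, ys)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up per-topic verbosity values, for illustration only (assumption).
female_verbosity = [820.0, 640.0, 910.0, 500.0, 730.0]
male_verbosity = [760.0, 655.0, 840.0, 470.0, 700.0]

t, df = paired_t_statistic(female_verbosity, male_verbosity)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t value would then be compared against the t distribution with `df` degrees of freedom to judge significance; that step is left out here so that the sketch needs only the standard library.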
Keywords/Search Tags: Network Media Monitoring Corpus, gender languages, Blog, statistics analysis, empirical research