
An Empirical Case Study On Validating Zipf’s Law Of Measuring High School Students’ Productive Vocabulary

Posted on: 2015-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Deng
GTID: 2297330431461008
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Productive vocabulary refers to the words that learners can use freely in speaking and writing tasks; compared with receptive vocabulary, it better reflects learners’ actual lexical proficiency. In past research, however, most tools have been designed to estimate receptive vocabulary, and very few are available for calculating the size of the vocabulary that students produce freely. The author therefore introduces Zipf’s law, a power law that has proved valid for studying word frequency in many fields, to estimate free productive vocabulary size in students’ actual writing. Since very few studies on this subject can be found, this paper aims to take a first step toward studying the validity of using Zipf’s law to measure productive vocabulary. Comparing the estimates derived from Zipf’s law with the core vocabulary requirement set for high school students in Shanghai by The Basic Teaching Requirements of the English Course for Senior High School indicates whether the method is useful in vocabulary research. If the results show that Zipf’s law has relatively high validity and reliability in estimating productive vocabulary size, it would contribute substantially to the study of vocabulary estimation and to vocabulary teaching and learning. If not, the study can still provide useful information for future research. Research question: Is Zipf’s law applicable to estimating the productive vocabulary size of high school students in Shanghai? Research design: The subjects are grade 1 and grade 2 students, chosen from different levels of language proficiency, at senior high schools in Pudong District of Shanghai.
Productive vocabulary size was estimated by analyzing the occurrences, in the students’ free writing, of the first 1,000 most frequent words of the LFP (Lexical Frequency Profile) and applying equations from Zipf’s law. Comparing the estimates with The Basic Teaching Requirements showed the following. In the first round of testing, 5 students from School A, which is above the intermediate level in Pudong District, were assessed with Zipf’s law. The estimated vocabulary size for these students was far beyond the level required by The Basic Teaching Requirements and differed greatly from what we expected, since it is impossible for senior high school students to reach a productive vocabulary size of about 4,000 words in such a short time. Zipf’s law therefore did not show high validity in this round. On this basis alone, we could not yet draw a conclusion about whether the law is suitable for high school students, so we carried out a second round of testing to verify its validity. In this round, 2 groups of 15 senior high school students in grade two from School B (one of the top schools) and School C (one of the lower-level schools) in Pudong District were tested. This time, we used writing that had been assigned as homework to estimate their vocabulary size with Zipf’s law. According to the results, 15 students in the test reached a productive vocabulary size of around 1,600 words, which is close to the requirement of The Basic Teaching Requirements. We therefore believe that this round is valid to some extent, which suggests that Zipf’s law can be used validly under some circumstances when certain controls are applied. However, two rounds of testing alone cannot establish the practical value of using Zipf’s law.
Because vocabulary estimation with Zipf’s law relies mainly on a large amount of linguistic data, and no such corpus was available to us for the experiment, we believe it was necessary to stop the study at this point. Based on the two rounds of testing, the following conclusion can be drawn at this stage: to apply Zipf’s law fully, many factors must be taken into consideration, among which the source of the corpus and the size of the sample text are the major factors influencing its validity and reliability. Reflections on the two rounds of testing are as follows. Two factors may have caused the failure of the pretest: one is the limited size of the sample texts, which could influence the result; the other is that we did not control the conditions under which students wrote. More specifically, students may have received interventions from their teacher before the test, which could have led them to use words from the textbook frequently. The words they adopted from the textbook were not words they would naturally produce but words produced through intervention outside the test. These words would therefore distort the performance of Zipf’s law, leading to an increase in high-frequency vocabulary and making the estimated productive vocabulary size far larger than we expected. According to the study, Zipf’s law is an empirical law derived by foreign scholars from large amounts of linguistic data. When used to estimate the vocabulary of foreign learners, it would work with high validity. For learners in China, however, the case is different: their vocabulary size is relatively small, and their vocabulary-learning environment differs from that of students abroad, which makes the results less stable. Moreover, as a postgraduate research project, this study did not have adequate conditions for building such a large corpus.
Future studies will require a large corpus of millions of words. The high-frequency wordlists used in this study were also built on vocabulary used by foreign students, and some frequent words in the lists are not necessarily appropriate for analyzing Chinese students. Researchers and teachers should therefore take the actual learning environment of learners in China into consideration when producing wordlists for them. High-frequency words that are appropriate for, and used by, most learners at different levels of language proficiency should be considered as well. All of the above problems, which influence the performance and estimates of Zipf’s law and lead to unstable results, deserve attention from researchers. Future studies should recognize these problems and address them in a reasonable way. Only then can the validity of Zipf’s law in measuring the productive vocabulary size of Chinese learners be truly tested.
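As a rough illustration of the rank-frequency relationship the abstract relies on, the sketch below fits the common form of Zipf’s law, f(r) ≈ C / r^s, to the word counts of a text by least squares on the log-log scale. It is not the estimation procedure used in the thesis, whose equations are not given in this abstract; the function names `rank_frequency` and `fit_zipf` and the simple tokenizer are assumptions made only for this example.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word frequencies sorted from most to least frequent.

    Tokenization is a deliberately simple assumption: lowercase
    alphabetic strings (apostrophes allowed) count as words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def fit_zipf(freqs):
    """Fit log f = log C - s * log r by ordinary least squares.

    `freqs` holds the frequency of the word at rank 1, 2, 3, ...
    Returns the pair (s, C) of the fitted power law f(r) = C / r**s.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two ranks to fit a line")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


if __name__ == "__main__":
    # Hypothetical stand-in for a student's composition.
    sample = "the cat saw the dog and the dog saw the cat and the bird"
    s, c = fit_zipf(rank_frequency(sample))
    print(f"fitted exponent s = {s:.2f}, constant C = {c:.2f}")
```

On a text as short as a single composition, the fitted values can vary considerably from sample to sample, which is consistent with the abstract's observation that sample size strongly affects the reliability of estimates based on Zipf’s law.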
Keywords/Search Tags: vocabulary knowledge, measuring productive vocabulary, Zipf’s law, LFP