
On The Scoring Validity Of Writing Assessment In PETS3

Posted on: 2008-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zhao
Full Text: PDF
GTID: 2155360215464192
Subject: English Language and Literature
Abstract/Summary:
Writing assessment is an indispensable part of language testing: it appears in almost all large-scale tests, such as CET, TEM, and TOEFL, and PETS is no exception. Yet its scoring reliability, or scoring validity, is difficult to ensure, because it is threatened by many factors, among which rating is one of the most important. As a large-scale test, PETS has attracted, and will continue to attract, considerable attention, but to date there has been no study of the scoring validity of its writing assessment.

This thesis investigates the scoring validity of the essays in PETS3. The general research question is: to what extent are the essays in PETS3 valid in terms of scoring validity? To address this question fully, two sub-questions are investigated: (1) To what extent are the original raters consistent with the standardized raters in the assessment of the PETS3 essays of March 2004 and those of March 2005? (2) To what extent do the original PETS3 essay scores of March 2004 and those of March 2005 correlate with each other?

In this study, essays were systematically sampled from the PETS corpus. The sampled essays were re-rated to obtain standardized scores, which were then compared with the original scores. The data were analyzed using correlation and t-test procedures.

Statistical analysis of the data yields the following findings:

1) The original ratings and the standardized ratings of the March 2004 PETS3 essays are significantly correlated (r = 0.869, p < 0.01), and there is no significant difference between their mean scores. However, the mean score of the original ratings is higher than that of the standardized ratings, which suggests that although the original raters were consistent with the standardized raters, they were slightly lenient when rating the essays.

2) The original ratings and the standardized ratings of the March 2005 PETS3 essays are also significantly correlated (r = 0.798, p < 0.01), but their mean scores differ significantly at the 0.01 level. In addition, the mean score of the original ratings is again the higher of the two. This suggests that although the original raters were consistent with the standardized raters in rank ordering, they were markedly more lenient than the standardized raters.

3) The mean score and the standard deviation of the original ratings of the March 2004 essays are lower than those of March 2005. The two years' scores are not significantly correlated, and their mean scores differ significantly, which suggests that ratings of PETS3 essays may not be equivalent across years.

The qualitative data indicate that the March 2005 PETS3 test prompts are not specific about the writing procedures and scoring methods that test takers should follow, so some test takers could not respond accurately. In addition, the raters are given no guidance on how to weight the scores.

This study suggests that the PETS Testing Center should not only keep its item banks stable but also maintain a qualified, stable, and fair rating team, so that rating scores are stable and reliable. Drawing on the research findings of CET and TEM, the PETS Testing Center should also strengthen item development and ensure that items are designed in more scientific and standardized ways.
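The abstract describes the analysis only in outline. As an illustration of the correlation and paired t-test procedures mentioned above, the following is a minimal sketch in Python using SciPy; the score arrays are hypothetical placeholders, not data from the study.

import numpy as np
from scipy import stats

# Hypothetical original and standardized scores for the same sampled essays;
# the study's real data come from re-rated PETS3 scripts.
original = np.array([8, 7, 9, 6, 8, 7, 10, 9, 6, 8], dtype=float)
standardized = np.array([7, 7, 8, 6, 8, 6, 9, 9, 5, 8], dtype=float)

# Pearson correlation: are the two sets of raters consistent in rank ordering?
r, p_r = stats.pearsonr(original, standardized)
print(f"Pearson r = {r:.3f}, p = {p_r:.3f}")

# Paired-samples t-test: do the mean scores of the two ratings differ significantly?
t, p_t = stats.ttest_rel(original, standardized)
print(f"t = {t:.3f}, p = {p_t:.3f}")

# A positive mean difference would indicate leniency in the original ratings.
print(f"mean difference (original - standardized) = {(original - standardized).mean():.2f}")

For the inter-year comparison in finding 3, where the 2004 and 2005 essays are different samples, an independent-samples t-test (stats.ttest_ind) would presumably be the analogous procedure, though the thesis does not specify this.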
Keywords/Search Tags: writing assessment, scoring validity, PETS3