Rater Effects In English Writing Test: A Comparative Study Of Holistic And Analytic Scorings

Posted on: 2011-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: J M Hong
Full Text: PDF
GTID: 2155360302992028
Subject: English Curriculum and Pedagogy
Abstract/Summary:
Rating scale types, such as holistic and analytic scales, and rater effects, such as severity or leniency, consistency, and bias, are sources of variance in observed ratings and thus influence the appropriateness of score interpretations.

This thesis investigated how different rating scale types and rater effects (i.e., severity, consistency, and bias or interactions) affect the rating procedure and ultimately the scores, through a comparative study of holistic and analytic scoring. The study has implications for the choice of rating scales, for rater training, and for the implementation of rating operations, with the aim of improving English writing assessment.

Four raters took part in the experiment. They were trained and asked to score the scripts using first a holistic scale and then an analytic scale. The data collected from the two scorings were analyzed separately with the programs SPSS and FACETS.

The SPSS analysis showed that, in the holistic scoring, raters differed significantly in the ordering of examinees but agreed on the mean scores; in the analytic scoring, raters agreed on the ordering, but one rater differed significantly from the others in the means; scores from the two ratings were correlated but differed in the means.

The FACETS analysis showed that, in both ratings, raters (a) differed significantly in severity; (b) were fairly consistent, that is, they ordered examinees consistently with the fair scores; (c) did not maintain a uniform level of severity or leniency across examinees or criteria, especially in the analytic scoring.

These results (a) support the notion that analytic scales are more likely than holistic scales to yield consistent ratings and are thus preferable on grounds of reliability, in particular when raters are less experienced or differently oriented; (b) imply that holistic scales are more appropriate for large-scale exams, on condition that all raters are experienced with holistic scoring; (c) suggest that, in either type of rating, test developers should monitor raters constantly for rater effects, using the FACETS program or other methods, to make the scores as fair as possible.
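The distinction the abstract draws between agreeing on the ordering of examinees and agreeing on the mean scores can be illustrated with a minimal sketch. It is not the thesis's SPSS procedure: the scores below are hypothetical toy data, the rater names and helper functions are invented for illustration, and Pearson correlation is assumed here as a stand-in for whatever ordering statistic the study actually used.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical data: three raters score the same five scripts.
scores = {
    "rater_A": [6, 8, 10, 12, 14],
    "rater_B": [8, 10, 12, 14, 16],  # same ordering as A, but 2 points more lenient
    "rater_C": [14, 6, 12, 8, 10],   # same mean as A, but a different ordering
}


def rater_means(scores):
    """Mean score per rater: differences here point to severity or leniency."""
    return {rater: mean(values) for rater, values in scores.items()}


def pairwise_correlations(scores):
    """Correlation per rater pair: low values point to disagreement on ordering."""
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(scores, 2)
    }


if __name__ == "__main__":
    for rater, m in rater_means(scores).items():
        print(f"{rater}: mean = {m}")
    for pair, r in pairwise_correlations(scores).items():
        print(f"{pair}: r = {r:.2f}")
```

In this toy data, rater_A and rater_B order the scripts identically but differ in their means (a severity difference), while rater_A and rater_C share a mean but order the scripts differently. The two kinds of disagreement are independent, which is why the abstract reports them separately for each scale.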
Keywords/Search Tags: English writing assessments, rater effects, holistic scales, analytic scales, FACETS