
A Contrastive Study On Rater Performance Between Native And Nonnative English Raters In English Writing Assessment

Posted on: 2013-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hu
Full Text: PDF
GTID: 2235330374490673
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Whether nonnative English raters (NNE) and native English raters (NE) consistently assign valid scores in writing performance assessment is a concern for many professionals in the field, since rater performance plays a major role in the fairness and validity of writing tests. This study mainly explores differences in rater performance between NNE and NE raters who used the same holistic scale to assess the English writing of ESL students. Four NNE raters and four NE raters were recruited to evaluate 447 essays written simultaneously by ESL students at the International College of Hunan University.

The data were analyzed with the FACETS software, based on the many-facet Rasch model within Item Response Theory, to investigate differences between the two groups in rater severity, intra-rater consistency, inter-rater consistency, and bias interactions towards examinees at different ability levels (in logits). The results are as follows. First, the raters differed from one another in severity, and there were significant differences in the holistic ratings awarded by the two groups; intra-rater consistency was acceptable with the exception of one NE rater. Second, generally speaking, the NNE raters were far more lenient than the NE raters in awarding holistic ratings. Furthermore, the two groups differed slightly in intra-rater consistency, with the NE raters showing higher internal consistency, whereas inter-rater consistency was much greater for the NNE raters than for the NE raters. Lastly, the two groups exhibited different bias interactions towards examinees at different ability estimates. They showed no difference in bias interactions towards examinees of extremely high ability (above 2.00 logits) or extremely low ability (below -2.00 logits), and both maintained much higher consistency towards these high- and low-ability examinees. However, the NE raters tended to be more lenient than the NNE raters towards examinees with ability logits between 1.00 and 1.99, and they showed many large bias interactions towards examinees of middle ability, which indicated considerable confusion in assessing those examinees' writing.

Although the NE raters were more severe than the NNE raters, it cannot be concluded that the NNE raters were unqualified, because the NNE raters showed better reliability and fewer significant bias interactions towards examinees of different ability levels. This study has important implications for English writing testing and teaching. In addition, applying FACETS to compare the rater performance of two groups in writing assessment is a novel attempt.
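For reference, a standard formulation of the many-facet Rasch model underlying FACETS is sketched below. The facets shown (examinee ability, rater severity, and rating-category thresholds) are assumed from the abstract; the thesis may specify additional facets or a different rating-scale structure.

\log\!\left(\frac{P_{njk}}{P_{nj(k-1)}}\right) = \theta_n - \alpha_j - \tau_k

Here P_{njk} is the probability that examinee n receives category k from rater j on the holistic scale, \theta_n is the ability of examinee n (in logits), \alpha_j is the severity of rater j, and \tau_k is the threshold for being awarded category k rather than k-1. Estimates of \alpha_j ground the severity and bias-interaction comparisons reported above.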
Keywords/Search Tags: Multi-facet Rasch Model, Rater performance, Severity, Consistency, Bias analysis