
Rater Effects and Behavior: Comparing Raters' Ratings in Different Rating Conditions in a Writing Assessment

Posted on: 2021-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Tan
GTID: 2415330626459495
Subject: Foreign Linguistics and Applied Linguistics

Abstract/Summary:
Argumentative essay writing has long been an effective measure of L2 writing proficiency, but the objectivity and fairness of its rating process remain controversial. Two main reasons account for this. First, because essay scoring is a complex and subjective process, most raters find it hard to avoid cognitive biases, subjective impressions, and rater effects. Second, many factors give rise to rater bias, and studies of rater effects have covered varied perspectives, including rater background, rating methods, and rating modes. With advances in technology and artificial intelligence, research on the objectivity and reliability of automated scoring has likewise diversified, but few studies have examined how rater effects change when raters score alongside a machine, or how such scoring affects rater behavior.

This study investigates the consistency of raters' ratings and the change in rater behavior under two rating conditions. By comparing the two conditions, it examines the reliability of raters' ratings when scoring with a machine and whether the rating condition is effective in minimizing the influence of the language trait on the other traits; that is, whether a halo effect appears under Condition 2.

Five raters were invited to rate undergraduate argumentative writing under the two conditions, presented in random order with the same rubric. Under Condition 1, raters scored all traits: language, organization, and content. Under Condition 2, raters scored only the organization and content traits, while iWrite scored the language trait. All rating data were analyzed with many-facet Rasch measurement (MFRM). To probe rater behavior and perception further, the author also conducted two interviews with each rater, asking them to review the whole rating process and explain changes in their rating behavior and cognition. The interview data were then transcribed, coded, and classified for analysis.

Comparing the statistical and interview data across the two conditions showed that all raters scored reliably and consistently in both conditions but differed in severity. Raters tended to be harsher under Condition 2, especially on the content trait. In addition, the content and organization traits were better differentiated, and no significant halo effect appeared under Condition 2. Overall, most raters preferred Condition 2, in which they rated only organization and content; they believed that iWrite scored the language trait more efficiently and fairly, and that the halo effect stemming from the language trait was thereby minimized.
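The abstract does not reproduce the model specification, but MFRM analyses of multi-rater, multi-trait designs like this one conventionally use Linacre's many-facet Rasch rating scale model. The sketch below is the standard three-facet formulation; the assignment of facets to writers, traits, and raters is an assumption based on the study design, not taken from the thesis:

\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k

where P_{nijk} is the probability that essay n receives category k (rather than k-1) on trait i from rater j, B_n is the proficiency of writer n, D_i is the difficulty of trait i (language, organization, or content), C_j is the severity of rater j, and F_k is the difficulty of the step from category k-1 to k. The rater severity estimates C_j under each condition are what support the harshness comparison reported in the abstract.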
Keywords/Search Tags: Rater effects, rater behavior, halo effect, automated scoring