
Validation Of An Analytic Rating Scale In Writing Assessment Via MFRM And TAPs Based Evidence

Posted on: 2019-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Cai
GTID: 2405330596461167
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
The writing test, an indispensable component of language performance assessment, has a place in numerous domestic and overseas language testing systems. However, it is still confronted with problems of inequity and unreliability generated by the subjective factors involved in writing assessment. Factors such as writing tasks, the testing environment, and scoring procedures may exert a negative influence on test outcomes; among these, scoring procedures are one of the most important considerations. It is therefore essential to improve scoring validity in order to address these problems (Chen Jianlin 2016). A primary parameter for measuring scoring validity is the validity of the rating scale (Shaw 2007). The rating scale is the only tool with which raters evaluate candidates' writing ability. A robust rating scale should enable raters to differentiate candidates' writing ability clearly. The requisite writing constructs and reasonably set categories in a rubric help raters reach agreement and remain consistent during the rating process, and thus introduce fewer errors into the rating results.

Studies on writing assessment and scoring validity have attracted researchers' attention at home and abroad in recent years. However, validation studies of rating scales from the perspective of scoring outcomes remain scarce. Extant research on rating outcomes mostly focuses either on comparisons among different rating scales or on the interaction between raters and the rating scale, paying less attention to the analytic rating scale itself. Moreover, previous studies of rating scales have seldom collected multiple sources of validity evidence through both qualitative and quantitative approaches.

Because analytic rating scales have good diagnostic qualities, providing writing instruction with more detailed feedback and helping teachers evaluate students' writing abilities precisely in all aspects, this study focuses on the validation of an analytic rating scale for a writing assessment. Informed by modern validity theory (Messick 1989), it investigates both the scoring outcomes and the rating process, analyzing evidence from both the Multi-Facet Rasch Model (MFRM) and think-aloud protocols (TAPs), with the purpose of identifying deficiencies in the analytic rating scale and monitoring raters' behavior when applying it. The study is also expected to provide constructive suggestions for the revision of the analytic rating scale.

In this study, a College English writing test was conducted, and 55 writing samples (5 essays for rater training and 50 essays for actual rating) were collected through Pigaiwang. Five raters were involved in the rating task. Each rater scored the same set of 50 writing samples independently according to the same analytic rating scale with four dimensions, namely the lexical level, the content level, the syntactic level, and cohesion and coherence. Throughout the scoring process, the raters were asked to think aloud, verbally reporting their understanding, reflection, and evaluation of the rating scale. A Multi-Facet Rasch Model analysis was conducted on the raw rating scores.

Based on both the qualitative and quantitative analyses, the research reached the following results:
1) The rating scale can efficiently differentiate examinees' writing abilities. The 50 candidates' writing abilities are above average and fall into about five levels.
2) The four dimensions are basically reasonably designed, but some descriptors may be overlapping, irrelevant, ambiguous, or mutually interfering.
3) The categories in the analytic rating scale need adjustment. Some categories are either unused or overused; some are disordered in difficulty; and the distances between adjacent categories are not always symmetrical.
4) The five raters showed significant variance in severity. All raters but one maintained high inter-rater reliability and self-consistency, and a central tendency occurred occasionally.

According to the results of the data analysis, this research directly reveals deficiencies in the scoring scale itself and detects subjective errors occurring in the rating process. It helps put forward specific and effective suggestions on how to revise the rating scale. The analysis of raters' behavior offers further inspiration for rater training. The four-dimension design of the rating scale and the raters' different emphases contribute to innovation in the teaching of English writing. This research enriches modern validity theory, provides insight for a validation framework for English writing rubrics, and offers a reference for other language performance assessments and future related studies.
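The MFRM analysis described above models each awarded score as a function of examinee ability, rater severity, dimension difficulty, and category thresholds. The sketch below is a minimal illustration of how a Many-Facet Rasch (rating scale) model turns these facet measures into category probabilities; all parameter values are hypothetical and are not the estimates reported in this thesis:

```python
import math

def mfrm_category_probs(ability, rater_severity, dim_difficulty, thresholds):
    """Category probabilities under a Many-Facet Rasch rating scale model.

    The log-odds of scoring in category k rather than k-1 is:
        ability - rater_severity - dim_difficulty - thresholds[k-1]
    """
    # Cumulative logits: category 0 is the reference with logit 0
    logits = [0.0]
    total = 0.0
    for tau in thresholds:
        total += ability - rater_severity - dim_difficulty - tau
        logits.append(total)
    exps = [math.exp(v) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical facets: an above-average examinee, a somewhat severe rater,
# an average-difficulty dimension, and three thresholds (4 score categories)
probs = mfrm_category_probs(ability=1.0, rater_severity=0.5,
                            dim_difficulty=0.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

In a full analysis these parameters are estimated jointly from the rating data (e.g. with Facets or an R package such as TAM), which is what allows rater severity to be separated from examinee ability, as in result 4) above.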
Keywords/Search Tags:writing assessment, analytic rating scale, validation, Multi-Facet Rasch Model, TAPs