
Detecting And Measuring Rater Effects In A Pragmatics Test

Posted on: 2014-07-12
Degree: Master
Type: Thesis
Country: China
Candidate: L J Xie
GTID: 2255330422955943
Subject: Foreign language teaching techniques and evaluation

Abstract/Summary:
The requirement that ESL learners apply their learned skills in real-life situations has driven the field of language testing toward performance-based testing. This requirement has yielded various measures for testing ESL learners' interlanguage pragmatic knowledge, among which the Written Discourse Completion Task (WDCT) is frequently used for data collection and testing purposes. Such assessments are carried out by human raters according to rubrics; human raters, however, may introduce errors into the final scores. Rater effect, one type of rating error, has been the focus of much previous research. This study explores what common patterns of rater effect may exist in a WDCT pragmatics test by applying a many-facet Rasch measurement (MFRM) approach. The author uses qualitative and quantitative methods to analyze the data and to probe the raters' decision-making processes, with the aim of identifying the factors that account for their rating behaviors and providing useful suggestions for rater training.

The thesis first reviews research on communicative competence and performance tests. It then gives a brief introduction to the rater effects proposed by Myford and Wolfe (2003) and surveys the main methods for examining rater effects in performance tests. The MFRM approach is adopted for this research.

In this study, 6 university teachers (4 Chinese and 2 foreign) were invited to rate a WDCT test administered to 38 Chinese EFL university students (15 male, 23 female) aged 19 to 21. The raters rated the test independently, and the scores were analyzed with a many-facet Rasch model. Afterwards, recall interviews were carried out with each of the 6 raters in order to analyze the rating results qualitatively. The study examined four facets: the items, the examinees, the raters, and the traits.
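The four-facet analysis described above rests on the many-facet Rasch model, in which the log-odds of an examinee receiving score k rather than k−1 equal examinee ability minus item difficulty, rater severity, and the step (threshold) difficulty. The sketch below is illustrative only: it is not the thesis's actual analysis (MFRM studies of this kind are typically run with dedicated software), and the function name and all parameter values are made up for demonstration.

```python
import math

def mfrm_category_probs(theta, item_diff, rater_sev, steps):
    """Category probabilities under a many-facet Rasch (rating-scale) model.

    theta     : examinee ability, in logits
    item_diff : item difficulty
    rater_sev : rater severity (higher = harsher rater)
    steps     : step difficulties tau_1..tau_K (tau_0 = 0 is implied)

    Returns a list of probabilities for scores 0..K that sums to 1.
    """
    # Log-numerator for each score category is the running sum of
    # (theta - item_diff - rater_sev - tau_h) over the steps passed.
    log_numers = [0.0]          # score 0 is the baseline category
    running = 0.0
    for tau in steps:
        running += theta - item_diff - rater_sev - tau
        log_numers.append(running)

    # Softmax with max-subtraction for numerical stability.
    m = max(log_numers)
    expd = [math.exp(x - m) for x in log_numers]
    z = sum(expd)
    return [e / z for e in expd]
```

Because rater severity enters the model with a negative sign, a harsher rater lowers every examinee's expected score by the same amount on the logit scale; it is departures from this pattern (rater-by-trait or rater-by-examinee interactions) that the bias analysis in the results flags as significant.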
The results indicated that the items differed significantly in difficulty. Among the four traits, Speech Act was the easiest on which to score high, while Appropriateness and Expressions were the most difficult. The 6 raters showed significant differences in rating severity; Rater A, a foreign teacher, was the most severe. Most raters were found to exhibit bias across both traits and examinees in their ratings, and all significant bias patterns could be grouped into four categories. The study also offers eight implications for rater training and language teaching.
Keywords/Search Tags: rater effects, many-facet Rasch model, WDCT pragmatics test