Relying on the teacher as the sole rater creates well-known problems in language testing, such as heavy teacher workloads and potential rating biases, which not only burden language educators but also distort students' evaluation results. Over the past few decades, adopting alternative assessments in EFL teaching has become an increasingly prominent research focus, yet few studies have compared all three assessment types (self-, peer-, and teacher-assessment) using the Multifaceted Rasch model. The present study makes such a comparative analysis of self-, peer-, and teacher-assessments in EFL writing tests. Rater effects (Myford & Wolfe, 2003) and learner autonomy (Holec, 1981) serve as the theoretical basis for applying alternative assessments in the study.

Eighty second-year non-English-major students and two experienced EFL teachers at Lanzhou University took part in the study. Each student assessed his or her own writing and the writings of three other students according to a fixed evaluation rubric, and the two teachers scored all the students' writings using the same rubric. The resulting self-, peer-, and teacher-assessment scores were collected and analyzed through Multifaceted Rasch Measurement with the computer program FACETS 3.22 (Linacre, 1999), the mainstream software for this method. Finally, scores from the automated essay scoring (AES) tool www.pigai.org were compared with the self-, peer-, and teacher-assessments, using Spearman's rank correlation coefficient ρ to gauge the pairwise consistency of the four score sets.

Three facets enter the Rasch analysis: writers, raters, and criterion items. FACETS places all estimates on the same linear (logit) scale, so the facets can be compared directly. The analysis yields (a) a FACETS map, (b) an ability measure and fit statistics for each writer, (c) a severity estimate and fit statistics for each rater, (d) a bias analysis of rater × writer interactions, and (e) a difficulty estimate for each assessment criterion. The FACETS map visualizes differences within each facet, such as severity/leniency differences among raters and ability differences among writers.

The study addresses five research questions:
1) To what degree do writers' ability estimates, raters' severity estimates, and assessment criteria's difficulty estimates fit the model?
2) How do self-assessors, peer-assessors, and teacher-assessors differ in the writer ability estimates they produce?
3) To what degree do self-assessors, peer-assessors, and teacher-assessors show bias toward writers' abilities, and what forms does this bias take?
4) How do self-, peer-, and teacher-assessments compare in terms of assessment criterion difficulty?
5) To what degree are self-assessors, peer-assessors, teacher-assessors, and the online assessment website www.pigai.org externally consistent with one another?
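Questions 1 to 4 all rest on the way the Multifaceted Rasch model places every facet on one logit scale. As a sketch (the notation below is illustrative rather than taken from the study's materials), the standard three-facet rating-scale form of the model is:

```latex
% Three-facet rating-scale form of the many-facet Rasch model:
% P_{njik} is the probability that writer n receives category k
% (rather than k-1) from rater j on criterion i.
\[
  \ln\!\left(\frac{P_{njik}}{P_{nji(k-1)}}\right) = B_n - C_j - D_i - F_k
\]
```

Here B_n is writer n's ability, C_j is rater j's severity, D_i is criterion i's difficulty, and F_k is the difficulty of scale step k. Because all four parameters share one scale, writer ability, rater severity, and criterion difficulty can be read off a single FACETS map and compared directly.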
All five questions are answered through the FACETS output and Spearman's rank correlation coefficient ρ. The results show that student raters spanned a wider range of severity/leniency than teacher raters. All three rater types revealed bias in the rater × writer interaction analysis, each with its own distinct bias pattern: low-ability students were more prone than high-ability students to rate their own and their peers' writing in a biased way, whereas teacher raters tended to rate low-ability students more leniently than high-ability ones. Regarding assessment criterion difficulty, teacher-assessments showed the widest spread of difficulty estimates, while peer-assessments showed the narrowest; across raters, the criterion "content" was scored most harshly and the criterion "mechanics" most leniently. As for the online assessment tool, its scores were not externally consistent with the self- and teacher-assessments in the study. The last part briefly offers implications and suggestions concerning alternative assessment, rater training, assessment quality improvement, and EFL teaching.
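As a methodological aside, the pairwise external consistency examined in question 5 can be computed in a few lines of Python. The sketch below uses scipy.stats.spearmanr; the four score vectors are hypothetical placeholders, not data from the study.

```python
# Minimal sketch: pairwise Spearman's rho between four rating sources.
# The score lists are hypothetical stand-ins for the study's data.
from itertools import combinations

from scipy.stats import spearmanr

scores = {
    "self":    [78, 85, 62, 90, 71],   # one total score per writer
    "peer":    [75, 88, 65, 86, 70],
    "teacher": [72, 84, 60, 88, 68],
    "pigai":   [80, 82, 70, 85, 74],
}

# spearmanr returns the correlation and its two-sided p-value.
for a, b in combinations(scores, 2):
    rho, p = spearmanr(scores[a], scores[b])
    print(f"{a} vs {b}: rho = {rho:.2f}, p = {p:.3f}")
```

Because Spearman's ρ compares rank orders rather than raw scores, it tolerates the different scales and severities of the four rating sources; two sources are externally consistent to the degree that they rank the writers in the same order.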