
Comparability and linking in direct writing assessment: Benchmarks, discourse mode, and grade level

Posted on: 2002-10-25
Degree: Ph.D.
Type: Dissertation
University: Arizona State University
Candidate: Osborn Popp, Sharon Elizabeth
Full Text: PDF
GTID: 1465390014450231
Subject: Education
Abstract/Summary:
Direct assessments of writing performance are increasingly included in large-scale testing programs despite concerns about their reliability and validity. Assessing student writing across discourse modes and measuring growth across grade levels have likewise generated both interest and concern. The purpose of this study was to examine the effects of (a) different scoring benchmarks on scores for the same papers, (b) discourse mode on scores for papers by the same students, (c) grade level on scores for papers written in a single discourse mode, and (d) grade level on scores for papers written in different discourse modes.

Raters scored writing samples from students in Grades 3, 5, and 8 against a common rubric. Raw ratings were analyzed using multi-facet Rasch models; the raw ratings and the Rasch-estimated student abilities, trait difficulties, and rater leniency-severity parameters were then examined.

Ratings of the same essays differed in magnitude and relative rank when scored against different sets of benchmarks. Ratings of papers written in different discourse modes by the same students shared some features, such as similarly rank-ordered analytic trait difficulties; however, ratings across modes produced substantial inconsistencies in how students were classified against various performance standards. Ratings of student writing in a single mode increased with grade level. Comparisons of writing ability on a common task appear to be possible across grade levels, given benchmarks chosen from the multi-grade set of sample papers, but the validity of comparing ratings of student writing in different modes across grade levels remains questionable.

Results indicate that directly adjusting for discourse mode may be a promising approach to assessing general writing quality across modes, but adjusting for mode may not be sufficient to permit successful linking across grade levels. The benchmark papers used to operationalize the rubric score points also strongly influenced the ratings of students' papers. These results add to growing cautions about the use and interpretation of large-scale writing assessment scores, and they suggest the need for careful research on the nature of benchmark papers and the processes used to select them.
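The abstract does not state the exact parameterization of the multi-facet Rasch analysis, but a standard many-facet Rasch model with facets for student, analytic trait, and rater, in the rating-scale form popularized by Linacre, would look like the following sketch (the symbols and facet structure here are an assumption, not drawn from the dissertation itself):

```latex
% Many-facet Rasch model (rating-scale form) -- an assumed
% parameterization; the dissertation's exact specification is not given.
% P_{nijk}: probability that student n receives category k (rather than
% category k-1) on analytic trait i from rater j.
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = B_n   % ability of student n
  - D_i   % difficulty of analytic trait i
  - C_j   % severity (leniency) of rater j
  - F_k   % threshold between rating categories k-1 and k
```

Under this kind of model, the rater term C_j is what allows leniency-severity differences to be estimated and adjusted for, which is the mechanism behind the study's comparison of raw ratings with Rasch-estimated parameters.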
Keywords/Search Tags: Writing, Grade level, Discourse mode, Papers, Benchmarks, Ratings