
Examining alternative scoring rubrics on a statewide test: The impact of different scoring methods on science and social studies performance assessments

Posted on: 2007-02-14
Degree: Ph.D
Type: Dissertation
University: University of South Carolina
Candidate: Creighton, Susan Dabney
Full Text: PDF
GTID: 1448390005973762
Subject: Education
Abstract/Summary:
There is no consensus regarding the most reliable and valid scoring methods for assessing higher-order thinking skills. Most research on alternative formats has focused on the scoring of writing ability. This study examined the value of different types of performance assessment scoring guides on state-mandated science and social studies tests. A proportional stratified sample of raters was randomly assigned to one of four scoring groups: checklist, analytic rubric, holistic rubric, and generic rubric. A fifth method, the weighted analytic rubric, was included by applying an algorithmic formula to the scores assigned by raters using the analytic rubric. A comparison of the mean scores for the five scoring groups suggests that there may be a difference in the way raters applied the rubric for each group. Although the literature suggests that it is possible to achieve high levels of inter-rater reliability across forms of scoring, phi coefficients of only moderate strength were obtained for three of the four constructed-response items. A comparison of results across scoring groups indicated that item complexity may affect both the level of inter-rater reliability and the selection of the most reliable rubric for each discipline. Analytic rubrics appear to achieve more reliable results with less complex items. A multitrait-multimethod approach was used to investigate the external validity of the social studies and science tasks. As expected, the PACT science constructed-response scores tended to be more strongly associated with the science multiple-choice scores than with the writing ability subtest scores. A similar pattern was seen with social studies items. These results provide some evidence for the validity of the performance assessments.
A post-study survey completed by raters provided qualitative information regarding their thought processes and their primary focus during the scoring process. An analysis of these data suggests that raters using alternative rubrics may have employed different strategies to score student responses.
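For readers unfamiliar with the phi coefficient mentioned in the abstract, the following is a minimal sketch of how it can be computed for two raters who each assign a dichotomous (0/1) score to the same set of responses. It is not the dissertation's procedure: the function name `phi_coefficient`, the 0/1 coding, and the sample scores are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def phi_coefficient(rater_a, rater_b):
    """Return the phi coefficient for two equal-length lists of 0/1 scores.

    Illustrative helper only (an assumption, not taken from the study).
    Returns None when a marginal total is zero and phi is undefined.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same responses")

    pairs = list(zip(rater_a, rater_b))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)  # both award credit
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)  # only rater A awards credit
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)  # only rater B awards credit
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)  # neither awards credit

    # Product of the four marginal totals of the 2x2 agreement table.
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        return None
    return (n11 * n00 - n10 * n01) / denominator


# Hypothetical scores for four responses (not data from the study).
rater_a = [1, 1, 0, 0]
rater_b = [1, 0, 1, 0]
print(phi_coefficient(rater_a, rater_b))
```

Values near 1 indicate that the two raters agree on almost every response, values near 0 indicate no association, and negative values indicate systematic disagreement.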
Keywords/Search Tags:Scoring, Rubric, Social studies, Different, Alternative, Science, Raters, Performance