Rating Differences Of Experienced Raters In Assessment Of English-Chinese Translation Tasks

Posted on: 2007-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Wen
Full Text: PDF
GTID: 2155360185951164
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Performance tests typically require raters to judge the quality of examinees' written or spoken language against a rating scale; scores may therefore be affected by the raters themselves, an important variable in performance assessment. The potential variability in rater judgments has been of particular concern in performance testing. Despite a body of work on subjective rating over the last decade or so, differences among raters in rating translation performance are still not well understood. The purpose of this study is to investigate how experienced raters with similar backgrounds differ in their assessment of two translation tasks (sentence translation and discourse translation, both from English to Chinese) using analytic rating scales.

An analysis of rater differences in translation assessment is of crucial significance to current language testing practice. First, almost no research has been conducted into rating differences in translation assessment, so this study both contributes to knowledge in this area and serves as a starting point for further research. Second, the study provides insights into test score interpretation: it uses translation examiners' verbal reports to examine the rating process, so that how test scores should be interpreted can be better understood. The findings are also of great importance to rater training and to the improvement of the current rating scale.

To achieve this purpose, the following major research question was put forward: how do experienced raters differ in their rating behavior when scoring two translation tasks (E-C sentence translation and E-C discourse translation)?
To address the major research question fully, four sub-questions were investigated:

1. How is the students' English-Chinese sentence translation assessed by the raters in this research?
2. How is the students' English-Chinese discourse translation assessed by the raters?
3. Is there any difference among raters in their qualitative judgments of the same translation?
4. What elements do these raters focus on while marking these two translation tasks, and how do they focus on these elements?

In this study, ten translations were taken from samples of the National English Contest in the Shanxi region of China in 2004, and another ten came from a timed impromptu translation test for English majors. Six trained, experienced raters took part, providing scores for two sets of 10 translations: the first set comprised 10 scripts of E-C sentence translation and the second 10 scripts of E-C discourse translation. Raters were asked to provide think-aloud protocols describing the rating process as they rated each translation, and then provided retrospective written reports. A coding scheme developed to describe the think-aloud data allowed analysis of the elements heeded in assessment, the way in which these elements were focused on, and the interpretations the raters made of the scoring categories in the two analytic rating scales.

Frequency analyses and one-way ANOVA analyses of rating scores indicated that raters in this study differed significantly in their scoring of the following categories: "expression" in E-C sentence translation, and "faithfulness" and "fluency" in E-C discourse translation. Raters' verbal reports and retrospective written reports of the decision-making process were analysed with a view to better understanding raters' differences in scoring and how test scores were arrived at.
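As a rough illustration of the one-way ANOVA mentioned above, the following sketch computes the F statistic for scores grouped by rater. It is not the thesis's actual analysis: the function name `one_way_anova_f`, the variable names, and the scores are all hypothetical, and only three raters with three scripts each are shown to keep the arithmetic easy to follow. It also assumes each rater's scores are treated as an independent group, which is what a plain one-way ANOVA does.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Each inner list holds the scores one rater gave on a single
    scoring category (hypothetical layout, not taken from the thesis).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of rater means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own rater's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("no within-rater variation; F is undefined")

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores: one list per rater, one score per script.
hypothetical_scores = [
    [1, 2, 3],  # rater A
    [2, 3, 4],  # rater B
    [3, 4, 5],  # rater C
]

f_stat, df_b, df_w = one_way_anova_f(hypothetical_scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F means the rater means differ more relative to the spread of scores within each rater. Deciding whether the difference is significant additionally requires comparing F with the F distribution for the two degrees of freedom, which this sketch leaves out.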
It was found that the wide distribution of raters' scores stemmed largely from their different perceptions of the nature of each candidate's performance. Further descriptive analysis showed that the raters differed greatly in their decision-making process in the two translation tasks. In both E-C sentence translation and E-C discourse translation, although all the raters in this study focused on the same elements, the way in which they focused on these elements was clearly different. Specifically, in sentence translation assessment, raters focused on 17 elements: comprehension of the source text, mistranslation, omission, incorrect addition, uncontrolled translation, lexical choice, Chinese convention, fluency in expression, stylistic inappropriateness, task completion, error type, error gravity, error frequency, layout, spelling mistakes, raters' affective response and comparison with other scripts. According to raters' verbal reports, another seven elements were also focused on in discourse translation: obscurity of language, sentence structure, pauses of sentences, coherence of sentences, literary grace, succinctness of language and global communicative effect. Focusing on these elements involves two things: locating problems and deciding on the gravity of the problems located. The way in which the above-mentioned elements were focused on was therefore investigated mainly in terms of these two aspects. The finding was that raters showed great similarity in locating problems in E-C sentence translation but differed greatly in deciding on the gravity of the problems located. In E-C discourse translation, by contrast, raters differed greatly both in locating problems and in deciding on the gravity of the problems located.
The varied nature of the raters' perceptions, with regard to how these elements were judged, suggests that it would be nearly impossible to say how any one translation score had been reached. We present implications of these findings for interpreting test scores, for rater training, and for the role and design of the current rating scale.
Keywords/Search Tags: E-C sentence translation, E-C discourse translation, translation evaluation, think-aloud verbal protocol, inter-rater difference, analytic rating scale