| This study explored how prompts might affect test-takers’ performance in integrated writing and raters’ scoring behavior, in the under-investigated area of integrated writing tasks in the Chinese context. Three dimensions of prompt characteristics were examined: prompt-inherent characteristics, prompt characteristics as perceived by test takers, and prompt characteristics as perceived by raters. For the effect of prompt-inherent characteristics on test-takers’ performance in integrated writing, 1,354 argumentative scripts covering two prompt-inherent characteristics (topic domain and task specification) were analyzed using Coh-Metrix and regression analyses, and validated using the Assessment Use Argument (AUA). Results showed that (1) prompts on different topic domains elicited markedly divergent textual features; (2) prompts with different task specifications led test takers to adopt different modes of argumentation; and (3) this prompt effect was justified as meaningful to the writing construct. For the effect of test-takers’ prompt perceptions on their performance in integrated writing, the perception data from 371 test takers and their writing scores were analyzed using multilevel modeling (MLM) and structural equation modeling (SEM), and justified by the AUA. Results suggested that (1) the integrated writing tasks mainly measured test-takers’ English language proficiency; (2) the prompt characteristics perceived by test takers had a small effect on their writing performance, in which only Prompt Knowledge functioned significantly; and (3) the meaningfulness of score interpretations in the AUA was warranted by the construct measure and the small prompt effect. For the effect of raters’ prompt perceptions on their scoring behavior, the perception data from 30 raters and their scoring data were analyzed with respect to rater variability using many-facet Rasch measurement (MFRM). A weak perception-behavior link was identified: (1) raters’ different prompt perceptions might cause differences in rater severity/leniency; (2) content criterion bias was related to different perceptions of prompt difficulty; (3) different prompt perceptions might translate, by means of expectations, into different rater severity/leniency biases toward test takers; and (4) the consistency of assessment records in the AUA was maintained because the perception-behavior link was weak and nuanced. |