China's Standards of English Language Ability (CSE) is the first full-range language proficiency scale for Chinese English learners. Since its official release in 2018, it has attracted increasing research attention in teaching, learning, and assessment. The CSE can be applied as a self-assessment (SA) tool for learners, but its effective use presupposes scale validity. Validity is key to the quality of a scale and runs through the whole process of scale development and use. Even after a scale has been put into use, validity evidence should continue to be accumulated during its application to support validation and revision. Existing research has rarely examined the validity of the CSE in its practical application; in particular, some CSE scales, such as the CSE SA scales, have not been appropriately validated on a large scale. This study aims to enrich the limited body of validation research on the CSE, with a special focus on the CSE SA scale for listening proficiency.

Theoretically underpinned by Messick's unified validity theory, the study seeks both quantitative and qualitative evidence for the structural, content, and external aspects of the construct validity of the scale in question. Three research questions are addressed: (1) To what extent can the CSE SA scale for listening proficiency distinguish learners at different proficiency levels? (2) To what extent do the can-do statements fit the CSE's unidimensional model of listening proficiency? (3) What is the correlation between participants' SA results and their performance on external international and national EFL proficiency tests?

Methodologically, the quantitative data were collected via SA questionnaires containing all 26 nine-level can-do descriptors of the CSE SA scale for listening proficiency. The questionnaires were distributed mainly to university students with various disciplinary backgrounds across China, and also reached postgraduate and doctoral students. In total, 1,395 students' self-assessments on the scale were collected. The Rasch model was used to analyze the SA data, investigating the item separation and difficulty hierarchy of the scale as well as its unidimensionality. Pearson correlation was employed to explore the relationship between students' SA results and external test scores. The qualitative analysis was based on data from semi-structured interviews in which 10 students explained their understanding of the scale descriptors, further probing the content aspect of validity.

The study found that: (1) The scale was overall valid for discriminating learners at different language proficiency levels across a wide continuum of item difficulty, although this conclusion was somewhat undermined by the possible absence of items targeting higher-ability learners. Moreover, item difficulty increased in line with the intended hierarchy for most CSE levels, with the exception of levels 7, 8, and 9; possible reasons for this disordering include vague description, unclear contextualization, and unfamiliar language tasks in the scale descriptors. (2) Most of the can-do statements fit the unidimensional Rasch model of listening proficiency, with the exception of four misfitting items; the poor fit may be attributable to sampling effects and deficiencies in item description. (3) Students' SA results correlated moderately to strongly with their CET-4, CET-6, IELTS, and TOEFL scores, indicating that the scale and the external measurements tap a related underlying construct; the scale may thus provide some information about students' performance on external objective tests.

By adopting a mixed-methods design, this study enriches, from the learners' perspective, several aspects of the validity evidence for the CSE. The voices of students, the major users of the CSE SA scales, may offer implications for future revision and improvement of the CSE, so that the scale can be used more widely and effectively in teaching, learning, and assessment.
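As a rough illustration of the two quantitative techniques named above, the following Python sketch shows the dichotomous Rasch model's item response probability and a plain Pearson correlation. This is not the study's analysis code (the study presumably used dedicated Rasch software, and its rating data may be polytomous); the function names and sample values here are purely hypothetical.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Dichotomous Rasch model: probability that a learner with ability
    theta (in logits) endorses a can-do item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation between two paired score lists, e.g.
    self-assessment totals vs. external test scores (illustrative)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# When ability equals item difficulty, endorsement probability is 0.5,
# and it rises monotonically as ability exceeds difficulty.
print(rasch_probability(0.0, 0.0))   # 0.5
print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0 for perfectly linear data
```

In this framing, a "disordered" difficulty hierarchy (as reported for CSE levels 7–9) means the estimated b parameters do not increase with the intended level ordering.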