In the global village, English, as a major means of communication, plays a significant role in helping Chinese people tell Chinese stories well and in promoting China's soft power through effective English-language communication. There is therefore a growing need for talents with strong written and spoken communicative competence, and selecting such talents depends largely on English language assessment in the Chinese context of teaching English as a foreign language (EFL). However, the rating accuracy and fairness of language assessment, especially oral assessment, need further exploration because of its strong subjectivity. Design principles and propositions, rating methods, rating standards, candidates, and raters may all introduce rating errors, and these factors can affect the reliability, validity, and fairness of oral English assessment. Among them, raters are one of the most complex factors affecting scores, and in the scoring process they interact with many other facets, such as rating contexts, rating criteria, rating methods, candidates, and testing tasks.

Previous studies at home and abroad have mainly examined the interactions between raters and candidates, testing tasks, rating time, and rating criteria, as well as raters' own backgrounds (e.g., gender, language background, teaching experience, and rating experience), while studies of raters' language backgrounds and rating contexts remain scarce. In addition, most previous studies concern writing assessment, with relatively few on oral assessment, and many researchers concentrate on large-scale standardized tests while neglecting regular teaching tests, especially in classroom-based contexts. To fill this research
gap, this study examined the effects of raters' language backgrounds and rating contexts on the reliability and validity of Chinese university students' oral English assessment, drawing on generalizability theory (G-theory). The study selected 30 audio samples of spoken English from a university in Tianjin and invited 10 university English teachers to voluntarily rate the samples twice in two different rating contexts. Half of the 10 raters were native speakers from the United States or Canada; the others were Chinese. By analyzing the scores produced by these two cohorts of raters in the two contexts, the author explored the impact of rating contexts and language backgrounds on the validity and reliability of Chinese university students' oral English assessment.

The study found, first, that rating contexts had no significant impact on the reliability and validity of the scores: rating the same samples in different contexts did not change the results, although the scores in both contexts showed low validity and reliability. Language background, by contrast, had a significant impact. The internal consistency of native-speaker raters was higher than that of Chinese raters; specifically, the convergent and discriminant validity of the native (American or Canadian) raters' scores were higher than those of the Chinese English teachers in both rating contexts.

These results suggest that Chinese university EFL teachers should improve both their English language ability and their assessment literacy so as to ensure oral English assessment of high validity and reliability. Moreover, universities and speaking-contest organizers would do well to recruit native English teachers as raters to improve scoring accuracy. If native
English teachers are in short supply, Chinese EFL teachers should undertake professional development programs or activities, which are strongly suggested to include knowledge and skills about assessment in general and language assessment in particular.
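The G-theory analysis underlying the study can be illustrated with a minimal sketch: for a fully crossed persons × raters design, variance components are estimated from two-way ANOVA mean squares, and a relative G (generalizability) coefficient is computed for scores averaged over the raters. This is a generic illustration, not the study's actual procedure or data; the function name and the simulated numbers below are assumptions for demonstration only.

```python
import numpy as np

def g_study(scores):
    """Variance components for a crossed persons x raters (p x r) design.

    scores: 2-D array, rows = persons (examinees), columns = raters.
    Returns (var_p, var_r, var_pr_e, g_coef), where g_coef is the
    relative G coefficient for scores averaged over all raters.
    """
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # Two-way ANOVA sums of squares (no replication)
    ss_p = n_r * ((person_means - grand) ** 2).sum()
    ss_r = n_p * ((rater_means - grand) ** 2).sum()
    ss_pr = ((scores - grand) ** 2).sum() - ss_p - ss_r

    # Mean squares
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square solutions (negative estimates clipped to 0)
    var_pr_e = ms_pr                          # interaction + error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)    # person (true-score) variance
    var_r = max((ms_r - ms_pr) / n_p, 0.0)    # rater severity variance

    # Relative G coefficient for the mean over n_r raters
    g_coef = var_p / (var_p + var_pr_e / n_r)
    return var_p, var_r, var_pr_e, g_coef

# Synthetic data standing in for the study's design: 30 samples, 10 raters.
rng = np.random.default_rng(0)
scores = (rng.normal(0.0, 1.0, (30, 1))      # person ability
          + rng.normal(0.0, 0.3, (1, 10))    # rater severity
          + rng.normal(0.0, 0.5, (30, 10)))  # interaction / error
var_p, var_r, var_pr_e, g = g_study(scores)
```

A large rater variance component relative to person variance, or a low G coefficient, would signal the kind of low reliability the study reports; comparing components across rater cohorts (native vs. Chinese) parallels its between-group comparison.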