An Analysis Of The Reliability And Validity Of Rating Procedures In Test Of English Proficiency At Level A

Posted on: 2018-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: K Dong
Full Text: PDF
GTID: 2335330566457806
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Performance assessment refers to observing and evaluating examinees' performance as they carry out tasks in a real language testing situation. It has recently gained popularity in second language testing because it emphasizes how examinees actually exercise particular language abilities. As one form of performance assessment, the English oral proficiency test is intended to assess examinees' oral proficiency levels. In a real test situation, however, oral proficiency scores may be affected by the ability of the examinee, the severity and consistency of the rater, the difficulty of the tasks and rating criteria, and biased interactions among these variables. Further investigation is therefore needed to examine the reliability and validity of the rating procedure. Currently, the most common way to do so is the many-facet Rasch model. Most existing studies, however, focus on the accuracy and effectiveness of English writing tests; very few address the reliability and validity of the rating procedure in English oral proficiency tests.

Against this background, the present study uses the Facets program to identify the facets that may affect the accuracy of raw scores, with the aim of ensuring that the whole rating process of the English oral proficiency test is reliable. It takes TEP Oral at Level A, held at Beijing International Studies University, as an example. Altogether, 382 examinees and 36 raters were involved, and both holistic and analytic rating scales were applied during the test. After the test, SPSS was used for descriptive statistical analysis, and the Facets software was then used to investigate the impact of each facet on the test scores of TEP Oral at Level A.

The analysis addresses five questions. First, can the holistic and analytic rating scales each distinguish different levels of examinee ability? Second, do raters differ in severity or leniency when using the holistic and analytic rating scales, and if so, to what extent? Third, are raters consistent in using the rating scales? Is there a significant difference in reliability between the holistic and analytic rating scales, and if so, which is more reliable? Fourth, do raters agree on the rankings of examinees under both rating scales? Finally, is each score category, holistic or analytic, used reasonably, and if not, is it overused or misused?

The findings indicate that both the holistic and the analytic rating scale can distinguish different levels of examinee ability. Raters differ significantly in severity on both rating scales, and comparison shows that they tend to be more lenient on the analytic rating scale. Moreover, raters are internally consistent across the two rating scales. As for the test scores, raters agree on the rankings of examinees. Finally, the score categories in both rating scales are used reasonably. These findings support the reliability and validity of the rating procedure of TEP Oral at Level A, and the results can serve as useful indicators for English oral proficiency testing as well as for oral English teaching.
Keywords/Search Tags: performance assessment, reliability, validity, raters