Rater's Reliability Analysis Of CEPT Speaking Test By Using Multi-facet Model Of GT

Posted on:2011-03-19

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Su

Full Text:PDF

GTID:2155360308468779

Subject:Foreign Linguistics and Applied Linguistics

Abstract/Summary:

As a testing method that reflects the test taker's true communicative ability, the speaking test has advantages that other kinds of tests cannot match. Any scientifically sound language test therefore has to include a speaking component. Because of the characteristics of speaking tests and the subjectivity of their rating, however, rating them faces many problems and challenges. Many facets introduce measurement error and thereby lower reliability.

Generalizability Theory (GT), as a modern testing theory, provides a new approach to the reliability of speaking tests. First, GT identifies, within the universe of observations, the object of measurement, the facets of measurement, and their interactions. On this basis, the researcher can choose a fully crossed design, a nested design, or a mixed design. Second, the researcher processes the experimental data and, using the chosen design and ANOVA, estimates how much each variance component contributes to the total variance. This procedure is the G-study. In the D-study stage, the researcher can serve a specific decision purpose by adjusting the measurement facets, the sample sizes, and the measurement design.

This paper uses GT to study rater reliability in the College English Placement Test (CEPT) at Hunan University. The analysis leads to the following results:

1. The overall reliability of the raters is very high, but the raters clearly differ in severity. Rater 3 is the least severe. Raters 5 and 6 rated with high consistency. In addition, the difficulty of the speaking test as a whole is acceptable, and the discrimination index is good.
2. By analysing the item facet under different conditions in the D-study, this study estimates that the G coefficient rises to a higher level once the item facet reaches six.
3. In the D-study, this study estimated the reliability index for one to ten raters, which shows how many raters are suitable for a given number of items.
4. By adjusting and re-estimating the different measurement facets, this study arrives at an optimal design. For instance, with six items and four raters, the test reaches a high level of reliability compared with the default design.

This research is significant in that it is an original study of the CEPT speaking test and the reliability of its raters. It also provides empirical evidence for the further development and improvement of the test. The results point out existing problems in the speaking test and offer some suggestions. They also show that consistency differs among raters, which will help in selecting a qualified rater. In addition, the use of GT to analyse and check reliability can shed light on further research into the reliability of oral tests.

Finally, this paper discusses the limitations of the research and directions for further research in this field.
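The G-study and D-study steps described in the abstract can be illustrated with a minimal sketch. It assumes a fully crossed person x item x rater design with one score per cell and all effects treated as random; the abstract does not say which design or software the thesis actually used. The function names `g_study` and `g_coefficient`, the clipping of negative estimates to zero, and the variance components in the demo are hypothetical and are not taken from the thesis.

```python
from itertools import product


def g_study(x):
    """Estimate variance components for a fully crossed p x i x r design.

    x[p][i][r] is the score of person p on item i from rater r.
    Assumes one observation per cell and at least two levels per facet.
    """
    n_p, n_i, n_r = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(n_p), range(n_i), range(n_r)))
    grand = sum(x[p][i][r] for p, i, r in cells) / len(cells)

    def means(keep):
        # Marginal means keyed by the kept facet indices (0=p, 1=i, 2=r).
        sums, counts = {}, {}
        for c in cells:
            k = tuple(c[d] for d in keep)
            sums[k] = sums.get(k, 0.0) + x[c[0]][c[1]][c[2]]
            counts[k] = counts.get(k, 0) + 1
        return {k: sums[k] / counts[k] for k in sums}

    m_p, m_i, m_r = means((0,)), means((1,)), means((2,))
    m_pi, m_pr, m_ir = means((0, 1)), means((0, 2)), means((1, 2))

    # ANOVA sums of squares for main effects and two-way interactions.
    ss = {
        "p": n_i * n_r * sum((m - grand) ** 2 for m in m_p.values()),
        "i": n_p * n_r * sum((m - grand) ** 2 for m in m_i.values()),
        "r": n_p * n_i * sum((m - grand) ** 2 for m in m_r.values()),
        "pi": n_r * sum((m_pi[p, i] - m_p[(p,)] - m_i[(i,)] + grand) ** 2
                        for p, i in m_pi),
        "pr": n_i * sum((m_pr[p, r] - m_p[(p,)] - m_r[(r,)] + grand) ** 2
                        for p, r in m_pr),
        "ir": n_p * sum((m_ir[i, r] - m_i[(i,)] - m_r[(r,)] + grand) ** 2
                        for i, r in m_ir),
    }
    total = sum((x[p][i][r] - grand) ** 2 for p, i, r in cells)
    ss["pir"] = total - sum(ss.values())  # residual (p x i x r confounded with error)

    df = {"p": n_p - 1, "i": n_i - 1, "r": n_r - 1}
    df.update(pi=df["p"] * df["i"], pr=df["p"] * df["r"], ir=df["i"] * df["r"])
    df["pir"] = df["p"] * df["i"] * df["r"]
    ms = {k: ss[k] / df[k] for k in ss}

    # Solve the expected-mean-square equations for the variance components.
    raw = {
        "pir": ms["pir"],
        "pi": (ms["pi"] - ms["pir"]) / n_r,
        "pr": (ms["pr"] - ms["pir"]) / n_i,
        "ir": (ms["ir"] - ms["pir"]) / n_p,
        "p": (ms["p"] - ms["pi"] - ms["pr"] + ms["pir"]) / (n_i * n_r),
        "i": (ms["i"] - ms["pi"] - ms["ir"] + ms["pir"]) / (n_p * n_r),
        "r": (ms["r"] - ms["pr"] - ms["ir"] + ms["pir"]) / (n_p * n_i),
    }
    # Assumption: negative estimates are set to zero, a common convention.
    return {k: max(v, 0.0) for k, v in raw.items()}


def g_coefficient(var, n_items, n_raters):
    """D-study: relative G coefficient for a design with n_items and n_raters."""
    rel_error = (var["pi"] / n_items + var["pr"] / n_raters
                 + var["pir"] / (n_items * n_raters))
    return var["p"] / (var["p"] + rel_error)


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only.
    demo = {"p": 1.0, "pi": 0.5, "pr": 0.25, "pir": 1.0}
    for n_raters in range(1, 11):
        print(n_raters, round(g_coefficient(demo, 6, n_raters), 3))
```

Running the script prints the G coefficient for one to ten raters with six items, which mirrors the kind of comparison the D-study results describe. Real data would be passed to `g_study` first, and its output would replace the hypothetical `demo` values.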

Keywords/Search Tags:

CEPT speaking test, rater's reliability, Generalizability Theory

Related items

1	The Evaluation And Research Of Rater Reliability With LONGFORD Method
2	A Study Of Rater Reliability In Achievement Test
3	Dependability Investigation On Triple-scoring Mode In Computer-Based CEPT Oral Speaking Test
4	A Study On Rater Bias Patterns In Rating CEPT Writing
5	An Empirical Study On Reliability Of CET-SET From The Perspective Of Multivariate Generalizability Theory
6	Assessment of the human factors analysis and classification system (hfacs): Intra-rater and inter-rater reliability
7	The development of the English Speaking Test: An investigation of reliability and validity
8	The Impact Of Candidate's Oral Performance On Rater's Scoring
9	An Investigation Into Rating Scales And Error Control Of The Oral Test Of China Public English Test System(PETS)
10	An exploration of test taker, rater, and item facets of the writing section of TOEFL using many-facet Rasch measurement