The Application Of The Classical Item Analysis In Quality Evaluation Of The Tests For English Majors (Grade Four)

Posted on: 2004-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Mei
Full Text: PDF
GTID: 2155360092495295
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Since scientific testing approaches were introduced to China, language testing experts and teachers have done much research on reliability and validity in validation studies. They have discussed the concepts of test reliability and validity and recognized them as complementary testing criteria. However, item analysis is often ignored when the quality of a test is studied. This study therefore aims to show how item analysis is applied to the quality evaluation of tests, to help teachers understand this analysis better, and to give them some guidance on item evaluation and construction.

There are two main techniques in item analysis: classical item analysis and item response theory. This study adopted the former. Facility value (F.V.) and discrimination index (D.I.) are the important measures in classical item analysis. F.V. measures the difficulty level of an item and is given by the percentage of students who answered it correctly. The higher the F.V., the easier the item; the lower the F.V., the more difficult the item. Very easy or very difficult items do little to discriminate among students' proficiency levels and therefore lower the quality of a test. D.I. tells how well an item separates the better students from the poorer students.

TEM-4 is the only national test designed to evaluate English teaching and learning at the end of the foundation stage. Its results affect the next stage of teaching, so the quality of TEM-4 is a major concern for test users. Although TEM-4 is constructed and reviewed by testing experts and teachers, it is unknown whether there are defects in item construction. Three TEM-4 tests, namely TEM-4 1998, TEM-4 2000 and TEM-4 2001, were analyzed by quantitative and qualitative measures in this study. A class of 39 sophomores was randomly selected from the College of Foreign Languages at Qufu Normal University. The statistical data were obtained after the tests were administered. Twenty teachers from this college were surveyed by questionnaire, and the qualitative evaluation of the tests was based on their feedback.

The findings showed that the quality of most TEM-4 test items was satisfactory. Information from the questionnaires also indicated that TEM-4 was a good indicator of students' performance and had a positive effect on teaching. However, data analysis revealed that some items performed unsatisfactorily. For instance, certain items involve simply matching a string of words in the question with the same string in the text, and in others all the choices are acceptable.

Suggestions on multiple-choice item construction were made on the basis of the item analysis of the three tests. First, each item must have exactly one correct option. This requires test writers to take items to colleagues or native speakers for moderation. Second, items in which the correct option can be selected without thinking or inferring, such as general knowledge questions and matching questions, should be avoided. Third, more items with practical use should be designed to measure students' communicative ability. Although it is not easy to test communicative proficiency with multiple-choice items, their usefulness can be improved by providing context and selecting authentic materials. It is hoped that these suggestions shed some light on testing practice.

Facility value in classical item analysis varies with test takers' proficiency. That is, a test that is easy for one sample of students may be difficult for another. This study used a sample of only 39 students, so its evaluation of the sample tests is not conclusive. Item response theory, which takes students' proficiency into consideration, needs further study.
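As an illustration of the two measures described in the abstract, the following is a minimal Python sketch, not the procedure used in the thesis. F.V. follows the definition given above (the proportion of students answering an item correctly). The abstract does not state how D.I. was computed, so the sketch assumes the common upper-lower group method: the difference between the number of correct answers in the top-scoring and bottom-scoring groups, divided by the group size. The function name `item_analysis`, the parameter `group_fraction`, and the sample data are hypothetical.

```python
from typing import List, Tuple


def item_analysis(
    responses: List[List[int]], group_fraction: float = 0.27
) -> List[Tuple[float, float]]:
    """Return (F.V., D.I.) for each item.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    group_fraction is the share of students placed in each of the upper
    and lower groups (0.27 is a conventional choice, assumed here).
    """
    n_students = len(responses)
    n_items = len(responses[0])

    # Rank students by total score, best first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(n_students * group_fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]

    results = []
    for i in range(n_items):
        # F.V.: proportion of all students who answered item i correctly.
        fv = sum(row[i] for row in responses) / n_students
        # D.I. (assumed upper-lower method): difference in correct answers
        # between the two groups, divided by the group size.
        upper_correct = sum(row[i] for row in upper)
        lower_correct = sum(row[i] for row in lower)
        di = (upper_correct - lower_correct) / group_size
        results.append((fv, di))
    return results


# Hypothetical data: 6 students, 3 items.
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]
stats = item_analysis(sample, group_fraction=0.5)
print([(round(fv, 2), round(di, 2)) for fv, di in stats])
# prints [(0.83, 0.33), (0.33, 0.67), (0.33, 0.67)]
```

In this made-up sample the first item is easy (F.V. 0.83) and discriminates weakly (D.I. 0.33), while the other two are harder and separate the upper and lower groups more clearly. Because both values depend on who took the test, a different sample of students could yield different figures, which is the limitation the abstract notes.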
Keywords/Search Tags: Application