Item bias undermines the validity and fairness of language tests. Avoiding it is especially important in China's large-scale, high-stakes foreign language tests, which are typically taken by large numbers of test-takers with diverse personal characteristics. Test specialists have conducted a great deal of research on avoiding item bias against test-takers with different characteristics, but none of the previous studies specifically addressed ambiguity tolerance and proofreading test performance. The present study therefore investigates the relationship between Chinese students' ambiguity tolerance and their performance on proofreading tests. The study is significant in that it draws test constructors' attention to item bias, provides new insights into its sources, and offers suggestions to help students develop an appropriate level of ambiguity tolerance.

The main objective of the study is to provide test constructors with information that is as detailed as possible about the sources of item bias, so that they can avoid bias in future forms of the test. To reach this objective, the following research questions were posed:

(1) Do test-takers with different levels of ambiguity tolerance perform significantly differently on the proofreading test?
(2) What cognitive strategies are used by test-takers with different levels of ambiguity tolerance?
(3) What characteristics of the proofreading test are likely to cause bias against test-takers with different levels of ambiguity tolerance?

Two groups of participants were involved in this study. The first group consisted of 30 non-English-major freshmen, selected because they had obtained the same score on the National Matriculation English Test (NMET), so their English proficiency could be regarded as being at roughly the same level.
The second group comprised 3 teachers who served as expert judges. They were selected on the following criteria: they had to have relevant knowledge of ambiguity tolerance and item bias, teach an EFL class, and be familiar with the characteristics of proofreading tests.

The present study involved four methods: ANOVA, t-tests, written protocols, and expert judgment. First, ANOVA and t-tests were conducted to identify items exhibiting differential item functioning (DIF). Second, written protocols were collected from the test-takers while they were taking the proofreading test, in order to uncover the cognitive strategies used by test-takers with different levels of ambiguity tolerance. Finally, expert judgment was used to decide whether each DIF item was in fact biased.

Through data collection and analysis, the three research questions were answered. Among the 20 items in the proofreading test, the statistical methods flagged 5 items as functioning against low ambiguity tolerance (LAT) test-takers and 3 against high ambiguity tolerance (HAT) test-takers, and some trends could be detected among these items. It was also found that LAT and HAT students used different cognitive strategies on the proofreading test, and that expert judgment, drawing on the judges' expertise, could help determine whether DIF items were biased. However, because these results came from a rather small-scale experiment, the conclusions cannot be overgeneralized. Further studies are needed to confirm the findings so that item bias can be avoided in future tests.
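The statistical screening step described above can be illustrated with a minimal sketch. It assumes scored 0/1 item responses and runs an independent-samples t-test per item to compare the LAT and HAT groups, flagging items whose group difference is significant at the 0.05 level. The simulated data, group sizes, and injected DIF on the first item are illustrative assumptions, not data from the study, which also applied ANOVA and expert judgment before calling an item biased.

```python
# Hypothetical per-item DIF screening: for each of 20 items, compare
# the scored responses (1 = correct, 0 = incorrect) of the LAT and
# HAT groups with an independent-samples t-test.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
n_items, n_per_group = 20, 15        # 30 test-takers split into two groups

# Simulated responses: both groups answer ~70% of items correctly,
# except item 0, where the LAT group is made markedly weaker.
lat = (rng.random((n_per_group, n_items)) < 0.7).astype(int)
hat = (rng.random((n_per_group, n_items)) < 0.7).astype(int)
lat[:, 0] = (rng.random(n_per_group) < 0.1).astype(int)  # injected DIF

# Flag items whose between-group difference is significant at p < .05.
flagged = [i for i in range(n_items)
           if ttest_ind(lat[:, i], hat[:, i]).pvalue < 0.05]
print("items flagged for possible DIF:", flagged)
```

A t-test on dichotomous scores is only a rough proportion comparison; in practice a flagged item would go on to the written-protocol and expert-judgment stages before being judged biased, exactly as the study's procedure describes.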