Assessment is an integral part of learning, as it is used to gather information about a test-taker. Those in academia, such as educational policy makers, instructors, and administrators, can use information gathered from tests to inform instructional and learning decisions (Baker, 2006; Drianna, 2007; Kasper & Ross, 2013; Linn & Miller, 2005; Nitko, 2004). This study is concerned with advancing non-native language proficiency and its assessment. Accordingly, it aims to raise awareness of the importance of assessment instruments, with particular emphasis on non-native language pragmatic competence.

This study aims to further develop and interpret an English Pragmatic Competence Test (EPCT), which is used to assess the pragmatic competence of Chinese English as a Foreign Language (EFL) learners. To achieve this, the researcher uses Item Response Theory (IRT) as a statistical framework to determine whether differential item functioning (DIF) exists between male and female test-takers. First, IRT procedures are used to visually present the discrimination, difficulty, and guessing parameters. DIF analyses are then conducted to identify potential DIF items, and subsequent SIBTEST analyses are used to confirm DIF levels. The study is divided into a Pilot Study and a Final Study: the Pilot Study concerns the older version of the EPCT, while the Final Study concerns the new version.

Using the two versions of the EPCT, DIF analyses are conducted to detect gender DIF, with the intention of identifying items that may need to be revised or removed in order to produce an equitable assessment instrument. The population for this study consisted of approximately 4,000 students at Chinese colleges. Validity and reliability analyses are also presented for the newer version of the EPCT.

Gender DIF items are present in both the Pilot Study and the Final Study. The Pilot Study contained four DIF items identified as B-level or C-level DIF, while the Final Study contained eight. All A-level DIF items were considered negligible, following Roussos and Stout's (1996) suggestions on levels of potential bias.

Although the instrument used in the Final Study has good reliability, the results indicate that the identified DIF items should be deleted in order to create a fairer and more equitable test. By doing so, and by repeating the methodology presented in this study, one can create a new EPCT that better assesses what it aims to assess: English pragmatic competence.

Lastly, this study has implications for educational policy makers, especially those in China, where English proficiency tests are mandatory (He, 2010; Huang et al., 2013; Niu, 2007; Song, Cheng, & Klinger, 2015). Teachers, administrators, test developers, and even students can benefit from the results presented in this study. In conclusion, pragmatic competence instruction is identified as an important part of educational programming and English language education (Hoffman-Hicks, 1992; Jianda, 2007), yet countries such as China demonstrate that there is still a gap between policy creation, implementation, and assessment. This study attempts to address this gap through the improvement of the EPCT by means of multiple empirical analyses.
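The item parameters and DIF levels referred to above can be illustrated with a minimal sketch. The three-parameter logistic (3PL) item response function below is a standard IRT formulation covering discrimination, difficulty, and guessing (the abstract does not specify which IRT model the EPCT analyses use), and the A/B/C cutoffs are the commonly cited Roussos and Stout (1996) SIBTEST effect-size guidelines; the function names and exact thresholds are illustrative assumptions, not part of the study itself.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability that a test-taker with ability theta answers an item
    correctly under the 3PL model: a = discrimination, b = difficulty,
    c = guessing (lower asymptote)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def classify_dif_level(beta_uni):
    """Map a SIBTEST beta-uni effect size to A/B/C DIF levels using the
    Roussos and Stout (1996) cutoffs: |beta| < .059 negligible (A),
    .059 <= |beta| < .088 moderate (B), |beta| >= .088 large (C)."""
    size = abs(beta_uni)
    if size < 0.059:
        return "A"  # negligible DIF; retained in both studies
    if size < 0.088:
        return "B"  # moderate DIF; flagged for revision or removal
    return "C"      # large DIF; flagged for revision or removal
```

For example, an item with c = 0.2 gives a test-taker at theta = b a success probability of 0.6 rather than 0.5, which is why the guessing parameter must be modeled before DIF effect sizes are interpreted.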