
Short Test Length in DIF Detection Using the SIBTEST, IRT-LR and DFIT Procedures

Posted on: 2017-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: C Tang
GTID: 2335330485477876
Subject: Basic Psychology
Abstract/Summary:
Differential Item Functioning (DIF) occurs when examinees from different groups show differing probabilities of success on an item after they have been matched on the underlying ability or construct that the item is intended to measure. This study primarily aimed to examine the power and Type I error rates of DIF analysis using the SIBTEST, IRT-LR and DFIT procedures, with data simulated under the Graded Response Model (GRM). Specifically, the study explored how test length affects the Type I error rates of the SIBTEST, IRT-LR and DFIT procedures when other influencing factors, such as DIF patterns, were manipulated.

In the last three decades, important progress has been made toward more efficient statistical techniques for detecting DIF in educational assessments with more than 60 items. However, findings are scant when it comes to detecting DIF in psychological assessments with short test lengths. Three levels of test length were used (10, 20 and 30 items). The choices for test length were not entirely arbitrary. This dissertation defined 1000 examinees in each group as a relatively large sample size.

The main conclusions can be summarized as follows:

(1) With a short test length of ten items, the three procedures resulted in a high Type I error rate and high power. Uniform DIF for all items resulted in high power and a very low Type I error rate for the SIBTEST, IRT-LR and DFIT procedures, but nonuniform DIF for all items resulted in low power for SIBTEST.

(2) With a short test length of twenty items, DFIT had the best performance. Uniform DIF for all items resulted in high power and a very low Type I error rate for the SIBTEST, IRT-LR and DFIT procedures, but nonuniform DIF for all items resulted in low power for SIBTEST.

(3) With a short test length of thirty items, uniform DIF for all items resulted in high power and a very low Type I error rate for the SIBTEST, IRT-LR and DFIT procedures, but nonuniform DIF for all items resulted in low power for SIBTEST.

(4) The results also indicated that when the test is relatively long, both the Type I error rate and power are lower. With ten items, the Type I error rate is high. With twenty items, the Type I error rate is lower, and it is better to choose IRT-LR or DFIT. With thirty items, the Type I error rate is lowest, and IRT-LR has the best performance.
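The simulation design described above (polytomous responses generated under the GRM for two groups of 1000 examinees, with DIF introduced for the second group) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the item parameters, the four response categories, the standard normal ability distribution, the threshold shift of 0.5, the choice to place uniform DIF on a single item, and all function names. The DIF detection procedures themselves (SIBTEST, IRT-LR, DFIT) are not implemented; `rejection_rate` only shows how Type I error (flags on DIF-free items) and power (flags on DIF items) would be tallied across replications.

```python
import math
import random


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under Samejima's graded response model.

    `thresholds` must be in ascending order and `a` must be positive.
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1, padded with 1 and 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # P(X = k) is the difference between adjacent cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def simulate_group(n_examinees, items, rng, shift=0.0):
    """Simulate graded responses for one group.

    `shift` is added to every threshold of the first item only, which makes
    that item uniformly harder for this group (hypothetical uniform DIF).
    """
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)  # assumed N(0, 1) ability distribution
        row = []
        for j, (a, thresholds) in enumerate(items):
            bs = [b + shift for b in thresholds] if j == 0 else thresholds
            probs = grm_category_probs(theta, a, bs)
            row.append(rng.choices(range(len(probs)), weights=probs)[0])
        data.append(row)
    return data


def rejection_rate(flags):
    """Proportion of flagged tests: Type I error if the flagged items are
    DIF-free, power if they truly contain DIF."""
    return sum(flags) / len(flags)


# Hypothetical 10-item test: discrimination 1.5, four ordered categories.
items = [(1.5, [-1.0, 0.0, 1.0]) for _ in range(10)]
rng = random.Random(0)
reference = simulate_group(1000, items, rng)          # no DIF
focal = simulate_group(1000, items, rng, shift=0.5)   # uniform DIF on item 1
```

Nonuniform DIF could be sketched the same way by changing the discrimination `a` for the focal group instead of shifting the thresholds. Changing the number of entries in `items` to 10, 20 or 30 corresponds to the three test lengths compared in the study.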
Keywords/Search Tags: Short test length, DIF, SIBTEST procedure, IRT-LR procedure, DFIT procedure