Goodness-of-fit statistics for compensatory multidimensional item response models using total scores

Posted on: 2004-07-21
Degree: Ph.D
Type: Dissertation
University: University of Pittsburgh
Candidate: Zhang, Bo
Full Text: PDF
GTID: 1465390011961487
Subject: Statistics
Abstract/Summary:
The purpose of this study was to extend goodness-of-fit statistics that condition on total scores to multidimensional item response theory (MIRT) models. A Monte Carlo study was performed to investigate the statistical properties of the Pearson chi-squared goodness-of-fit statistic and the likelihood ratio fit statistic.

The research was divided into three phases. The first phase aimed to identify the sampling distribution of the fit statistics. The second phase studied the recovery of Type I error rates for different test configurations, and the third phase investigated the statistical power of the Pearson chi-squared fit statistic under different conditions of model misfit.

The sampling distribution of the modified Pearson chi-squared goodness-of-fit statistic followed a theoretical chi-squared distribution. However, the degrees of freedom differed from those identified for unidimensional IRT models (df = K − m, where K is the number of total score categories and m is the number of item parameters estimated in the model). For 1P and 2P MIRT models, one extra degree of freedom beyond K − m was indicated. For 3P MIRT models, two extra degrees of freedom beyond K − m were indicated. In addition, the sampling distribution of the likelihood ratio based fit statistic was unstable for 1P and 2P MIRT models. For 3P MIRT models, its performance was similar to that of the Pearson chi-squared fit statistic.

Based on the revised null distribution (df = K − m + 1 for 1P and 2P MIRT models, and df = K − m + 2 for 3P MIRT models), nominal Type I error rates were observed for all three MIRT models. The recovery of Type I error rates was not affected by test length, number of examinees, correlation between the two ability traits, or the item structures investigated in the study.

With regard to power, the procedure detected misfit associated with differences in the difficulty parameter 100% of the time. It also had moderate power in detecting departures in the MMISC parameter. Under model misfit conditions, the procedure exhibited adequate power in detecting misfit when 3P MIRT data were scaled by a 1P MIRT model. When 2P data were evaluated by a 1P model, power differed between tests with simple structure and tests with complex structure. However, the procedure lacked power in detecting misfit when 3P data were fit by a 2P model. Finally, lack of power was also observed for all conditions in which multidimensional data were scaled by unidimensional IRT models.
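
The revised degrees of freedom reported in the abstract can be written as a small helper. This is only an illustrative sketch in Python: the function name `revised_df`, its argument names, and the model labels "1P", "2P", and "3P" are assumptions made here and do not come from the dissertation. The arithmetic is taken directly from the abstract (K − m + 1 for 1P and 2P MIRT models, K − m + 2 for 3P MIRT models).

```python
# Hypothetical helper: names and labels are illustrative, not the author's code.
# Extra degrees of freedom beyond K - m, as reported in the abstract.
EXTRA_DF = {"1P": 1, "2P": 1, "3P": 2}


def revised_df(num_score_categories: int, num_item_params: int, model: str) -> int:
    """Return the revised chi-squared degrees of freedom for a MIRT model.

    num_score_categories -- K, the number of total score categories
    num_item_params      -- m, the number of item parameters estimated
    model                -- "1P", "2P", or "3P" (assumed labels)
    """
    if model not in EXTRA_DF:
        raise ValueError(f"unknown model label: {model!r}")
    df = num_score_categories - num_item_params + EXTRA_DF[model]
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return df
```

For example, with the purely illustrative values K = 10 and m = 3, a 3P MIRT model would be referred to a chi-squared distribution with 10 − 3 + 2 = 9 degrees of freedom, compared with 10 − 3 = 7 under the unidimensional rule.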
Keywords/Search Tags:Models, Fit statistic, Multidimensional, Item, Goodness-of-fit, Total, 3P MIRT, Power