Comparison of bootstrap standard errors of equating using IRT and equipercentile methods with polytomously-scored items under the common-item nonequivalent-groups design

Posted on: 2008-12-02
Degree: Ph.D.
Type: Dissertation
University: The University of Iowa
Candidate: Cho, YoungWoo
Full Text: PDF
GTID: 1445390005466876
Subject: Education
Abstract/Summary:
The need for equating scores from tests with polytomously-scored items has been increasing because of some perceived limitations of using only multiple-choice items. The purpose of this study was to explore and compare bootstrap standard errors of equating using equipercentile and IRT equating methods for polytomously-scored items under the common-item nonequivalent-groups design.

For this purpose, IRT and frequency estimation equipercentile equating methods were selected. The IRT equating methods varied by calibration method (separate vs. concurrent calibration) and equating method (true-score vs. observed-score equating). The equipercentile methods varied by smoothing method (unsmoothed vs. postsmoothed). This study used a real data set from a writing assessment, and the original data were modified to create five-category data and three-category data. The standard errors of equating (SEEs) and the mean standard error of equating (MSEE) for the six equating methods were computed using a bootstrap method with two different sample sizes (n=1,680 and 1,000) for the five-category data and four different sample sizes (n=1,680, 1,000, 500 and 250) for the three-category data.

The results of this study showed that, in general, the concurrent calibration method produced smaller MSEEs and SEEs than the separate calibration method, and the observed-score equating yielded smaller MSEEs and SEEs than the true-score equating. However, when the sample size was very small, such as 250, concurrent calibration and true-score equating produced smaller MSEEs and SEEs than separate calibration and observed-score equating. In addition, the equipercentile equating method produced smaller MSEEs and SEEs than the IRT equating method, and the smoothed equipercentile equating produced a smaller MSEE than the unsmoothed equipercentile method. Furthermore, for the three-category data, in terms of the increase of the MSEE based on the original sample size (IMO), the MSEEs for the equipercentile equating methods increased least when sample size decreased. However, in terms of the increase of the MSEE based on the adjacent sample size (IMA), the MSEEs for the IRT equating method using separate calibration increased least when sample size decreased.
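
As a rough illustration of how a bootstrap SEE and MSEE of the kind described above can be computed, here is a minimal Python sketch. It is not the dissertation's procedure. The function names (`linear_equate`, `bootstrap_see`) are hypothetical, simple linear equating stands in for the frequency estimation and IRT methods, the common-item scores used under the common-item nonequivalent-groups design are ignored, and the MSEE is assumed to be an unweighted mean of the SEEs over score points.

```python
import random
import statistics


def linear_equate(x_scores, y_scores, score_points):
    """Stand-in equating function (simple linear equating).

    Assumption for illustration only: this is NOT the frequency
    estimation or IRT equating studied in the dissertation.
    """
    mu_x, mu_y = statistics.fmean(x_scores), statistics.fmean(y_scores)
    sd_x, sd_y = statistics.pstdev(x_scores), statistics.pstdev(y_scores)
    if sd_x == 0:
        raise ValueError("Form X scores have zero variance")
    slope = sd_y / sd_x
    return [mu_y + slope * (x - mu_x) for x in score_points]


def bootstrap_see(x_scores, y_scores, score_points,
                  equate=linear_equate, n_boot=500, seed=0):
    """Return (SEE per score point, MSEE) from n_boot bootstrap replications."""
    rng = random.Random(seed)
    replications = []
    for _ in range(n_boot):
        # Resample examinees with replacement, separately within each group.
        x_star = rng.choices(x_scores, k=len(x_scores))
        y_star = rng.choices(y_scores, k=len(y_scores))
        replications.append(equate(x_star, y_star, score_points))
    # SEE at a score point: SD of the equated scores across replications.
    sees = [statistics.stdev(column) for column in zip(*replications)]
    # Assumed definition: unweighted mean of the SEEs over score points.
    msee = statistics.fmean(sees)
    return sees, msee


def simulate_scores(n, max_score, seed):
    """Generate hypothetical integer scores; not data from the study."""
    rng = random.Random(seed)
    return [rng.randint(0, max_score) for _ in range(n)]


if __name__ == "__main__":
    points = list(range(0, 21))
    x = simulate_scores(1000, 20, seed=1)
    y = simulate_scores(1000, 20, seed=2)
    for n in (1000, 250):
        _, msee = bootstrap_see(x[:n], y[:n], points, n_boot=200)
        print(f"n={n}: MSEE={msee:.3f}")
```

Passing a different `equate` callable would let the same resampling loop be reused for another equating method; only the function that maps the two resampled score sets to equated scores changes.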
Keywords/Search Tags: Equating, IRT, Using, Polytomously-scored items, Method, Equipercentile, Sample size, Standard errors