A mixed-methods approach to test evaluation using explanatory item response modeling and think-alouds
Posted on: 2009-08-28 | Degree: Ph.D. | Type: Dissertation | University: University of California, Berkeley | Candidate: Tan, Rachael Jin Bee | GTID: 1448390002493727 | Subject: Education

Abstract/Summary:

Usually we think of assessments simply as tools used to evaluate student learning, but assessments should be the subject of evaluation as well. One method of test evaluation is Explanatory Item Response Modeling (EIRM), which uses information about students and/or items to ascertain which of their characteristics (i.e., of both persons and items) influence test performance. This research applies EIRM to an elementary school science assessment by including student and item characteristics and their interaction effects in a latent regression Linear Logistic Test Model (LLTM), thereby investigating differential performance among students at the test level. At a finer grain size, the item level, Differential Item Functioning (DIF) and Differential Facet Functioning [1] (DFF) are also investigated, and "think-alouds" [2] are used to lend insight into the results from these models. It is important for test designers to consider the results of EIRM analysis in conjunction with think-alouds as a step toward ensuring that assessments are not biased for or against any group of students.

Quantitative results revealed that the two item formats, multiple-choice and constructed-response, displayed instances of DFF, while the other item properties, which mainly related to item content, did not. Constructed-response items displayed the most instances of DFF: students of low and medium English language proficiency performed less well on this item type than native English speakers of matched ability, and Black and Latino students performed less well than White students of matched ability.
Black students also performed more poorly on multiple-choice items than White students of matched ability. Think-alouds were conducted with Black and White students on multiple-choice items. Comparing the coded qualitative data with the quantitative results provided only limited insight into why DFF favored White students over Black students; however, the comparisons were helpful in indicating how to improve items and perhaps even instruction. Using quantitative and qualitative methods in conjunction not only helped substantiate instances of differential test performance by students, but also strengthened the evaluation by drawing on methods with different biases, providing stronger evidence for conclusions drawn from the intersection of their results.

[1] DFF is differential functioning among subsets of items that share common characteristics.
[2] Think-alouds are investigations of student thinking while students are answering items.

Keywords/Search Tags: Item, Test, Student, Evaluation, DFF