Title: An Imputation Approach to Handling Incomplete Computerized Tests
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Chen, T, Huang, C-Y, Liu, C
Conference Name: IACAT 2017 Conference
Publisher: Niigata Seiryo University
Conference Location: Niigata, Japan
Keywords: CAT, imputation approach, incomplete computerized test
As technology advances, computerized adaptive testing (CAT) is becoming increasingly popular as it allows tests to be tailored to an examinee’s ability. Nevertheless, examinees might devise testing strategies to use CAT to their advantage. For instance, if only the items that examinees answer count towards their score, then a higher theta score might be obtained by spending more time on items at the beginning of the test and skipping items at the end if time runs out. This type of gaming can be discouraged if examinees’ scores are lowered or “penalized” based on the amount of non-response.
The goal of this study was to devise a penalty function that would meet two criteria: 1) the greater the omit rate, the greater the penalty, and 2) examinees with the same ability and the same omit rate should receive the same penalty. To create the penalty, theta was first estimated from only the items the examinee responded to. Next, the expected number-correct score (EXR) was obtained from this theta estimate and the test characteristic curve. A penalized expected number-correct score was then obtained by multiplying EXR by the proportion of items the examinee responded to. Finally, the penalized theta was identified from the test characteristic curve. Based on the penalized theta and the item parameters of an unanswered item, the likelihood of a correct response is computed and used to estimate the imputed score for that item.
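The penalty procedure above can be sketched in code. This is an illustrative reconstruction, not the authors' implementation: the 3PL response model, the bisection routine for inverting the test characteristic curve, and all parameter values are assumptions for the sketch.

```python
import math

def prob_correct(theta, a, b, c):
    """3PL probability of a correct response (a, b, c are illustrative item parameters)."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at theta."""
    return sum(prob_correct(theta, a, b, c) for (a, b, c) in items)

def invert_tcc(target, items, lo=-4.0, hi=4.0, tol=1e-6):
    """Find the theta whose expected number-correct score equals `target`
    (bisection works because the TCC is monotone increasing in theta)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tcc(mid, items) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def penalized_theta(theta_hat, items, n_answered, n_total):
    """Shrink the expected number-correct score by the completion rate,
    then map the penalized score back to theta through the TCC."""
    exr = tcc(theta_hat, items)            # expected number correct at theta_hat
    exr_p = exr * (n_answered / n_total)   # penalized expected number correct
    return invert_tcc(exr_p, items)        # penalized theta
```

With full completion the penalty vanishes (the penalized theta equals the original estimate), and as the omit rate grows the penalized theta drops, matching criterion 1.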
Two datasets were used to generate tests with completion rates of 50%, 80%, and 90%. The first dataset contained real data in which approximately 4,500 examinees responded to a 21-item test, providing a baseline (truth); sampling was done to achieve the three completion-rate conditions. The second dataset consisted of simulated item scores for 50,000 simulees under a 1-2-4 multi-stage CAT design in which each stage contained seven items. Imputed item scores for unanswered items were computed using a variety of values for G (and therefore T). Three other approaches to handling unanswered items were also considered: all correct (i.e., T = 0), all incorrect (i.e., T = 1), and random scoring (i.e., T = 0.5).
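The abstract does not spell out how T converts model-implied probabilities into imputed item scores. Under one plausible reading, T acts as a cutoff on the probability of a correct response at the penalized theta, which reproduces the three comparison conditions (T = 0 marks every omitted item correct, T = 1 marks every omitted item incorrect). A hedged sketch of that reading, with an assumed 3PL model:

```python
import math

def prob_correct(theta, a, b, c):
    """3PL probability of a correct response (a, b, c are illustrative item parameters)."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def impute_scores(theta_p, unanswered_items, T):
    """Impute 0/1 scores for unanswered items: score 1 when the model-implied
    probability of a correct response at the penalized theta meets threshold T.
    NOTE: this threshold reading of T is an assumption, not the authors' stated rule."""
    return [1 if prob_correct(theta_p, a, b, c) >= T else 0
            for (a, b, c) in unanswered_items]
```

Because a 3PL probability always lies strictly between c and 1, T = 0 imputes every omitted item as correct and T = 1 imputes every omitted item as incorrect, consistent with the two extreme conditions described above.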
The current study investigated the impact on theta estimates resulting from the proposed approach to handling unanswered items in a fixed-length CAT. In real testing situations, when examinees do not finish a test, it is hard to tell whether they tried diligently but ran out of time or whether they attempted to manipulate the scoring engine. To handle unfinished tests with penalties, the proposed approach considers examinees’ abilities and incompletion rates. The results of this study provide direction for psychometric practitioners when considering penalties for omitted responses.