%0 Journal Article
%J Journal of Educational Measurement
%D 2016
%T Monitoring Items in Real Time to Enhance CAT Security
%A Zhang, Jinming
%A Li, Jie
%X An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed CTT-based procedure through simulation studies. The results show that when the total number of examinees is fixed, both procedures can control the rate of type I errors at any reasonable significance level by choosing an appropriate cutoff point and meanwhile maintain a low rate of type II errors. Further, the IRT-based method has a much lower type II error rate or more power than the CTT-based method when the number of compromised items is small (e.g., 5), which can be achieved if the IRT-based procedure can be applied in an active mode in the sense that flagged items can be replaced with new items.
%B Journal of Educational Measurement
%V 53
%P 131–151
%U http://dx.doi.org/10.1111/jedm.12104
%R 10.1111/jedm.12104

%0 Journal Article
%J Applied Psychological Measurement
%D 2014
%T A Sequential Procedure for Detecting Compromised Items in the Item Pool of a CAT System
%A Zhang, Jinming
%X

To maintain the validity of a continuous testing system, such as computerized adaptive testing (CAT), items should be monitored to ensure that the performance of test items has not gone through any significant changes during their lifetime in an item pool. In this article, the author developed a sequential monitoring procedure based on a series of statistical hypothesis tests to examine whether the statistical characteristics of individual items have changed significantly during test administration. Simulation studies show that, under the simulated setting, by choosing an appropriate cutoff point, the procedure can control the rate of Type I errors at any reasonable significance level while maintaining a very low rate of Type II errors.

%B Applied Psychological Measurement
%V 38
%P 87–104
%U http://apm.sagepub.com/content/38/2/87.abstract
%R 10.1177/0146621613510062