TY - JOUR T1 - A Blocked-CAT Procedure for CD-CAT JF - Applied Psychological Measurement Y1 - 2020 A1 - Mehmet Kaplan A1 - Jimmy de la Torre AB - This article introduces a blocked-design procedure for cognitive diagnosis computerized adaptive testing (CD-CAT), which allows examinees to review items and change their answers during test administration. Four blocking versions of the new procedure were proposed. In addition, the impact of several factors, namely, item quality, generating model, block size, and test length, on the classification rates was investigated. Three popular item selection indices in CD-CAT were used and their efficiency compared using the new procedure. An additional study was carried out to examine the potential benefit of item review. The results showed that the new procedure is promising in that allowing item review resulted only in a small loss in attribute classification accuracy under some conditions. Moreover, using a blocked-design CD-CAT is beneficial to the extent that it alleviates the negative impact of test anxiety on examinees’ true performance. VL - 44 UR - https://doi.org/10.1177/0146621619835500 ER - TY - CONF T1 - Bayesian Perspectives on Adaptive Testing T2 - IACAT 2017 Conference Y1 - 2017 A1 - Wim J. van der Linden A1 - Bingnan Jiang A1 - Hao Ren A1 - Seung W. Choi A1 - Qi Diao KW - Bayesian Perspective KW - CAT AB -

Although adaptive testing is usually treated from the perspective of maximum-likelihood parameter estimation and maximum-information item selection, a Bayesian perspective is more natural, statistically efficient, and computationally tractable. This observation holds not only for the core process of ability estimation but also for such processes as item calibration and real-time monitoring of item security. Key elements of the approach are parametric modeling of each relevant process, updating of the parameter estimates after the arrival of each new response, and optimal design of the next step.

The purpose of the symposium is to illustrate the role of Bayesian statistics in this approach. The first presentation discusses a basic Bayesian algorithm for the sequential update of any parameter in adaptive testing and illustrates the idea of Bayesian optimal design for the two processes of ability estimation and online item calibration. The second presentation generalizes the ideas to the case of adaptive testing with polytomous items. The third presentation uses the fundamental Bayesian idea of sampling from updated posterior predictive distributions (“multiple imputations”) to deal with the problem of scoring incomplete adaptive tests.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - JOUR T1 - Bayesian Networks in Educational Assessment: The State of the Field JF - Applied Psychological Measurement Y1 - 2016 A1 - Culbertson, Michael J. AB - Bayesian networks (BN) provide a convenient and intuitive framework for specifying complex joint probability distributions and are thus well suited for modeling content domains of educational assessments at a diagnostic level. BN have been used extensively in the artificial intelligence community as student models for intelligent tutoring systems (ITS) but have received less attention among psychometricians. This critical review outlines the existing research on BN in educational assessment, providing an introduction to the ITS literature for the psychometric community, and points out several promising research paths. The online appendix lists 40 assessment systems that serve as empirical examples of the use of BN for educational assessment in a variety of domains. VL - 40 UR - http://apm.sagepub.com/content/40/1/3.abstract ER - TY - JOUR T1 - Best Design for Multidimensional Computerized Adaptive Testing With the Bifactor Model JF - Educational and Psychological Measurement Y1 - 2015 A1 - Seo, Dong Gi A1 - Weiss, David J. AB - Most computerized adaptive tests (CATs) have been studied using the framework of unidimensional item response theory. However, many psychological variables are multidimensional and might benefit from using a multidimensional approach to CATs. This study investigated the accuracy, fidelity, and efficiency of a fully multidimensional CAT algorithm (MCAT) with a bifactor model using simulated data. Four item selection methods in MCAT were examined for three bifactor pattern designs using two multidimensional item response theory models. To compare MCAT item selection and estimation methods, a fixed test length was used. 
The Ds-optimality item selection improved θ estimates with respect to a general factor, and either D- or A-optimality improved estimates of the group factors in three bifactor pattern designs under two multidimensional item response theory models. The MCAT model without a guessing parameter functioned better than the MCAT model with a guessing parameter. The MAP (maximum a posteriori) estimation method provided more accurate θ estimates than the EAP (expected a posteriori) method under most conditions, and MAP showed lower observed standard errors than EAP under most conditions, except for a general factor condition using Ds-optimality item selection. VL - 75 UR - http://epm.sagepub.com/content/75/6/954.abstract ER - TY - JOUR T1 - Balancing Flexible Constraints and Measurement Precision in Computerized Adaptive Testing JF - Educational and Psychological Measurement Y1 - 2012 A1 - Moyer, Eric L. A1 - Galindo, Jennifer L. A1 - Dodd, Barbara G. AB -

Managing test specifications—both multiple nonstatistical constraints and flexibly defined constraints—has become an important part of designing item selection procedures for computerized adaptive tests (CATs) in achievement testing. This study compared the effectiveness of three procedures: constrained CAT, flexible modified constrained CAT, and the weighted penalty model in balancing multiple flexible constraints and maximizing measurement precision in a fixed-length CAT. The study also addressed the effect of two different test lengths—25 items and 50 items—and of including or excluding the randomesque item exposure control procedure with the three methods. All three methods were effective in selecting items that met flexible test constraints when used in the item selection process for longer tests. When the randomesque method was included to control for item exposure, the weighted penalty model and the flexible modified constrained CAT performed better than did the constrained CAT procedure in maintaining measurement precision. When no item exposure control method was used in the item selection process, no practical difference was found in the measurement precision of the balancing methods.

VL - 72 UR - http://epm.sagepub.com/content/72/4/629.abstract ER - TY - JOUR T1 - Better Data From Better Measurements Using Computerized Adaptive Testing JF - Journal of Methods and Measurement in the Social Sciences Y1 - 2011 A1 - Weiss, D. J. AB - The process of constructing a fixed-length conventional test frequently focuses on maximizing internal consistency reliability by selecting test items that are of average difficulty and high discrimination (a “peaked” test). The effect of constructing such a test, when viewed from the perspective of item response theory, is test scores that are precise for examinees whose trait levels are near the point at which the test is peaked; as examinee trait levels deviate from the mean, the precision of their scores decreases substantially. Results of a small simulation study demonstrate that when peaked tests are “off target” for an examinee, their scores are biased and have spuriously high standard deviations, reflecting substantial amounts of error. These errors can reduce the correlations of these kinds of scores with other variables and adversely affect the results of standard statistical tests. By contrast, scores from adaptive tests are essentially unbiased and have standard deviations that are much closer to true values. Basic concepts of adaptive testing are introduced and fully adaptive computerized tests (CATs) based on IRT are described. Several examples of response records from CATs are discussed to illustrate how CATs function. Some operational issues, including item exposure, content balancing, and enemy items, are also briefly discussed. It is concluded that because CAT constructs a unique test for each examinee, scores from CATs will be more precise and should provide better data for social science research and applications. VL - 2 IS -
1 ER - TY - CONF T1 - Building Affordable CD-CAT Systems for Schools To Address Today's Challenges In Assessment T2 - Annual Conference of the International Association for Computerized Adaptive Testing Y1 - 2011 A1 - Chang, Hua-Hua KW - affordability KW - CAT KW - cost JF - Annual Conference of the International Association for Computerized Adaptive Testing ER - TY - JOUR T1 - Bayesian item selection in constrained adaptive testing JF - Psicologica Y1 - 2010 A1 - Veldkamp, B. P. KW - computerized adaptive testing AB - Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item selection process. The Shadow Test Approach is a general-purpose algorithm for administering constrained CAT. In this paper it is shown how the approach can be slightly modified to handle Bayesian item selection criteria. No differences in performance were found between the shadow test approach and the modified approach. In a simulation study of the LSAT, the effects of Bayesian item selection criteria are illustrated. The results are compared to item selection based on Fisher information. General recommendations about the use of Bayesian item selection criteria are provided. VL - 31 ER - TY - CHAP T1 - A burdened CAT: Incorporating response burden with maximum Fisher's information for item selection Y1 - 2009 A1 - Swartz, R. J. A1 - Choi, S. W. AB - Widely used in various educational and vocational assessment applications, computerized adaptive testing (CAT) has recently begun to be used to measure patient-reported outcomes. Although successful in reducing respondent burden, most current CAT algorithms do not formally consider respondent burden as part of the item selection process.
This study used a loss function approach motivated by decision theory to develop an item selection method that incorporates respondent burden into the item selection process based on maximum Fisher information (MFI) item selection. Several different loss functions placing varying degrees of importance on respondent burden were compared, using an item bank of 62 polytomous items measuring depressive symptoms. One dataset consisted of the real responses from the 730 subjects who responded to all the items. A second dataset consisted of simulated responses to all the items based on a grid of latent trait scores with replicates at each grid point. The algorithm enables a CAT administrator to control respondent burden more efficiently than when using MFI alone, without severely affecting measurement precision. In particular, the loss function incorporating respondent burden protected respondents from receiving longer tests when their estimated trait score fell in a region where there were few informative items. CY - In D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 374 KB} ER - TY - JOUR T1 - Binary items and beyond: a simulation of computer adaptive testing using the Rasch partial credit model JF - Journal of Applied Measurement Y1 - 2008 A1 - Lange, R. KW - *Data Interpretation, Statistical KW - *User-Computer Interface KW - Educational Measurement/*statistics & numerical data KW - Humans KW - Illinois KW - Models, Statistical AB - Past research on Computer Adaptive Testing (CAT) has focused almost exclusively on the use of binary items and minimizing the number of items to be administered. To address this situation, extensive computer simulations were performed using partial credit items with two, three, four, and five response categories.
Other variables manipulated include the number of available items, the number of respondents used to calibrate the items, and various manipulations of respondents' true locations. Three item selection strategies were used, and the theoretically optimal Maximum Information method was compared to random item selection and Bayesian Maximum Falsification approaches. The Rasch partial credit model proved to be quite robust to various imperfections, and systematic distortions occurred mainly in the absence of sufficient numbers of items located near the trait or performance levels of interest. The findings further indicate that having small numbers of items is more problematic in practice than having small numbers of respondents to calibrate these items. Most importantly, increasing the number of response categories consistently improved CAT's efficiency as well as the general quality of the results. In fact, increasing the number of response categories proved to have a greater positive impact than did the choice of item selection method, as the Maximum Information approach performed only slightly better than the Maximum Falsification approach. Accordingly, issues related to the efficiency of item selection methods are far less important than is commonly suggested in the literature. However, being based on computer simulations only, the preceding presumes that actual respondents behave according to the Rasch model. CAT research could thus benefit from empirical studies aimed at determining whether, and if so, how, selection strategies impact performance. VL - 9 SN - 1529-7713 (Print); 1529-7713 (Linking) N1 - J Appl Meas. 2008;9(1):81-104. ER - TY - CHAP T1 - Bundle models for computerized adaptive testing in e-learning assessment Y1 - 2007 A1 - Scalise, K. A1 - Wilson, M. CY - D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing.
N1 - {PDF file, 426 KB} ER - TY - JOUR T1 - A Bayesian student model without hidden nodes and its comparison with item response theory JF - International Journal of Artificial Intelligence in Education Y1 - 2005 A1 - Desmarais, M. C. A1 - Pu, X. KW - Bayesian Student Model KW - computer adaptive testing KW - hidden nodes KW - Item Response Theory AB - The Bayesian framework offers a number of techniques for inferring an individual's knowledge state from evidence of mastery of concepts or skills. A typical application where such a technique can be useful is Computer Adaptive Testing (CAT). A Bayesian modeling scheme, POKS, is proposed and compared to the traditional Item Response Theory (IRT), which has been the prevalent CAT approach for the last three decades. POKS is based on the theory of knowledge spaces and constructs item-to-item graph structures without hidden nodes. It aims to offer an effective knowledge assessment method with an efficient algorithm for learning the graph structure from data. We review the different Bayesian approaches to modeling student ability assessment and discuss how POKS relates to them. The performance of POKS is compared to the IRT two-parameter logistic model. Experimental results over a 34-item Unix test and a 160-item French language test show that both approaches can classify examinees as master or non-master effectively and efficiently, with relatively comparable performance. However, more significant differences are found in favor of POKS for a second task that consists of predicting individual question item outcomes. Implications of these results for adaptive testing and student modeling are discussed, as well as the limitations and advantages of POKS, namely the issue of integrating concepts into its structure.
PB - IOS Press: Netherlands VL - 15 SN - 1560-4292 (Print); 1560-4306 (Electronic) ER - TY - CHAP T1 - Bayesian checks on outlying response times in computerized adaptive testing Y1 - 2003 A1 - van der Linden, W. J. CY - H. Yanai, A. Okada, K. Shigemasu, Y. Kano, and J. J. Meulman (Eds.), New developments in psychometrics (pp. 215-222). New York: Springer-Verlag. ER - TY - JOUR T1 - A Bayesian method for the detection of item preknowledge in computerized adaptive testing JF - Applied Psychological Measurement Y1 - 2003 A1 - McLeod, L. A1 - Lewis, C. A1 - Thissen, D. KW - Adaptive Testing KW - Cheating KW - Computer Assisted Testing KW - Individual Differences computerized adaptive testing KW - Item KW - Item Analysis (Statistical) KW - Mathematical Modeling KW - Response Theory AB - With the increased use of continuous testing in computerized adaptive testing, new concerns about test security have evolved, such as how to ensure that items in an item pool are safeguarded from theft. In this article, procedures to detect test takers using item preknowledge are explored. When test takers use item preknowledge, their item responses deviate from the underlying item response theory (IRT) model, and estimated abilities may be inflated. This deviation may be detected through the use of person-fit indices. A Bayesian posterior log odds ratio index is proposed for detecting the use of item preknowledge. In this approach to person fit, the estimated probability that each test taker has preknowledge of items is updated after each item response. These probabilities are based on the IRT parameters, a model specifying the probability that each item has been memorized, and the test taker's item responses.
Simulations based on an operational computerized adaptive test (CAT) pool are used to demonstrate the use of the odds ratio index. VL - 27 ER - TY - JOUR T1 - A Bayesian random effects model for testlets JF - Psychometrika Y1 - 1999 A1 - Bradlow, E. T. A1 - Wainer, H. A1 - Wang, X. VL - 64 ER - TY - JOUR T1 - Benefits from computerized adaptive testing as seen in simulation studies JF - European Journal of Psychological Assessment Y1 - 1999 A1 - Hornke, L. F. VL - 15 ER - TY - CONF T1 - A Bayesian approach to detection of item preknowledge in a CAT T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1998 A1 - McLeod, L. D. A1 - Lewis, C. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Diego CA ER - TY - JOUR T1 - Bayesian identification of outliers in computerized adaptive testing JF - Journal of the American Statistical Association Y1 - 1998 A1 - Bradlow, E. T. A1 - Weiss, R. E. A1 - Cho, M. AB - We consider the problem of identifying examinees with aberrant response patterns in a computerized adaptive test (CAT). The vector of responses yi of person i from the CAT comprises a multivariate response vector. Multivariate observations may be outlying in many different directions, and we characterize specific directions as corresponding to outliers with different interpretations. We develop a class of outlier statistics to identify different types of outliers based on a control-chart-type methodology. The outlier methodology is adaptable to general longitudinal discrete data structures. We consider several procedures to judge how extreme a particular outlier is. Data from the National Council Licensure Examination (NCLEX) motivates our development and is used to illustrate the results. VL - 93 ER - TY - JOUR T1 - Bayesian item selection criteria for adaptive testing JF - Psychometrika Y1 - 1998 A1 - van der Linden, W. J.
VL - 63 ER - TY - CONF T1 - A Bayesian enhancement of Mantel-Haenszel DIF analysis for computer adaptive tests T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1997 A1 - Zwick, R. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL ER - TY - ABST T1 - Bayesian item selection criteria for adaptive testing (Research Report 96-01) Y1 - 1996 A1 - van der Linden, W. J. CY - Twente, The Netherlands: Department of Educational Measurement and Data Analysis ER - TY - CONF T1 - Building a statistical foundation for computerized adaptive testing T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1996 A1 - Chang, Hua-Hua A1 - Ying, Z. JF - Paper presented at the annual meeting of the Psychometric Society CY - Banff, Alberta, Canada ER - TY - CONF T1 - A Bayesian computerized mastery model with multiple cut scores T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1995 A1 - Smith, R. L. A1 - Lewis, C. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Francisco CA ER - TY - CONF T1 - Bayesian item selection in adaptive testing T2 - Paper presented at the Annual Meeting of the Psychometric Society Y1 - 1995 A1 - van der Linden, W. J. JF - Paper presented at the Annual Meeting of the Psychometric Society CY - Minneapolis MN ER - TY - JOUR T1 - Building algebra testlets: A comparison of hierarchical and linear structures JF - Journal of Educational Measurement Y1 - 1991 A1 - Wainer, H. A1 - Lewis, C. A1 - Kaplan, B. A1 - Braswell, J.
VL - 8 ER - TY - JOUR T1 - Bayesian adaptation during computer-based tests and computer-guided practice exercises JF - Journal of Educational Computing Research Y1 - 1989 A1 - Frick, T. W. VL - 5 IS - 1 ER - TY - JOUR T1 - Bias and Information of Bayesian Adaptive Testing JF - Applied Psychological Measurement Y1 - 1984 A1 - Weiss, D. J. A1 - J. R. McBride VL - 8 IS - 3 ER - TY - ABST T1 - Bias and information of Bayesian adaptive testing (Research Report 83-2) Y1 - 1983 A1 - Weiss, D. J. A1 - J. R. McBride CY - Minneapolis: University of Minnesota, Department of Psychology, Computerized Adaptive Testing Laboratory N1 - {PDF file, 1.066 MB} ER - TY - ABST T1 - Bayesian sequential design and analysis of dichotomous experiments with special reference to mental testing Y1 - 1979 A1 - Owen, R. J. CY - Princeton NJ: Educational Testing Service ER - TY - JOUR T1 - Bayesian Tailored Testing and the Influence of Item Bank Characteristics JF - Applied Psychological Measurement Y1 - 1977 A1 - Jensema, C. J. VL - 1 IS - 1 ER - TY - CHAP T1 - A brief overview of adaptive testing Y1 - 1977 A1 - J. R. McBride CY - D. J. Weiss (Ed.), Applications of computerized testing (Research Report 77-1). Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program N1 - {PDF file, 28 MB} ER - TY - JOUR T1 - A Broad-Range Tailored Test of Verbal Ability JF - Applied Psychological Measurement Y1 - 1977 A1 - Lord, F. M. VL - 1 IS - 1 ER - TY - CHAP T1 - Bandwidth, fidelity, and adaptive tests Y1 - 1976 A1 - J. R.
McBride CY - T. J. McConnell, Jr. (Ed.), CAT/C 2 1975: The second conference on computer-assisted test construction. Atlanta GA: Atlanta Public Schools. N1 - {PDF file, 783 KB} ER - TY - CHAP T1 - Bayesian tailored testing and the influence of item bank characteristics Y1 - 1976 A1 - Jensema, C. J. CY - C. K. Clark (Ed.), Proceedings of the First Conference on Computerized Adaptive Testing (pp. 82-89). Washington DC: U.S. Government Printing Office. N1 - {PDF file, 370 KB} ER - TY - CHAP T1 - A broad range tailored test of verbal ability Y1 - 1976 A1 - Lord, F. M. CY - C. K. Clark (Ed.), Proceedings of the First Conference on Computerized Adaptive Testing (pp. 75-78). Washington DC: U.S. Government Printing Office. N1 - #LO75-01 {PDF file, 250 KB} ER - TY - ABST T1 - A basic test theory generalizable to tailored testing (Technical Report No. 1) Y1 - 1975 A1 - Cliff, N. A. CY - Los Angeles CA: University of Southern California, Department of Psychology. ER - TY - JOUR T1 - A Bayesian sequential procedure for quantal response in the context of adaptive mental testing JF - Journal of the American Statistical Association Y1 - 1975 A1 - Owen, R. J. VL - 70 ER - TY - CONF T1 - Behavior of the maximum likelihood estimate in a simulated tailored testing situation T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1975 A1 - Samejima, F. JF - Paper presented at the annual meeting of the Psychometric Society CY - Iowa City N1 - {PDF file, 698 KB} ER - TY - ABST T1 - Best test design and self-tailored testing (Research Memorandum No. 19) Y1 - 1975 A1 - Wright, B. D. A1 - Douglas, G. A. CY - Chicago: University of Chicago, Department of Education, Statistical Laboratory. ER - TY - ABST T1 - A broad range test of verbal ability (RB-75-5) Y1 - 1975 A1 - Lord, F. M. CY - Princeton NJ: Educational Testing Service ER - TY - CONF T1 - A Bayesian approach in sequential testing T2 - American Educational Research Association Y1 - 1974 A1 - Hsu, T. A1 - Pingel, K.
JF - American Educational Research Association CY - Chicago IL ER - TY - ABST T1 - A Bayesian approach to tailored testing (Research Report 69-92) Y1 - 1969 A1 - Owen, R. J. CY - Princeton NJ: Educational Testing Service ER - TY - ABST T1 - Bayesian methods in psychological testing (Research Bulletin RB-69-31) Y1 - 1969 A1 - Novick, M. R. CY - Princeton NJ: Educational Testing Service ER -