Li, Jie, & van der Linden, Wim J. (2018). A Comparison of Constraint Programming and Mixed-Integer Programming for Automated Test-Form Generation. Vol. 55, pp. 435-456. https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12187

Abstract: The final step of the typical process of developing educational and psychological tests is to place the selected test items in a formatted form. This step involves grouping and ordering the items to meet a variety of formatting constraints. As the activity tends to be time-intensive, the use of mixed-integer programming (MIP) has been proposed to automate it. The goal of this article is to show how constraint programming (CP) can be used as an alternative for automating test-form generation problems with a large variety of formatting constraints, and how it compares with MIP-based form generation as to its models, solutions, and running times. Two empirical examples are presented: (i) automated generation of a computerized fixed form; and (ii) automated generation of shadow tests for multistage testing. Both examples show that CP works well, with feasible solutions and running times likely to be better than those for MIP-based applications.

Choi, Seung W., Moellering, Karin T., Li, Jie, & van der Linden, Wim J. (2016). Optimal Reassembly of Shadow Tests in CAT. Vol. 40, pp. 469-485. http://apm.sagepub.com/content/40/7/469.abstract

Abstract: Even in the age of abundant and fast computing resources, concurrency requirements for large-scale online testing programs still put an uninterrupted delivery of computer-adaptive tests at risk. In this study, to increase concurrency for operational programs that use the shadow-test approach to adaptive testing, we explored various strategies aimed at reducing the number of reassembled shadow tests without compromising measurement quality. Strategies requiring fixed intervals between reassemblies, a certain minimal change in the interim ability estimate since the last assembly before triggering a reassembly, or a hybrid of the two yielded substantial reductions in the number of reassemblies without degradation in measurement accuracy. The strategies effectively prevented unnecessary reassemblies due to adapting to the noise in the early test stages. They also highlighted the practicality of the shadow-test approach by minimizing the computational load involved in its use of mixed-integer programming.

Diao, Qi, & van der Linden, Wim J. (2013). Integrating Test-Form Formatting Into Automated Test Assembly. Vol. 37, pp. 361-374. http://apm.sagepub.com/content/37/5/361.abstract

Abstract: Automated test assembly uses the methodology of mixed-integer programming to select an optimal set of items from an item bank. Automated test-form generation uses the same methodology to optimally order the items and format the test form. From an optimization point of view, production of fully formatted test forms directly from the item pool using a simultaneous optimization model is more attractive than any of the current, more time-consuming two-stage processes. The goal of this study was to provide such simultaneous models for both computer-delivered and paper forms, as well as to explore their performance relative to two-stage optimization. Empirical examples are presented to show that it is possible to automatically produce fully formatted optimal test forms directly from item pools of up to some 2,000 items on a regular PC in realistic times.

van der Linden, Wim J., & Xiong, Xinhui (2013). Speededness and Adaptive Testing. Vol. 38, pp. 418-438. http://jeb.sagepub.com/cgi/content/abstract/38/4/418

Abstract: Two simple constraints on the item parameters in a response-time model are proposed to control the speededness of an adaptive test. As the constraints are additive, they can easily be included in the constraint set for a shadow-test approach (STA) to adaptive testing. Alternatively, a simple heuristic is presented to control speededness in plain adaptive testing without any constraints. Both types of control are easy to implement and do not require any real-time parameter estimation during the test other than the regular update of the test taker's ability estimate. Evaluation of the two approaches using simulated adaptive testing showed that the STA was especially effective. It guaranteed testing times that differed less than 10 seconds from a reference test across a variety of conditions.

van der Linden, Wim J. (2009). Predictive Control of Speededness in Adaptive Testing. Vol. 33, pp. 25-41. http://apm.sagepub.com/content/33/1/25.abstract

Abstract: An adaptive testing method is presented that controls the speededness of a test using predictions of the test takers' response times on the candidate items in the pool. Two different types of predictions are investigated: posterior predictions given the actual response times on the items already administered, and posterior predictions that use the responses on these items as an additional source of information. In a simulation study with an adaptive test modeled after a test from the Armed Services Vocational Aptitude Battery, the effectiveness of the methods in removing differential speededness from the test was evaluated.

Veldkamp, Bernard P., & van der Linden, Wim J. (2008). Implementing Sympson-Hetter Item-Exposure Control in a Shadow-Test Approach to Constrained Adaptive Testing. Vol. 8, pp. 272-289. http://www.tandfonline.com/doi/abs/10.1080/15305050802262233

van der Linden, Wim J., & Veldkamp, Bernard P. (2007). Conditional Item-Exposure Control in Adaptive Testing Using Item-Ineligibility Probabilities. Vol. 32, pp. 398-418. http://jeb.sagepub.com/cgi/content/abstract/32/4/398

Abstract: Two conditional versions of the exposure-control method with item-ineligibility constraints for adaptive testing in van der Linden and Veldkamp (2004) are presented. The first version is for unconstrained item selection, the second for item selection with content constraints imposed by the shadow-test approach. In both versions, the exposure rates of the items are controlled using probabilities of item ineligibility given θ that adapt the exposure rates automatically to a goal value for the items in the pool. In an extensive empirical study with an adaptive version of the Law School Admission Test, the authors show how the method can be used to drive conditional exposure rates below goal values as low as 0.025. Obviously, the price to be paid for minimal exposure rates is a decrease in the accuracy of the ability estimates. This trend is illustrated with empirical data.

van der Linden, Wim J., Breithaupt, Krista, Chuah, Siang Chee, & Zhang, Yanwei (2007). Detecting Differential Speededness in Multistage Testing. Vol. 44, pp. 117-130. ISSN 1745-3984. http://dx.doi.org/10.1111/j.1745-3984.2007.00030.x

Abstract: A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed between subtests and test takers and for detecting differential speededness. An empirical data set for a multistage test in the computerized CPA Exam was used to demonstrate the procedures. Although the more difficult subtests appeared to have items that were more time intensive than the easier subtests, an analysis of the residual response times did not reveal any significant differential speededness, because the time limit appeared to be appropriate. In a separate analysis within each of the subtests, we found minor but consistent patterns of residual times that are believed to be due to a warm-up effect, that is, test takers spending more time on the initial items than they actually need.

van der Linden, Wim J., & Veldkamp, Bernard P. (2004). Constraining Item Exposure in Computerized Adaptive Testing With Shadow Tests. Vol. 29, pp. 273-291. http://jeb.sagepub.com/cgi/content/abstract/29/3/273

Abstract: Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter's (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require time-consuming simulation studies to set values for control parameters before the operational use of the test. Instead, it can set the probabilities of item ineligibility adaptively during the test using the actual item-exposure rates. An empirical study using an item pool from the Law School Admission Test showed that application of the method yielded perfect control of the item-exposure rates and had negligible impact on the bias and mean-squared-error functions of the ability estimator.
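Several of the entries above describe assembling a test by selecting an optimal set of items under content constraints, which the authors formulate as mixed-integer programming (and, in the 2018 article, constraint programming). As a rough, purely illustrative sketch of that kind of optimization — not the authors' actual models — the snippet below brute-forces a maximum-information selection from a tiny hypothetical item pool. The pool, the content areas, and the `assemble_shadow_test` helper are all invented for illustration; a real shadow-test system would solve the equivalent model with a MIP solver rather than by enumeration.

```python
from itertools import combinations

# Hypothetical 8-item pool: (item_id, Fisher information at the current
# ability estimate, content area). All values are made up.
POOL = [
    ("i1", 0.90, "algebra"), ("i2", 0.75, "algebra"),
    ("i3", 0.60, "geometry"), ("i4", 0.85, "geometry"),
    ("i5", 0.40, "stats"),    ("i6", 0.55, "stats"),
    ("i7", 0.95, "algebra"), ("i8", 0.30, "geometry"),
]

def assemble_shadow_test(pool, length, min_per_area):
    """Exhaustively pick `length` items maximizing total information,
    subject to a minimum item count per content area (a toy stand-in
    for the MIP constraint sets used in the papers above)."""
    best, best_info = None, -1.0
    for combo in combinations(pool, length):
        counts = {}
        for _, _, area in combo:
            counts[area] = counts.get(area, 0) + 1
        # Skip selections that violate a content constraint.
        if any(counts.get(a, 0) < m for a, m in min_per_area.items()):
            continue
        info = sum(i for _, i, _ in combo)
        if info > best_info:
            best, best_info = combo, info
    return best, best_info

test, info = assemble_shadow_test(POOL, 4, {"algebra": 1, "geometry": 1, "stats": 1})
print([item_id for item_id, _, _ in test], round(info, 2))
# → ['i1', 'i4', 'i6', 'i7'] 3.25
```

In an adaptive setting this selection would be re-run (or, per the 2016 article, selectively re-run) after each ability update, with the already-administered items fixed in the solution.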