Reckase, M. D. (2024). The Influence of Computerized Adaptive Testing on Psychometric Theory and Practice. Vol. 11. ISSN 2165-6592.
Keywords: computerized adaptive testing; item response theory; paradigm shift; scaling theory; test design.
Abstract: The major premise of this article is that part of the stimulus for the evolution of psychometric theory since the 1950s was the introduction of the concept of computerized adaptive testing (CAT) or its earlier non-CAT variations. The conceptual underpinning of CAT that had the most influence on psychometric theory was the shift of emphasis from the test (or test score) as the focus of analysis to the test item (or item score). The change in focus allowed a change in the way that test results are conceived of as measurements. It also resolved the conflict among a number of ideas that were present in the early work on psychometric theory. Some of the conflicting ideas are summarized below to show how work on the development of CAT resolved some of those conflicts.
URL: https://jcatpub.net/index.php/jcat/issue/view/34/9

Tai, M. H., Cooperman, A. W., DeWeese, J. N., & Weiss, D. J. (2023). How Do Trait Change Patterns Affect the Performance of Adaptive Measurement of Change? Vol. 10, pp. 32-58.
Keywords: adaptive measurement of change; computerized adaptive testing; longitudinal measurement; trait change patterns.
URL: http://iacat.org/how-do-trait-change-patterns-affect-performance-adaptive-measurement-change

DeMars, C. E. (2022). The (non)Impact of Misfitting Items in Computerized Adaptive Testing. Vol. 9.
Keywords: computerized adaptive testing; item fit; three-parameter logistic model.
URL: https://jcatpub.net/index.php/jcat/issue/view/26

Finkelman, M., & Wang, C. (2019). Time-Efficient Adaptive Measurement of Change. Vol. 7, pp. 15-34. ISSN 2165-6592.
Keywords: adaptive measurement of change; computerized adaptive testing; Fisher information; item selection; response-time modeling.
Abstract: The adaptive measurement of change (AMC) refers to the use of computerized adaptive testing (CAT) at multiple occasions to efficiently assess a respondent's improvement, decline, or sameness from occasion to occasion. Whereas previous AMC research focused on administering the most informative item to a respondent at each stage of testing, the current research proposes the use of Fisher information per time unit as an item selection procedure for AMC. The latter procedure incorporates not only the amount of information provided by a given item but also the expected amount of time required to complete it. In a simulation study, the use of Fisher information per time unit item selection resulted in a lower false positive rate in the majority of conditions studied, and a higher true positive rate in all conditions studied, compared to item selection via Fisher information without accounting for the expected time taken. Future directions of research are suggested.
URL: http://iacat.org/jcat/index.php/jcat/article/view/73/35

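The Finkelman and Wang entry above selects the item that maximizes Fisher information per expected time unit rather than information alone. A minimal sketch of that ratio criterion, assuming a 2PL response model and a lognormal response-time model with item time intensity beta, time spread sigma, and person speed tau; all names and values here are illustrative, not taken from the article:

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a**2 * p * (1.0 - p)

def expected_time(beta, tau, sigma):
    """Expected response time under a lognormal model:
    log T ~ Normal(beta - tau, sigma^2), so E[T] = exp(beta - tau + sigma^2 / 2)."""
    return np.exp(beta - tau + sigma**2 / 2.0)

def select_item(theta_hat, tau_hat, items, administered):
    """Pick the unused item maximizing information per expected second.
    `items` is a list of (a, b, beta, sigma) tuples; all hypothetical."""
    best, best_ratio = None, -np.inf
    for i, (a, b, beta, sigma) in enumerate(items):
        if i in administered:
            continue
        ratio = fisher_info(theta_hat, a, b) / expected_time(beta, tau_hat, sigma)
        if ratio > best_ratio:
            best, best_ratio = i, ratio
    return best
```

Dividing information by expected time favors items that buy the most precision per second of testing time, rather than per item administered.
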
Magis, D., & Raîche, G. (2011). catR: An R Package for Computerized Adaptive Testing.
Keywords: computer program; computerized adaptive testing; estimation; item response theory.
Abstract: Computerized adaptive testing (CAT) is an active current research field in psychometrics and educational measurement. However, there is very little software available to handle such adaptive tasks. The R package catR was developed to perform adaptive testing with as much flexibility as possible, in an attempt to provide a developmental and testing platform to the interested user. Several item-selection rules and ability estimators are implemented. The item bank can be provided by the user or randomly generated from parent distributions of item parameters. Three stopping rules are available. The output can be graphically displayed.
URL: http://iacat.org/content/catr-r-package-computerized-adaptive-testing

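catR itself is an R package; as a language-neutral illustration of the pieces it assembles (an item bank, an item-selection rule, an ability estimator, and a stopping rule), here is a minimal CAT simulation loop in Python. It assumes a 2PL bank, maximum-information selection, EAP estimation on a fixed quadrature grid, and a standard-error stopping rule; everything is illustrative rather than catR's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical 2PL item bank: discriminations a, difficulties b
n_items = 300
a = rng.uniform(0.8, 2.0, n_items)
b = rng.normal(0.0, 1.0, n_items)

def prob(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

quad = np.linspace(-4.0, 4.0, 81)       # quadrature grid for EAP
prior = np.exp(-0.5 * quad**2)          # standard normal prior (unnormalized)

def eap(responses, items):
    """EAP estimate and posterior SD given the responses so far."""
    post = prior.copy()
    for u, i in zip(responses, items):
        p = prob(quad, a[i], b[i])
        post *= p if u == 1 else 1.0 - p
    post /= post.sum()
    mean = float((quad * post).sum())
    sd = float(np.sqrt(((quad - mean) ** 2 * post).sum()))
    return mean, sd

theta_true = 0.7                        # simulated examinee
used, responses = [], []
theta_hat, se = 0.0, np.inf
while se > 0.30 and len(used) < 40:     # SE stopping rule with a length cap
    info = a**2 * prob(theta_hat, a, b) * (1.0 - prob(theta_hat, a, b))
    info[used] = -np.inf                # never readminister an item
    item = int(np.argmax(info))         # maximum-information selection
    u = int(rng.random() < prob(theta_true, a[item], b[item]))
    used.append(item)
    responses.append(u)
    theta_hat, se = eap(responses, used)

print(f"items used: {len(used)}, theta_hat = {theta_hat:.2f} (SE = {se:.2f})")
```

Each of the highlighted choices (selection rule, estimator, stopping rule) is a swappable component, which is exactly the design space the package abstract describes.
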
Veldkamp, B. P. (2010). Bayesian item selection in constrained adaptive testing. Vol. 31, pp. 149-169.
Keywords: computerized adaptive testing.
Abstract: Application of Bayesian item selection criteria in computerized adaptive testing might result in improvement of bias and MSE of the ability estimates. The question remains how to apply Bayesian item selection criteria in the context of constrained adaptive testing, where large numbers of specifications have to be taken into account in the item selection process. The Shadow Test Approach is a general purpose algorithm for administering constrained CAT. In this paper it is shown how the approach can be slightly modified to handle Bayesian item selection criteria. No differences in performance were found between the shadow test approach and the modified approach. In a simulation study of the LSAT, the effects of Bayesian item selection criteria are illustrated. The results are compared to item selection based on Fisher information. General recommendations about the use of Bayesian item selection criteria are provided.
URL: http://iacat.org/content/bayesian-item-selection-constrained-adaptive-testing

Egberink, I. J. L., Meijer, R. R., Veldkamp, B. P., Schakel, L., & Smid, N. G. (2010). Detection of aberrant item score patterns in computerized adaptive testing: An empirical example using the CUSUM. Vol. 48, pp. 921-925. ISSN 0191-8869.
Keywords: CAT; computerized adaptive testing; CUSUM approach; person fit.
Abstract: The scalability of individual trait scores on a computerized adaptive test (CAT) was assessed through investigating the consistency of individual item score patterns. A sample of N = 428 persons completed a personality CAT as part of a career development procedure. To detect inconsistent item score patterns, we used a cumulative sum (CUSUM) procedure. Combined information from the CUSUM, other personality measures, and interviews showed that similar estimated trait values may have a different interpretation. Implications for computer-based assessment are discussed.
URL: http://iacat.org/content/detection-aberrant-item-score-patterns-computerized-adaptive-testing-empirical-example-using

Finkelman, M. D., Weiss, D. J., & Kim-Kang, G. (2010). Item Selection and Hypothesis Testing for the Adaptive Measurement of Change. Vol. 34, pp. 238-254.
Keywords: change; computerized adaptive testing; individual change; Kullback-Leibler information; likelihood ratio; measuring change.
Abstract: Assessing individual change is an important topic in both psychological and educational measurement. An adaptive measurement of change (AMC) method had previously been shown to exhibit greater efficiency in detecting change than conventional nonadaptive methods. However, little work had been done to compare different procedures within the AMC framework. This study introduced a new item selection criterion and two new test statistics for detecting change with AMC that were specifically designed for the paradigm of hypothesis testing. In two simulation sets, the new methods for detecting significant change improved on existing procedures by demonstrating better adherence to Type I error rates and substantially better power for detecting relatively small change.
URL: http://iacat.org/content/item-selection-and-hypothesis-testing-adaptive-measurement-change-0

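The CUSUM procedure used by Egberink et al. accumulates item-score residuals as the CAT proceeds and signals when the cumulative sum drifts past a threshold. A minimal sketch of a two-sided CUSUM on the residuals u_t - P_t(theta_hat); the reference value k and threshold h below are illustrative tuning constants, not the values calibrated in the study:

```python
def cusum_flags(responses, probs, k=0.1, h=1.0):
    """Two-sided CUSUM on item-score residuals u_t - P_t(theta_hat).
    k is a reference (slack) value and h the decision threshold; both are
    illustrative tuning constants, not the values calibrated in the study."""
    c_plus, c_minus = 0.0, 0.0
    for u, p in zip(responses, probs):
        r = u - p                       # residual at step t
        c_plus = max(0.0, c_plus + r - k)
        c_minus = min(0.0, c_minus + r + k)
        if c_plus > h or c_minus < -h:
            return True                 # inconsistent (aberrant) pattern signalled
    return False
```

The upper sum reacts to runs of unexpectedly correct answers and the lower sum to runs of unexpectedly incorrect ones, which is what makes the statistic sensitive to locally inconsistent item score patterns rather than to overall ability level.
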
Passos, V. L., Berger, M. P. F., & Tan, F. E. S. (2008). The D-optimality item selection criterion in the early stage of CAT: A study with the graded response model. Vol. 33, pp. 88-110.
Keywords: computerized adaptive testing; D-optimality; item selection.
Abstract: During the early stage of computerized adaptive testing (CAT), item selection criteria based on Fisher's information often produce less stable latent trait estimates than the Kullback-Leibler global information criterion. Robustness against early stage instability has been reported for the D-optimality criterion in a polytomous CAT with the Nominal Response Model and is shown herein to be reproducible for the Graded Response Model. For comparative purposes, the A-optimality and the global information criteria are also applied. Their item selection is investigated as a function of test progression and item bank composition. The results indicate how the selection of specific item parameters underlies the criteria performances evaluated via accuracy and precision of estimation. In addition, the criteria item exposure rates are compared, without the use of any exposure controlling measure. On account of stability, precision, accuracy, numerical simplicity, and, less evidently, item exposure rate, the D-optimality criterion can be recommended for CAT.
URL: http://iacat.org/content/d-optimality-item-selection-criterion-early-stage-cat-study-graded-response-model

Kingsbury, G. G., & Houser, R. L. (2008). ICAT: An adaptive testing procedure for the identification of idiosyncratic knowledge patterns. Vol. 216, pp. 40-48.
Keywords: computerized adaptive testing.
Abstract: Traditional adaptive tests provide an efficient method for estimating student achievement levels, by adjusting the characteristics of the test questions to match the performance of each student. These traditional adaptive tests are not designed to identify idiosyncratic knowledge patterns. As students move through their education, they learn content in any number of different ways related to their learning style and cognitive development. This may result in a student having different achievement levels from one content area to another within a domain of content. This study investigates whether such idiosyncratic knowledge patterns exist. It discusses the differences between idiosyncratic knowledge patterns and multidimensionality. Finally, it proposes an adaptive testing procedure that can be used to identify a student's areas of strength and weakness more efficiently than current adaptive testing approaches. The findings of the study indicate that a fairly large number of students may have test results that are influenced by their idiosyncratic knowledge patterns. The findings suggest that these patterns persist across time for a large number of students, and that the differences in student performance between content areas within a subject domain are large enough to allow them to be useful in instruction. Given the existence of idiosyncratic patterns of knowledge, the proposed testing procedure may enable us to provide more useful information to teachers. It should also allow us to differentiate between idiosyncratic patterns of knowledge and important multidimensionality in the testing data.
URL: http://iacat.org/content/icat-adaptive-testing-procedure-identification-idiosyncratic-knowledge-patterns

van der Linden, W. J. (2008). Some new developments in adaptive testing technology. Vol. 216, pp. 3-11.
Keywords: computerized adaptive testing.
Abstract: In an ironic twist of history, modern psychological testing has returned to an adaptive format quite common when testing was not yet standardized. Important stimuli to the renewed interest in adaptive testing have been the development of item-response theory in psychometrics, which models the responses on test items using separate parameters for the items and test takers, and the use of computers in test administration, which enables us to estimate the parameter for a test taker and select the items in real time. This article reviews a selection from the latest developments in the technology of adaptive testing, such as constrained adaptive item selection, adaptive testing using rule-based item generation, multidimensional adaptive testing, adaptive use of test batteries, and the use of response times in adaptive testing.
URL: http://iacat.org/content/some-new-developments-adaptive-testing-technology

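Several entries above select items for polytomous models. Under Samejima's graded response model, item information is I(theta) = sum_k P_k'(theta)^2 / P_k(theta), where the category probabilities are differences of adjacent cumulative (boundary) curves; maximum-information selection then simply maximizes this quantity over the unused items. A sketch, with the bank format (a list of (a, boundary-list) pairs) assumed purely for illustration:

```python
import numpy as np

def grm_curves(theta, a, bs):
    """Boundary curves P(X >= k) of the graded response model,
    padded with P(X >= 0) = 1 and P(X >= m) = 0."""
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(bs))))
    return np.concatenate(([1.0], cum, [0.0]))

def grm_item_info(theta, a, bs):
    """Samejima GRM item information: sum_k P_k'(theta)^2 / P_k(theta)."""
    cum = grm_curves(theta, a, bs)
    p = cum[:-1] - cum[1:]              # category probabilities
    dcum = a * cum * (1.0 - cum)        # derivatives of the boundary curves
    dp = dcum[:-1] - dcum[1:]           # category derivatives
    return float(np.sum(dp**2 / np.maximum(p, 1e-12)))

def pick_max_info(theta_hat, bank, used):
    """Maximum-information selection; `bank` is a list of (a, boundaries)."""
    infos = [grm_item_info(theta_hat, a, bs) if i not in used else -np.inf
             for i, (a, bs) in enumerate(bank)]
    return int(np.argmax(infos))
```

The identity I(theta) = sum_k P_k'^2 / P_k follows because the category probabilities sum to one, so their second derivatives cancel in the expected negative Hessian of the log-likelihood.
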
Thompson, N. A., & Ro, S. (2007). Computerized classification testing with composite hypotheses. St. Paul, MN: Graduate Management Admission Council.
Keywords: computerized adaptive testing.
URL: http://iacat.org/content/computerized-classification-testing-composite-hypotheses

Barrada, J. R., Olea, J., & Ponsoda, V. (2007). Methods for restricting maximum exposure rate in computerized adaptative testing. Vol. 3, pp. 14-23. Hogrefe & Huber. ISSN 1614-1881 (print), 1614-2241 (electronic).
Keywords: computerized adaptive testing; item bank security; item exposure control; overlap rate; Sympson-Hetter method.
Abstract: The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in computerized adaptive testing. Through a series of simulations, control parameters are set that mark the probability of administration of an item on being selected. This method presents two main problems: it requires a long computation time for calculating the parameters, and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both of the problems. The impact of these methods on measurement accuracy has not been tested yet. We show how these methods over-restrict the exposure of some highly discriminating items and, thus, the accuracy is decreased. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods offer an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with the two van der Linden methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing the accuracy.
URL: http://iacat.org/content/methods-restricting-maximum-exposure-rate-computerized-adaptative-testing

Thompson, N. A. (2007). A practitioner's guide to variable-length computerized classification testing. Vol. 12.
Keywords: CAT; classification; computer adaptive testing; computerized adaptive testing; computerized classification testing.
Abstract: Variable-length computerized classification tests, CCTs (Lin & Spray, 2000; Thompson, 2006), are a powerful and efficient approach to testing for the purpose of classifying examinees into groups. CCTs are designed by the specification of at least five technical components: psychometric model, calibrated item bank, starting point, item selection algorithm, and termination criterion. Several options exist for each of these CCT components, creating a myriad of possible designs. Confusion among designs is exacerbated by the lack of a standardized nomenclature. This article outlines the components of a CCT, common options for each component, and the interaction of options for different components, so that practitioners may more efficiently design CCTs. It also offers a suggestion of nomenclature.
URL: http://iacat.org/content/practitioners-guide-variable-length-computerized-classification-testing

Passos, V. L., Berger, M. P. F., & Tan, F. E. (2007). Test design optimization in CAT early stage with the nominal response model. Vol. 31, pp. 213-232. ISSN 0146-6216.
Keywords: computerized adaptive testing; nominal response model; robust performance; test design optimization.
Abstract: The early stage of computerized adaptive testing (CAT) refers to the phase of trait estimation during the administration of only a few items. This phase can be characterized by bias and instability of estimation. In this study, an item selection criterion is introduced in an attempt to lessen this instability: the D-optimality criterion. A polytomous unconstrained CAT simulation is carried out to evaluate this criterion's performance under different test premises. The simulation shows that the extent of early stage instability depends primarily on the quality of the item pool information and its size, and secondarily on the item selection criteria. The efficiency of the D-optimality criterion is similar to the efficiency of other known item selection criteria. Yet it often yields estimates that, at the beginning of CAT, display a more robust performance against instability.
URL: http://iacat.org/content/test-design-optimization-cat-early-stage-nominal-response-model

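The Sympson-Hetter method summarized in the Barrada, Olea, and Ponsoda entry separates selection from administration: the most informative item is administered only with probability K_i, and the K_i are calibrated by repeated simulation so that no item's exposure exceeds a target r_max. A minimal sketch of the administration step and one calibration sweep; the update shown is the standard textbook form, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

def administer_with_sh(ranked_items, K):
    """Walk down the information-ranked candidate list; the selected item i
    is actually administered only with probability K[i], else it is skipped."""
    for i in ranked_items:
        if rng.random() <= K[i]:
            return i
    return ranked_items[-1]             # fall back to the last candidate

def sh_calibration_sweep(K, selection_rate, r_max):
    """One calibration iteration: items selected more often than the exposure
    target r_max get K lowered toward r_max / P(selected); others revert to 1."""
    return np.where(selection_rate > 0.0,
                    np.minimum(1.0, r_max / np.maximum(selection_rate, 1e-12)),
                    1.0)
```

Repeating the sweep over many simulated examinees is exactly the long pre-operational computation the entry above criticizes, which motivates the alternatives it proposes.
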
Häusler, J. (2006). Adaptive success control in computerized adaptive testing. Vol. 48, pp. 436-450. Pabst Science Publishers. ISSN 0033-3018.
Keywords: adaptive success control; computerized adaptive testing; psychometrics.
Abstract: In computerized adaptive testing (CAT) procedures within the framework of probabilistic test theory, the difficulty of an item is adjusted to the ability of the respondent, with the aim of maximizing the amount of information generated per item, thereby also increasing test economy and test reasonableness. However, earlier research indicates that respondents might feel over-challenged by a constant success probability of p = 0.5 and therefore cannot come to a sufficiently high answer certainty within a reasonable timeframe. Consequently, response time per item increases, which, depending on the test material, can outweigh the benefit of administering optimally informative items. Instead of a benefit, the result of using CAT procedures could be a loss of test economy. Based on this problem, an adaptive success control algorithm was designed and tested, adapting the success probability to the working style of the respondent. Persons who need higher answer certainty in order to come to a decision are detected and receive a higher success probability, in order to minimize the test duration (not the number of items, as in classical CAT). The method is validated on the re-analysis of data from the Adaptive Matrices Test (AMT; Hornke, Etzel, & Rettig, 1999) and by the comparison between an AMT version using classical CAT and an experimental version using adaptive success control. The results are discussed in the light of psychometric and psychological aspects of test quality.
URL: http://iacat.org/content/adaptive-success-control-computerized-adaptive-testing

van der Linden, W. J., Ariel, A., & Veldkamp, B. P. (2006). Assembling a computerized adaptive testing item pool as a set of linear tests. Vol. 31, pp. 81-99. ISSN 1076-9986.
Keywords: algorithms; computerized adaptive testing; item pool; linear tests; mathematical models; statistics; test construction; test items.
Abstract: Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content constraints, and/or have unfavorable exposure rates. Although at first sight somewhat counterintuitive, it is shown that if the CAT pool is assembled as a set of linear test forms, undesirable correlations can be broken down effectively. It is proposed to assemble such pools using a mixed integer programming model with constraints that guarantee that each test meets all content specifications and an objective function that requires them to have maximal information at a well-chosen set of ability values. An empirical example with a previous master pool from the Law School Admission Test (LSAT) yielded a CAT with nearly uniform bias and mean-squared error functions for the ability estimator and item-exposure rates that satisfied the target for all items in the pool.
URL: http://iacat.org/content/assembling-computerized-adaptive-testing-item-pool-set-linear-tests

Lei, P.-W., Chen, S.-Y., & Yu, L. (2006). Comparing methods of assessing differential item functioning in a computerized adaptive testing environment. Vol. 43, pp. 245-264. ISSN 0022-0655.
Keywords: computerized adaptive testing; educational testing; item response theory likelihood ratio test; logistic regression; trait estimation; unidirectional and non-unidirectional differential item functioning.
Abstract: Mantel-Haenszel and SIBTEST, which have known difficulty in detecting non-unidirectional differential item functioning (DIF), have been adapted with some success for computerized adaptive testing (CAT). This study adapts logistic regression (LR) and the item-response-theory-likelihood-ratio test (IRT-LRT), capable of detecting both unidirectional and non-unidirectional DIF, to the CAT environment, in which pretest items are assumed to be seeded in CATs but not used for trait estimation. The proposed adaptation methods were evaluated with simulated data under different sample size ratios and impact conditions in terms of Type I error, power, and specificity in identifying the form of DIF. The adapted LR and IRT-LRT procedures are more powerful than the CAT version of SIBTEST for non-unidirectional DIF detection. The good Type I error control provided by IRT-LRT under extremely unequal sample sizes and large impact is encouraging. Implications of these and other findings are discussed.
URL: http://iacat.org/content/comparing-methods-assessing-differential-item-functioning-computerized-adaptive-testing

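The logistic regression DIF procedure adapted by Lei, Chen, and Yu compares nested logistic models for a single item's responses: a baseline model containing only the trait estimate against a model adding group membership and a trait-by-group interaction, so that a 2-df likelihood-ratio test picks up both uniform and non-unidirectional DIF. A self-contained sketch with a hand-rolled Newton-Raphson logistic fit (which assumes reasonably well-conditioned data; complete separation would make it diverge):

```python
import numpy as np
from scipy import stats

def fit_logistic(X, y, iters=25):
    """Logistic regression by Newton-Raphson; assumes no complete separation."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    loglik = float(np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12)))
    return beta, loglik

def lr_dif_test(theta, group, u):
    """2-df likelihood-ratio test for uniform plus non-uniform DIF on one item."""
    n = len(u)
    X0 = np.column_stack([np.ones(n), theta])                          # no DIF
    X1 = np.column_stack([np.ones(n), theta, group, theta * group])    # DIF terms
    _, ll0 = fit_logistic(X0, u)
    _, ll1 = fit_logistic(X1, u)
    g2 = 2.0 * (ll1 - ll0)
    return g2, float(stats.chi2.sf(g2, df=2))
```

In the CAT adaptation described above, theta would be the trait estimate from the operational items while u comes from a seeded pretest item, so the studied item does not contaminate its own matching variable.
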
Hai-qi, D., De-zhi, C., Shuliang, D., & Taiping, D. (2006). The comparison among item selection strategies of CAT with multiple-choice items. Vol. 38, pp. 778-783. Science Press. ISSN 0439-755X.
Keywords: CAT; computerized adaptive testing; graded response model; item selection strategies; multiple-choice items.
Abstract: The initial purpose of comparing item selection strategies for CAT was to increase the efficiency of tests. As studies continued, however, it was found that increasing the efficiency of item bank usage was also an important goal of comparing item selection strategies. These two goals often conflicted. The key solution was to find a strategy with which both goals could be accomplished. The item selection strategies for the graded response model in this study included: the average of the difficulty orders matching with the ability; the median of the difficulty orders matching with the ability; maximum information; a-stratified (average); and a-stratified (median). The evaluation indices used for comparison included: the bias of the ability estimates relative to the true values; the standard error of the ability estimates; the average number of items administered; the standard deviation of the frequency of items selected; and the sum of the indices weighted. Using the Monte Carlo method, data were simulated and iterated 20 times under item difficulty parameters following either a normal or a uniform distribution. The results indicated that, whether the difficulty parameters followed a normal or a uniform distribution, each of the item selection strategies designed in this research had its strong and weak points. In the general evaluation, under the condition that items were stratified appropriately, a-stratified (median) (ASM) had the best effect.
URL: http://iacat.org/content/comparison-among-item-selection-strategies-cat-multiple-choice-items

van der Linden, W. J. (2006). Equating scores from adaptive to linear tests. Vol. 30, pp. 493-508. ISSN 0146-6216.
Keywords: computerized adaptive testing; equipercentile equating; local equating; score reporting; test characteristic function.
Abstract: Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test for a population of test takers. The two local methods were generally best. Surprisingly, the TCF method performed slightly worse than the equipercentile method. Both methods showed strong bias and uniformly large inaccuracy, but the TCF method suffered from extra error due to the lower asymptote of the test characteristic function. It is argued that the worse performances of the two methods are a consequence of the fact that they use a single equating transformation for an entire population of test takers and therefore have to compromise between the individual score distributions.
URL: http://iacat.org/content/equating-scores-adaptive-linear-tests

Bode, R. K., Lai, J.-S., Dineen, K., Heinemann, A. W., Shevrin, D., Von Roenn, J., & Cella, D. (2006). Expansion of a physical function item bank and development of an abbreviated form for clinical research. Vol. 7, pp. 1-15. ISSN 1529-7713.
Keywords: clinical research; computerized adaptive testing; performance levels; physical function item bank; psychometrics; test reliability; test validity.
Abstract: We expanded an existing 33-item physical function (PF) item bank with a sufficient number of items to enable computerized adaptive testing (CAT). Ten items were written to expand the bank, and the new item pool was administered to 295 people with cancer. For this analysis of the new pool, seven poorly performing items were identified for further examination. This resulted in a bank with items that define an essentially unidimensional PF construct, cover a wide range of that construct, reliably measure the PF of persons with cancer, and distinguish differences in self-reported functional performance levels. We also developed a 5-item (static) assessment form ("BriefPF") that can be used in clinical research to express scores on the same metric as the overall bank. The BriefPF was compared to the PF-10 from the Medical Outcomes Study SF-36. Both short forms significantly differentiated persons across functional performance levels. While the entire bank was more precise across the PF continuum than either short form, there were differences in the area of the continuum in which each short form was more precise: the BriefPF was more precise than the PF-10 at the lower functional levels and the PF-10 was more precise than the BriefPF at the higher levels. Future research on this bank will include the development of a CAT version, the PF-CAT.
URL: http://iacat.org/content/expansion-physical-function-item-bank-and-development-abbreviated-form-clinical-research

Ping, C., Shuliang, D., Haijing, L., & Jie, Z. (2006). [Item selection strategies of computerized adaptive testing based on graded response model]. Vol. 38, pp. 461-467. Science Press. ISSN 0439-755X.
Keywords: computerized adaptive testing; item selection strategy.
Abstract: Item selection strategy (ISS) is an important component of computerized adaptive testing (CAT). Its performance directly affects the security, efficiency, and precision of the test. Thus, the ISS becomes one of the central issues in CATs based on the graded response model (GRM). It is well known that the goal of an ISS is to administer the next unused item remaining in the item bank that best fits the examinee's current ability estimate. In dichotomous IRT models, every item has only one difficulty parameter, and the item whose difficulty matches the examinee's current ability estimate is considered to be the best-fitting item. However, in the GRM, each item has more than two ordered categories and no single value to represent the item difficulty. Consequently, some researchers have employed the average or the median difficulty value across categories as the difficulty estimate for the item, which in effect introduced two corresponding ISSs. In this study, we used computer simulation to compare four ISSs based on the GRM. We also discussed the effect of a "shadow pool" on the uniformity of pool usage, as well as the influence of different item parameter distributions and different ability estimation methods on the evaluation criteria of CAT. In the simulation, the Monte Carlo method was adopted to simulate the entire CAT process; 1,000 examinees drawn from a standard normal distribution and four 1,000-item pools with different item parameter distributions were simulated. The assumption of the simulation is that a polytomous item is comprised of six ordered categories. In addition, ability estimates were derived using two methods: expected a posteriori Bayesian (EAP) and maximum likelihood estimation (MLE). In MLE, the Newton-Raphson iteration method and the Fisher scoring iteration method were employed, respectively, to solve the likelihood equation. Moreover, the CAT process was simulated 30 times for each examinee to eliminate random error. The ISSs were evaluated by four indices usually used in CAT, covering the accuracy of ability estimation, the stability of the ISS, the usage of the item pool, and the test efficiency. Simulation results showed adequate evaluation of the ISSs that matched the estimate of an examinee's current trait level with the difficulty values across categories. Setting a "shadow pool" in the ISS was able to improve the uniformity of pool utilization. Finally, different distributions of the item parameters and different ability estimation methods affected the evaluation indices of CAT.
URL: http://iacat.org/content/item-selection-strategies-computerized-adaptive-testing-based-graded-response-model

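The entry above solves the likelihood equation by Newton-Raphson or Fisher scoring. For the 2PL the observed and expected information coincide, so the two iterations take identical steps; they differ for models such as the 3PL or the GRM. A sketch of Fisher-scoring MLE for a dichotomous response pattern, offered as a simplified dichotomous analogue of the polytomous estimation described above:

```python
import numpy as np

def mle_theta_2pl(us, a, b, theta0=0.0, tol=1e-6, max_iter=50):
    """MLE of theta for 2PL responses via Fisher scoring. For the 2PL the
    observed and expected information coincide, so Newton-Raphson takes the
    same steps; the two iterations differ for, e.g., the 3PL or the GRM.
    Requires a mixed response pattern (an all-correct or all-incorrect
    pattern has no finite MLE)."""
    us, a, b = (np.asarray(x, dtype=float) for x in (us, a, b))
    theta = theta0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        score = np.sum(a * (us - p))         # first derivative of the log-likelihood
        info = np.sum(a**2 * p * (1.0 - p))  # expected = observed information here
        step = score / info
        theta += step
        if abs(step) < tol:
            break
    return theta, 1.0 / np.sqrt(info)        # estimate and asymptotic SE
```

EAP, the other estimator named in the entry, replaces this iteration with posterior averaging over a quadrature grid, as in the CAT loop sketched earlier in this bibliography.
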
Hart, D. L., Mioduski, J. E., Werneke, M. W., & Stratford, P. W. (2006). Simulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function. Vol. 59, pp. 947-956.
Keywords: Back Pain Functional Scale; computerized adaptive testing; item response theory; lumbar spine; rehabilitation; true-score equating.
Abstract: Objective: To equate physical functioning (PF) items with Back Pain Functional Scale (BPFS) items, develop a computerized adaptive test (CAT) designed to assess lumbar spine functional status (LFS) in people with lumbar spine impairments, and compare the discriminant validity of LFS measures (θIRT) generated using all items analyzed with a rating scale item response theory model (RSM) and measures generated using the simulated CAT (θCAT). Methods: We performed a secondary analysis of retrospective intake rehabilitation data. Results: Unidimensionality and local independence of 25 BPFS and PF items were supported. Differential item functioning was negligible for levels of symptom acuity, gender, age, and surgical history. The RSM fit the data well. A lumbar-spine-specific CAT was developed that was 72% more efficient than using all 25 items to estimate LFS measures. θIRT and θCAT measures did not discriminate patients by symptom acuity, age, or gender, but discriminated patients by surgical history in similar clinically logical ways. θCAT measures were as precise as θIRT measures. Conclusion: A body-part-specific simulated CAT developed from an LFS item bank was efficient and produced precise measures of LFS without eroding discriminant validity.
URL: http://iacat.org/content/simulated-computerized-adaptive-test-patients-lumbar-spine-impairments-was-efficient-and-0

Hart, D. L., Cook, K. F., Mioduski, J. E., Teal, C. R., & Crane, P. K. (2006). Simulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function. Vol. 59, pp. 290-298.
Keywords: computerized adaptive testing; Flexilevel Scale of Shoulder Function; item response theory; rehabilitation.
Abstract: Background and Objective: To test the unidimensionality and local independence of a set of shoulder functional status (SFS) items, develop a computerized adaptive test (CAT) of the items using a rating scale item response theory model (RSM), and compare the discriminant validity of measures generated using all items (θIRT) and measures generated using the simulated CAT (θCAT). Study Design and Setting: We performed a secondary analysis of data collected prospectively during rehabilitation of 400 patients with shoulder impairments who completed 60 SFS items. Results: Factor analytic techniques supported that 42 SFS items formed a unidimensional scale and were locally independent. Except for five items, which were deleted, the RSM fit the data well. The remaining 37 SFS items were used to generate the CAT. On average, 6 items were needed to estimate precise measures of function using the SFS CAT, compared with all 37 SFS items. The θIRT and θCAT measures were highly correlated (r = .96) and resulted in similar classifications of patients. Conclusion: The simulated SFS CAT was efficient and produced precise, clinically relevant measures of functional status with good discriminating ability.
URL: http://iacat.org/content/simulated-computerized-adaptive-test-patients-shoulder-impairments-was-efficient-and-0

Li, Y. H., & Schafer, W. D. (2005). Increasing the homogeneity of CAT's item-exposure rates by minimizing or maximizing varied target functions while assembling shadow tests. Vol. 42, pp. 245-269. ISSN 0022-0655.
Keywords: algorithm; computerized adaptive testing; item exposure rate; shadow test; varied target function.
Abstract: A computerized adaptive testing (CAT) algorithm that has the potential to increase the homogeneity of CAT's item-exposure rates without significantly sacrificing the precision of ability estimates was proposed and assessed in the shadow-test (van der Linden & Reese, 1998) CAT context. This CAT algorithm was formed by a combination of maximizing or minimizing varied target functions while assembling shadow tests. There were four target functions to be separately used in the first, second, third, and fourth quarter of the CAT. The elements to be used in the four functions were associated with (a) a random number assigned to each item, (b) the absolute difference between an examinee's current ability estimate and an item difficulty, (c) the absolute difference between an examinee's current ability estimate and an optimum item difficulty, and (d) item information. The results indicated that this combined CAT fully utilized all the items in the pool, reduced the maximum exposure rates, and achieved more homogeneous exposure rates. Moreover, its precision in recovering ability estimates was similar to that of the maximum item-information method. The combined CAT method resulted in the best overall results compared with the other individual CAT item-selection methods. The findings from the combined CAT are encouraging. Future uses are discussed.
URL: http://iacat.org/content/increasing-homogeneity-cats-item-exposure-rates-minimizing-or-maximizing-varied-target

Lai, J.-S., Dineen, K., Reeve, B. B., Von Roenn, J., Shervin, D., McGuire, M., Bode, R. K., Paice, J., & Cella, D. (2005). An item response theory-based pain item bank can enhance measurement precision. Vol. 30, pp. 278-288.
Keywords: computerized adaptive testing.
Abstract: Cancer-related pain is often under-recognized and undertreated. This is partly due to the lack of appropriate assessments, which need to be comprehensive and precise yet easily integrated into clinics. Computerized adaptive testing (CAT) can enable precise-yet-brief assessments by only selecting the most informative items from a calibrated item bank. The purpose of this study was to create such a bank. The sample included 400 cancer patients who were asked to complete 61 pain-related items. Data were analyzed using factor analysis and the Rasch model. The final bank consisted of 43 items which satisfied the measurement requirements of factor analysis and the Rasch model, demonstrated high internal consistency and reasonable item-total correlations, and discriminated patients with differing degrees of pain. We conclude that this bank demonstrates good psychometric properties, is sensitive to pain reported by patients, and can be used as the foundation for a CAT pain-testing platform for use in clinical practice.
URL: http://iacat.org/content/item-response-theory-based-pain-item-bank-can-enhance-measurement-precision

Fries, J. F., Bruce, B., & Cella, D. (2005). The promise of PROMIS: Using item response theory to improve assessment of patient-reported outcomes. Vol. 23, pp. S53-57.
Keywords: computerized adaptive testing.
Abstract: PROMIS (Patient-Reported Outcomes Measurement Information System) is an NIH Roadmap network project intended to improve the reliability, validity, and precision of PROs and to provide definitive new instruments that will exceed the capabilities of classic instruments and enable improved outcome measurement for clinical research across all NIH institutes. Item response theory (IRT) measurement models now permit us to transition conventional health status assessment into an era of item banking and computerized adaptive testing (CAT). Item banking uses IRT measurement models and methods to develop item banks from large pools of items from many available questionnaires. IRT allows the reduction and improvement of items and assembles domains of items which are unidimensional and not excessively redundant. CAT provides a model-driven algorithm and software to iteratively select the most informative remaining item in a domain until a desired degree of precision is obtained. Through these approaches the number of patients required for a clinical trial may be reduced while holding statistical power constant. PROMIS tools, expected to improve precision and enable assessment at the individual patient level, which should broaden the appeal of PROs, will begin to be available to the general medical community in 2008.
URL: http://iacat.org/content/promise-promis-using-item-response-theory-improve-assessment-patient-reported-outcomes

Armstrong, R. D., Jones, D. H., Koppel, N. B., & Pashley, P. J. (2004). Computerized adaptive testing with multiple-form structures. Vol. 28, pp. 147-164. ISSN 0146-6216.
Keywords: computerized adaptive testing; Law School Admission Test; multiple-form structure; testlets.
Abstract: A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee's progression through the network of testlets is dictated by the correctness of the examinee's answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible test forms, allowing test specialists the opportunity to review them before they are administered. Also, limiting the exposure of an individual MFS to a specific period of time can enhance test security. This article provides an overview of methods that have been developed to generate parallel MFSs. The approach is applied to the assembly of an experimental computerized Law School Admission Test (LSAT).
URL: http://iacat.org/content/computerized-adaptive-testing-multiple-form-structures

van der Linden, W. J., & Veldkamp, B. P. (2004). Constraining item exposure in computerized adaptive testing with shadow tests. Vol. 29, pp. 273-291. ISSN 1076-9986.
Keywords: computerized adaptive testing; item exposure control; item ineligibility constraints; probability; shadow tests.
Abstract: Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter's (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require time-consuming simulation studies to set values for control parameters before the operational use of the test. Instead, it can set the probabilities of item ineligibility adaptively during the test using the actual item-exposure rates. An empirical study using an item pool from the Law School Admission Test showed that application of the method yielded perfect control of the item-exposure rates and had negligible impact on the bias and mean-squared error functions of the ability estimator.
URL: http://iacat.org/content/constraining-item-exposure-computerized-adaptive-testing-shadow-tests

Lilley, M., Barker, T., & Britton, C. (2004). The development and evaluation of a software prototype for computer-adaptive testing. Vol. 43, pp. 109-123.
Keywords: computerized adaptive testing.
URL: http://iacat.org/content/development-and-evaluation-software-prototype-computer-adaptive-testing

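In the van der Linden and Veldkamp entry above, exposure control works by making over-exposed items probabilistically ineligible, with eligibility probabilities adapted from the exposure rates actually observed during operation rather than fixed in advance by simulation. A simplified sketch of that bookkeeping; in the full method the ineligibility decisions enter the shadow-test assembly model as constraints, and the update rule below is an illustration in the spirit of the method, not the exact rule of the article:

```python
import numpy as np

rng = np.random.default_rng(11)

class EligibilityControl:
    """Probabilistic item-ineligibility bookkeeping (simplified sketch)."""

    def __init__(self, n_items, r_max=0.25):
        self.p_elig = np.ones(n_items)      # start with everything eligible
        self.exposures = np.zeros(n_items)
        self.n_examinees = 0
        self.r_max = r_max                  # target maximum exposure rate

    def draw_eligible(self):
        """Eligibility experiment run before an examinee's test; the CAT
        (or shadow-test model) may then only use items flagged True."""
        return rng.random(len(self.p_elig)) <= self.p_elig

    def update(self, administered):
        """Adapt eligibility probabilities from the actual exposure rates."""
        self.n_examinees += 1
        self.exposures[list(administered)] += 1
        rates = self.exposures / self.n_examinees
        boost = self.r_max / np.maximum(rates, 1e-12)   # <1 for over-exposed items
        self.p_elig = np.minimum(1.0, self.p_elig * boost)
```

Because the probabilities are driven by running exposure rates, the control is self-correcting in operation, which is what removes the pre-operational simulation burden of Sympson-Hetter-style calibration.
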
Chen, S.-Y., & Ankenmann, R. D. (2004). Effects of practical constraints on item selection rules at the early stages of computerized adaptive testing. Vol. 41, pp. 149-174. ISSN 0022-0655.
Keywords: computerized adaptive testing; item selection rules; practical constraints.
Abstract: The purpose of this study was to compare the effects of four item selection rules--(1) Fisher information (F), (2) Fisher information with a posterior distribution (FP), (3) Kullback-Leibler information with a posterior distribution (KP), and (4) completely randomized item selection (RN)--with respect to the precision of trait estimation and the extent of item usage at the early stages of computerized adaptive testing. The comparison of the four item selection rules was carried out under three conditions: (1) using only the item information function as the item selection criterion; (2) using both the item information function and content balancing; and (3) using the item information function, content balancing, and item exposure control. When test length was less than 10 items, FP and KP tended to outperform F at extreme trait levels in Condition 1. However, in more realistic settings, it could not be concluded that FP and KP outperformed F, especially when item exposure control was imposed. When test length was greater than 10 items, the three nonrandom item selection procedures performed similarly no matter what the condition was, while F had slightly higher item usage.
URL: http://iacat.org/content/effects-practical-constraints-item-selection-rules-early-stages-computerized-adaptive

Jiao, H., Wang, S., & Lau, C. A. (2004). An investigation of two combination procedures of SPRT for three-category classification decisions in computerized classification test. Paper presented in San Antonio, TX, April 2004.
Keywords: computerized adaptive testing; computerized classification testing; sequential probability ratio testing.
URL: http://iacat.org/content/investigation-two-combination-procedures-sprt-three-category-classification-decisions

Conejo, R., Guzmán, E., Millán, E., Trella, M., Pérez-De-La-Cruz, J. L., & Ríos, A. (2004). Siette: A web-based tool for adaptive testing. Vol. 14, pp. 29-61.
Keywords: computerized adaptive testing.
URL: http://iacat.org/content/siette-web-based-tool-adaptive-testing

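The SPRT entries above (Thompson's practitioner's guide and the Jiao, Wang, and Lau paper) terminate a classification test by comparing the likelihood ratio of two trait points on either side of a cutscore against Wald's bounds; a three-category decision combines two such tests at two cutscores. A minimal two-category sketch under the 2PL, with illustrative delta, alpha, and beta:

```python
import numpy as np

def sprt_decision(us, a, b, cut=0.0, delta=0.3, alpha=0.05, beta=0.05):
    """Wald's SPRT for a pass/fail decision under the 2PL: compare the
    likelihoods at theta = cut + delta and theta = cut - delta for the
    responses us to administered items with parameters a, b."""
    us = np.asarray(us, dtype=float)

    def loglik(theta):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return np.sum(us * np.log(p) + (1.0 - us) * np.log(1.0 - p))

    llr = loglik(cut + delta) - loglik(cut - delta)
    if llr >= np.log((1.0 - beta) / alpha):
        return "pass"
    if llr <= np.log(beta / (1.0 - alpha)):
        return "fail"
    return "continue"                   # administer another item
```

Because the bounds depend only on the nominal error rates, the test is variable-length by construction: confident examinees classify early, and only borderline examinees see many items.
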
Davis, L. L. (2004). Strategies for controlling item exposure in computerized adaptive testing with the generalized partial credit model. Vol. 28, pp. 165-185. ISSN 0146-6216.
Keywords: computerized adaptive testing; generalized partial credit model; item exposure.
Abstract: Choosing a strategy for controlling item exposure has become an integral part of test development for computerized adaptive testing (CAT). This study investigated the performance of six procedures for controlling item exposure in a series of simulated CATs under the generalized partial credit model. In addition to a no-exposure-control baseline condition, the randomesque, modified-within-.10-logits, Sympson-Hetter, conditional Sympson-Hetter, a-stratified with multiple stratification, and enhanced a-stratified with multiple stratification procedures were implemented to control exposure rates. Two variations of the randomesque and modified-within-.10-logits procedures were examined, which varied the size of the item group from which the next item to be administered was randomly selected. The results indicate that although the conditional Sympson-Hetter provides somewhat lower maximum exposure rates, the randomesque and modified-within-.10-logits procedures with the six-item group variation have great utility for controlling overlap rates and increasing pool utilization and should be given further consideration.
URL: http://iacat.org/content/strategies-controlling-item-exposure-computerized-adaptive-testing-generalized-partial

Glas, C. A. W., & van der Linden, W. J. (2003). Computerized adaptive testing with item cloning. Vol. 27, pp. 247-261.
Keywords: computerized adaptive testing.
Abstract: To increase the number of items available for adaptive testing and reduce the cost of item writing, the use of techniques of item cloning has been proposed. An important consequence of item cloning is possible variability between the item parameters. To deal with this variability, a multilevel item response (IRT) model is presented which allows for differences between the distributions of item parameters of families of item clones. A marginal maximum likelihood and a Bayesian procedure for estimating the hyperparameters are presented. In addition, an item-selection procedure for computerized adaptive testing with item cloning is presented which has the following two stages: First, a family of item clones is selected to be optimal at the estimate of the person parameter. Second, an item is randomly selected from the family for administration. Results from simulation studies based on an item pool from the Law School Admission Test (LSAT) illustrate the accuracy of these item pool calibration and adaptive testing procedures.
URL: http://iacat.org/content/computerized-adaptive-testing-item-cloning

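Among the procedures in the Davis entry, the randomesque method controls exposure by picking at random among the k most informative unused items. A sketch using 2PL information and the six-item group size discussed above; under the generalized partial credit model the information function would be the polytomous one, which this sketch does not implement:

```python
import numpy as np

rng = np.random.default_rng(3)

def randomesque_pick(theta_hat, a, b, used, k=6):
    """Select at random among the k most informative unused items
    (a six-item group, as in the variation discussed above)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta_hat - b)))
    info = a**2 * p * (1.0 - p)         # 2PL information; GPCM info would differ
    info[list(used)] = -np.inf
    top_k = np.argsort(info)[-k:]
    return int(rng.choice(top_k))
```

The appeal of the method is that it needs no pre-calibrated control parameters: spreading each selection across a small information-equivalent group directly lowers overlap rates and raises pool utilization.
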
Leung, C.-K., Chang, H.-H., & Hau, K.-T. (2003). Incorporation of Content Balancing Requirements in Stratification Designs for Computerized Adaptive Testing. Vol. 63, pp. 257-270.
Keywords: computerized adaptive testing.
Abstract: Studied three stratification designs for computerized adaptive testing in conjunction with three well-developed content balancing methods. Simulation study results show substantial differences in item overlap rate and pool utilization among the different methods. Recommends an optimal combination of stratification design and content balancing method.
URL: http://iacat.org/content/incorporation-content-balancing-requirements-stratification-designs-computerized-adaptive

Veldkamp, B. P., Okada, A., Shigenasu, K., Kano, Y., & Meulman, J. (2003). Item selection in polytomous CAT. Tokyo, Japan: Psychometric Society / Springer, pp. 207-214.
Keywords: computerized adaptive testing.
URL: http://iacat.org/content/item-selection-polytomous-cat

van der Linden, W. J., & Krimpen-Stoop, E. M. L. A. (2003). Using response times to detect aberrant responses in computerized adaptive testing. Vol. 68, pp. 251-265.
Keywords: adaptive testing; behavior; computer-assisted testing; computerized adaptive testing; models; person fit; prediction; reaction time.
Abstract: A lognormal model for response times is used to check response times for aberrances in examinee behavior on computerized adaptive tests. Both classical procedures and Bayesian posterior predictive checks are presented. For a fixed examinee, responses and response times are independent; checks based on response times thus offer information independent of the results of checks on response patterns. Empirical examples of the use of classical and Bayesian checks for detecting two different types of aberrances in response times are presented. The detection rates for the Bayesian checks outperformed those for the classical checks, but at the cost of higher false-alarm rates. A guideline for the choice between the two types of checks is offered.
URL: http://iacat.org/content/using-response-times-detect-aberrant-responses-computerized-adaptive-testing

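The van der Linden and Krimpen-Stoop entry checks response times against a lognormal model. One classical check standardizes each log response time and refers the sum of squared residuals to a chi-square distribution; the parameterization below (log T_i ~ Normal(beta_i - tau, sigma_i^2), with item time intensity beta, person speed tau, and item spread sigma) is one common form assumed here for illustration, and the article's Bayesian posterior predictive checks are not sketched:

```python
import numpy as np
from scipy import stats

def rt_residual_check(times, beta, sigma, tau_hat, alpha=0.01):
    """Classical aberrance check on log response times: under the assumed
    lognormal model the standardized residuals z_i are N(0, 1), so
    sum(z_i^2) is referred to a chi-square with one df per item."""
    z = (np.log(times) - (beta - tau_hat)) / sigma
    q = float(np.sum(z**2))
    p_value = float(stats.chi2.sf(q, df=len(times)))
    return z, q, p_value < alpha        # residuals, statistic, aberrance flag
```

Because responses and response times are independent for a fixed examinee under this model, a time-based flag adds evidence that a response-pattern person-fit check cannot provide on its own.
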
(PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aTonidandel, S uhttp://iacat.org/content/computer-adaptive-testing-impact-test-characteristics-perceived-performance-and-test-takers00685nas a2200145 4500008004100000245003400041210003400075300001100109490000700120520029200127653003400419100001200453700001500465856005900480 2002 eng d00aComputerised adaptive testing0 aComputerised adaptive testing a619-220 v333 aConsiders the potential of computer adaptive testing (CAT). Discusses the use of CAT instead of traditional paper and pencil tests, identifies decisions that impact the efficacy of CAT, and concludes that CAT is beneficial when used to its full potential on certain types of tests. (LRW)10acomputerized adaptive testing1 aLatu, E1 aChapman, E uhttp://iacat.org/content/computerised-adaptive-testing02773nas a2200133 4500008004100000245009800041210006900139300000900208490000700217520225500224653003402479100001602513856011002529 2002 eng d00aThe effect of test characteristics on aberrant response patterns in computer adaptive testing0 aeffect of test characteristics on aberrant response patterns in a33630 v623 aThe advantages that computer adaptive testing offers over linear tests have been well documented. The Computer Adaptive Test (CAT) design is more efficient than the linear test design, as fewer items are needed to estimate an examinee's proficiency to a desired level of precision. In the ideal situation, a CAT will result in examinees answering different numbers of items according to the stopping rule employed. Unfortunately, the realities of testing conditions have necessitated the imposition of time and minimum test length limits on CATs. Such constraints might place a burden on the CAT test taker, resulting in aberrant response behaviors by some examinees. Occurrence of such response patterns results in inaccurate estimation of examinee proficiency levels. This study examined the effects of test lengths, time limits, and the interaction of these factors with examinee proficiency levels on the occurrence of aberrant response patterns. The focus of the study was on the aberrant behaviors caused by rushed guessing due to restrictive time limits. Four different testing scenarios were examined: fixed-length performance tests with and without content constraints, fixed-length mastery tests, and variable-length mastery tests without content constraints. For each of these testing scenarios, the effects of two test lengths, five different timing conditions, and the interaction between these factors and three ability levels on ability estimation were examined. For fixed- and variable-length mastery tests, decision accuracy was also examined in addition to estimation accuracy. Several indices were used to evaluate the estimation and decision accuracy for different testing conditions. The results showed that changing time limits had a significant impact on the occurrence of aberrant response patterns conditional on ability. Increasing test length had a negligible, if not negative, effect on ability estimation when rushed guessing occurred. In performance testing, high-ability examinees suffered the most, while in classification testing, middle-ability examinees did. The decision accuracy was considerably affected in the case of variable-length classification tests.
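The rushed-guessing mechanism studied in the Rizavi (2002) record above is easy to mimic in simulation. The sketch below is a hedged illustration only: the per-item time cost, the time budget, and the guessing probability are invented for the example, not taken from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(42)

def p3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def simulate_rushed(theta, items, time_limit, sec_per_item=60.0, guess=0.2):
    """Answer items normally until the time budget runs out, then guess
    at random -- the aberrance mechanism examined in the study above."""
    responses, spent = [], 0.0
    for a, b, c in items:
        spent += sec_per_item
        if spent <= time_limit:
            responses.append(rng.random() < p3pl(theta, a, b, c))
        else:
            responses.append(rng.random() < guess)  # rushed guessing
    return np.array(responses, dtype=int)

items = [(1.2, 0.0, 0.2)] * 40                  # hypothetical 40-item CAT
print(simulate_rushed(theta=0.5, items=items, time_limit=30 * 60).sum())
```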
(PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aRizavi, S M uhttp://iacat.org/content/effect-test-characteristics-aberrant-response-patterns-computer-adaptive-testing00524nas a2200109 4500008004100000245010600041210006900147260002500216653003400241100001900275856012000294 2002 eng d00aAn empirical comparison of achievement level estimates from adaptive tests and paper-and-pencil tests0 aempirical comparison of achievement level estimates from adaptiv aNew Orleans, LA. USA10acomputerized adaptive testing1 aKingsbury, G G uhttp://iacat.org/content/empirical-comparison-achievement-level-estimates-adaptive-tests-and-paper-and-pencil-tests01306nas a2200169 4500008004100000245009500041210006900136300001200205490000700217520070700224653003400931100001400965700001600979700001600995700001701011856010801028 2002 eng d00aEvaluation of selection procedures for computerized adaptive testing with polytomous items0 aEvaluation of selection procedures for computerized adaptive tes a393-4110 v263 aIn the present study, a procedure that has been used to select dichotomous items in computerized adaptive testing was applied to polytomous items. This procedure was designed to select the item with maximum weighted information. In a simulation study, the item information function was integrated over a fixed interval of ability values and the item with the maximum area was selected. This maximum interval information item selection procedure was compared to a maximum point information item selection procedure. Substantial differences between the two item selection procedures were not found when computerized adaptive tests were evaluated on bias and the root mean square of the ability estimate. 10acomputerized adaptive testing1 aRijn, P W1 aEggen, Theo1 aHemker, B T1 aSanders, P F uhttp://iacat.org/content/evaluation-selection-procedures-computerized-adaptive-testing-polytomous-items02787nas a2200133 4500008004100000245010200041210006900143300000900212490000700221520226500228653003402493100002002527856010602547 2002 eng d00aThe implications of the use of non-optimal items in a Computer Adaptive Testing (CAT) environment0 aimplications of the use of nonoptimal items in a Computer Adapti a16060 v633 aThis study describes the effects of manipulating item difficulty in a computer adaptive testing (CAT) environment. There are many potential benefits when using CATs as compared to traditional tests. These include increased security, shorter tests, and more precise measurement. According to IRT, the theory underlying CAT, the computer continually recalculates ability, and items that match the current estimate of ability are administered. Such items provide maximum information about examinees during the test. Herein, however, lies a potential problem. These optimal CAT items result in an examinee having only a 50% chance of a correct response. Some examinees may consider such items unduly challenging. Further, when test anxiety is a factor, it is possible that test scores may be negatively affected. This research was undertaken to determine the effects of administering easier CAT items on ability estimation and test length using computer simulations. Also considered was the administration of different numbers of initial items prior to the start of the adaptive portion of the test, using three different levels of measurement precision.
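The maximum interval information rule in the Rijn, Eggen, Hemker, and Sanders (2002) record above integrates each item's information function over a fixed ability interval and selects the item with the largest area. A dichotomous 2PL stand-in is used below for brevity (the study itself used polytomous items); the interval half-width and grid size are assumptions of the sketch.

```python
import numpy as np

def info_2pl(theta, a, b):
    """2PL item information, vectorized over a theta grid."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def max_interval_info(theta_hat, a, b, half_width=0.5, n_grid=51):
    """Select the item whose information, averaged over a fixed interval
    around the current theta estimate, is largest (a grid average stands
    in for the integral)."""
    grid = np.linspace(theta_hat - half_width, theta_hat + half_width, n_grid)
    areas = [info_2pl(grid, ai, bi).mean() * 2 * half_width
             for ai, bi in zip(a, b)]
    return int(np.argmax(areas))

a = np.array([0.9, 1.4, 1.1]); b = np.array([-0.2, 0.4, 0.1])
print(max_interval_info(0.0, a, b))   # compare with point information at theta_hat
```

Setting half_width to zero (conceptually) recovers the maximum point information rule the study used as its comparison.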
Results indicate that regardless of the number of initial items administered, the level of precision employed, or the modifications made to item difficulty, the approximation of estimated ability to true ability is good in all cases. Additionally, the standard deviations of the ability estimates closely approximate the theoretical levels of precision used as stopping rules for the simulated CATs. Since optimal CAT items are not used, each item administered provides less information about examinees than optimal CAT items. This results in longer tests. Fortunately, using easier items that provide up to a 66.4% chance of a correct response results in tests that only modestly increase in length, across levels of precision. For larger standard errors, even easier items (up to a 73.5% chance of a correct response) result in only negligible to modest increases in test length. Examinees who find optimal CAT items difficult or examinees with test anxiety may find CATs that implement easier items enhance the already existing benefits of CAT. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aGrodenchik, D J uhttp://iacat.org/content/implications-use-non-optimal-items-computer-adaptive-testing-cat-environment01179nas a2200133 4500008004100000245006200041210005900103300001200162490000700174520073400181653003400915100001600949856008000965 2002 eng d00aAn item response model for characterizing test compromise0 aitem response model for characterizing test compromise a163-1790 v273 aThis article presents an item response model for characterizing test-compromise that enables the estimation of item-preview and score-gain distributions observed in on-demand high-stakes testing programs. Model parameters and posterior distributions are estimated by Markov Chain Monte Carlo (MCMC) procedures. Results of a simulation study suggest that when at least some of the items taken by a small sample of test takers are known to be secure (uncompromised), the procedure can provide useful summaries of test-compromise and its impact on test scores. The article includes discussions of operational use of the proposed procedure, possible model violations and extensions, and application to computerized adaptive testing. 10acomputerized adaptive testing1 aSegall, D O uhttp://iacat.org/content/item-response-model-characterizing-test-compromise01627nas a2200241 4500008004100000245005900041210005800100300001200158490000700170520087100177653002101048653003401069653002801103653002001131653003201151653002501183653001501208653002701223653002201250653001601272100001601288856008101304 2002 eng d00aOutlier detection in high-stakes certification testing0 aOutlier detection in highstakes certification testing a219-2330 v393 aDiscusses recent developments of person-fit analysis in computerized adaptive testing (CAT). Methods from statistical process control are presented that have been proposed to classify an item score pattern as fitting or misfitting the underlying item response theory model in CAT. Most person-fit research in CAT is restricted to simulated data. In this study, empirical data from a certification test were used. Alternatives are discussed to generate norms so that bounds can be determined to classify an item score pattern as fitting or misfitting.
Using bounds determined from a sample of a high-stakes certification test, the empirical analysis showed that different types of misfit can be distinguished. Further applications using statistical process control methods to detect misfitting item score patterns are discussed. (PsycINFO Database Record (c) 2005 APA)10aAdaptive Testing10acomputerized adaptive testing10aEducational Measurement10aGoodness of Fit10aItem Analysis (Statistical)10aItem Response Theory10aperson Fit10aStatistical Estimation10aStatistical Power10aTest Scores1 aMeijer, R R uhttp://iacat.org/content/outlier-detection-high-stakes-certification-testing02220nas a2200145 4500008004100000245011600041210006900157300001100226490000600237520165200243653003401895100001301929700001601942856011601958 2001 eng d00aAssessment in the twenty-first century: A role of computerised adaptive testing in national curriculum subjects0 aAssessment in the twentyfirst century A role of computerised ada a241-570 v53 aWith the investment of large sums of money in new technologies for schools and education authorities and the subsequent training of teachers to integrate Information and Communications Technology (ICT) into their teaching strategies, it is remarkable that the old outdated models of assessment still remain. This article highlights the current problems associated with pen-and-paper testing and offers suggestions for an innovative new approach to assessment for the twenty-first century. Based on the principle of the 'wise examiner', a computerised adaptive testing system which measures pupils' ability against the levels of the United Kingdom National Curriculum has been developed for use in mathematics. Using constructed response items, pupils are administered a test tailored to their ability with a reliability index of 0.99. Since the software administers maximally informative questions matched to each pupil's current ability estimate, no two pupils will receive the same set of items in the same order, therefore removing opportunities for plagiarism and teaching to the test. All marking is automated and a journal recording the outcome of the test and highlighting the areas of difficulty for each pupil is available for printing by the teacher. The current prototype of the system can be used on a school's network; however, the authors envisage a day when Examination Boards or the Qualifications and Assessment Authority (QCA) will administer Government tests from a central server to all United Kingdom schools or testing centres. Results will be issued at the time of testing and opportunities for resits will become more widespread.10acomputerized adaptive testing1 aCowan, P1 aMorrison, H uhttp://iacat.org/content/assessment-twenty-first-century-role-computerised-adaptive-testing-national-curriculum00838nas a2200157 4500008004100000245007400041210006900115300001100184490000700195520030900202653003400511100001900545700001200564700001200576856009200588 2001 eng d00aa-stratified multistage computerized adaptive testing with b blocking0 aastratified multistage computerized adaptive testing with b bloc a333-410 v253 aProposed a refinement, based on the stratification of items developed by D. Weiss (1973), of the computerized adaptive testing item selection procedure of H. Chang and Z. Ying (1999). Simulation studies using an item bank from the Graduate Record Examination show the benefits of the new procedure.
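One statistical process control statistic used in the person-fit literature referenced by the Meijer (2002) outlier-detection record above is a CUSUM on item score residuals; an examinee is flagged when the cumulative sum drifts past bounds derived from a norm group. The sketch below is a generic CUSUM, not the paper's exact statistic, and the threshold is a placeholder for empirically determined bounds.

```python
import numpy as np

def cusum_person_fit(responses, p_expected, threshold=1.0):
    """Track upward and downward CUSUMs of the residuals u_k - P_k(theta);
    flag the pattern as misfitting when either sum crosses the bound."""
    resid = np.asarray(responses, float) - np.asarray(p_expected, float)
    c_plus = c_minus = 0.0
    for r in resid:
        c_plus = max(0.0, c_plus + r)
        c_minus = min(0.0, c_minus + r)
        if c_plus > threshold or c_minus < -threshold:
            return True   # misfitting item score pattern
    return False

# A run of unexpected incorrect answers on 50/50 items trips the bound.
print(cusum_person_fit([1, 0, 0, 0, 0, 1], [0.5] * 6))
```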
(SLD)10acomputerized adaptive testing1 aChang, Hua-Hua1 aQian, J1 aYang, Z uhttp://iacat.org/content/stratified-multistage-computerized-adaptive-testing-b-blocking00678nas a2200133 4500008004100000245001800041210001700059300001000076490000800086520036100094653003400455100001300489856004200502 2001 eng d00aFinal answer?0 aFinal answer a24-260 v1883 aThe Northwest Evaluation Association helped an Indiana school district develop a computerized adaptive testing system that was aligned with its curriculum and geared toward measuring individual student growth. Now the district can obtain such information from semester to semester and year to year, get immediate results, and test students on demand. (MLH)10acomputerized adaptive testing1 aCoyle, J uhttp://iacat.org/content/final-answer03148nas a2200133 4500008004100000245007900041210006900120300000900189490000700198520266000205653003402865100001502899856010002914 2001 eng d00aMultidimensional adaptive testing using the weighted likelihood estimation0 aMultidimensional adaptive testing using the weighted likelihood a47460 v613 aThis study extended Warm's (1989) weighted likelihood estimation (WLE) to a multidimensional computerized adaptive test (MCAT) setting. WLE was compared with maximum likelihood estimation (MLE), expected a posteriori (EAP), and maximum a posteriori (MAP) estimation using a three-dimensional 3PL IRT model under a variety of computerized adaptive testing conditions. The dependent variables included bias, standard error of ability estimates (SE), square root of mean square error (RMSE), and test information. The independent variables were ability estimation methods, intercorrelation levels between dimensions, multidimensional structures, and ability combinations. Simulation results were presented with descriptive statistics in figures and tables. In addition, inferential procedures were used to analyze bias by conceptualizing this Monte Carlo study as a statistical sampling experiment. The results of this study indicate that WLE and the other three estimation methods yield significantly more accurate ability estimates under an approximate simple test structure with one dominant dimension and several secondary dimensions. All four estimation methods, especially WLE, yield very large SEs when a multidimensional structure with three equally dominant dimensions was employed. Consistent with previous findings based on unidimensional IRT models, MLE and WLE are less biased at the extremes of the ability scale; MLE and WLE yield larger SEs than the Bayesian methods; test information-based SEs underestimate the actual SEs of MLE and WLE in MCAT situations, especially at shorter test lengths, similar to the findings of Warm (1989) in the unidimensional case; and WLE reduced the bias of MLE under the approximate simple structure. The results from the MCAT simulations did show some advantages of WLE in reducing the bias of MLE under the approximate simple structure with a fixed test length of 50 items, which was consistent with the previous research findings based on different unidimensional models. It is clear from the current results that all four methods perform very poorly when multidimensional structures with multiple dominant factors were employed. More research efforts are urged to investigate systematically how different multidimensional structures affect the accuracy and reliability of ability estimation.
Based on the simulated results in this study, no significant effect of the intercorrelation between dimensions on ability estimation was found. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aTseng, F-L uhttp://iacat.org/content/multidimensional-adaptive-testing-using-weighted-likelihood-estimation02333nas a2200157 4500008004100000020001400041245019300055210006900248300001200317490000700329520164900336653003401985100001402019700001502033856012702048 2001 eng d a0214-991500aPasado, presente y futuro de los test adaptativos informatizados: Entrevista con Isaac I. Béjar [Past, present and future of computerized adaptive testing: Interview with Isaac I. Béjar]0 aPasado presente y futuro de los test adaptativos informatizados a685-6900 v133 aPast, present and future of Computerized Adaptive Testing: Interview with Isaac I. Bejar. In this paper the results of an interview with Isaac I. Bejar are presented. Dr. Bejar is currently Principal Research Scientist and Director of the Center for Assessment Design and Scoring in the Research Division at Educational Testing Service (Princeton, NJ, U.S.A.). The aim of this interview was to review the past, present, and future of computerized adaptive tests. The beginnings of adaptive tests and computerized adaptive tests, and the latest advances developed at the Educational Testing Service (generative response models, isomorphs, automated scoring of essay items…), are reviewed. The interview closes with a view of the future of computerized adaptive tests and their utilization in Spain.10acomputerized adaptive testing1 aTejada, R1 aAntonio, J uhttp://iacat.org/content/pasado-presente-y-futuro-de-los-test-adaptativos-informatizados-entrevista-con-isaac-i-b%C3%A9jar02438nas a2200169 4500008004100000245012400041210007300165300001000238490000700248520174900255653003402004100002002038700001802058700002202076700002202098856014802120 2000 eng d00aAlgoritmo mixto mínima entropía-máxima información para la selección de ítems en un test adaptativo informatizado [A mixed minimum entropy-maximum information algorithm for item selection in a computerized adaptive test]0 aAlgoritmo mixto mínima entropíamáxima información para la selecc a12-140 v123 aThe aim of this study is to compare the efficacy of three different item selection algorithms: (a) one based on maximum information; (b) one based on minimum entropy; and (c) a mixed algorithm that uses minimum entropy for the initial items and maximum information for the rest, under the hypothesis that the mixed algorithm can make the CAT more efficient.
The CAT processes were simulated on a bank of 28 graded-response items (an emotional adjustment item bank with six response categories) calibrated according to Samejima's (1969) Graded Response Model, taking as the CAT responses the original responses of the subjects who were used for the calibration. The initial results show that the mixed criterion is more efficient than either of the other two taken independently. This efficiency is maximized when the minimum-entropy algorithm is restricted to the selection of the first items of the CAT, since with the responses to these first items the estimation of $\theta$ begins to be relevant and the maximum-information algorithm is optimized.10acomputerized adaptive testing1 aDorronsoro, J R1 aSanta-Cruz, C1 aRubio Franco, V J1 aAguado García, D uhttp://iacat.org/content/algoritmo-mixto-m%C3%ADnima-entrop%C3%ADa-m%C3%A1xima-informaci%C3%B3n-para-la-selecci%C3%B3n-de-%C3%ADtems-en-un-test01256nas a2200145 4500008004100000245006500041210006500106300001000171490000700181520076500188653003400953100002300987700001601010856008401026 2000 eng d00aCapitalization on item calibration error in adaptive testing0 aCapitalization on item calibration error in adaptive testing a35-530 v133 a(from the journal abstract) In adaptive testing, item selection is sequentially optimized during the test. Because the optimization takes place over a pool of items calibrated with estimation error, capitalization on chance is likely to occur. How serious the consequences of this phenomenon are depends not only on the distribution of the estimation errors in the pool or the conditional ratio of the test length to the pool size given ability, but may also depend on the structure of the item selection criterion used. A simulation study demonstrated a dramatic impact of capitalization on estimation errors on ability estimation. Four different strategies to minimize the likelihood of capitalization on error in computerized adaptive testing are discussed.10acomputerized adaptive testing1 avan der Linden, WJ1 aGlas, C A W uhttp://iacat.org/content/capitalization-item-calibration-error-adaptive-testing02650nas a2200133 4500008004100000245007300041210006900114300000900183490000700192520217300199653003402372100001702406856009302423 2000 eng d00aA comparison of computerized adaptive testing and multistage testing0 acomparison of computerized adaptive testing and multistage testi a58290 v603 aThere is considerable evidence to show that computerized-adaptive testing (CAT) and multi-stage testing (MST) are viable frameworks for testing. With many testing organizations looking to move towards CAT or MST, it is important to know what framework is superior in different situations and at what cost in terms of measurement. What was needed was a comparison of the different testing procedures under various realistic testing conditions.
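The mixed rule in the Dorronsoro, Santa-Cruz, Rubio Franco, and Aguado García (2000) record above can be sketched with a dichotomous stand-in (the study used graded items): minimum expected posterior entropy for the first k items, maximum Fisher information afterwards. The grid, prior, and switch point k are assumptions of this illustration.

```python
import numpy as np

grid = np.linspace(-4, 4, 81)
prior = np.exp(-grid**2 / 2); prior /= prior.sum()   # standard normal prior

def p2pl(theta, a, b):
    return 1 / (1 + np.exp(-a * (theta - b)))

def expected_entropy(post, p_item):
    """Expected posterior entropy after administering one more item."""
    ent = 0.0
    for u in (1, 0):
        like = p_item if u else 1 - p_item
        m = float((post * like).sum())               # marginal prob of u
        new = post * like / m
        ent += m * float(-(new * np.log(new + 1e-12)).sum())
    return ent

def pick(post, a, b, used, n_given, k=3):
    cand = [i for i in range(len(a)) if i not in used]
    if n_given < k:   # minimum-entropy phase for the first k items
        return min(cand, key=lambda i: expected_entropy(post, p2pl(grid, a[i], b[i])))
    th = float((grid * post).sum())                  # EAP point estimate
    info = lambda i: a[i]**2 * p2pl(th, a[i], b[i]) * (1 - p2pl(th, a[i], b[i]))
    return max(cand, key=info)                       # maximum-information phase

a = np.array([0.8, 1.5, 1.1]); b = np.array([-0.5, 0.2, 0.0])
print(pick(prior.copy(), a, b, used=set(), n_given=0))
```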
This dissertation addressed the important problem of the increase or decrease in accuracy of ability estimation in using MST rather than CAT. The purpose of this study was to compare the accuracy of ability estimates produced by MST and CAT while keeping some variables fixed and varying others. A simulation study was conducted to investigate the effects of several factors on the accuracy of ability estimation using different CAT and MST designs. The factors that were manipulated were the number of stages, the number of subtests per stage, and the number of items per subtest. Kept constant were test length, distribution of subtest information, method of determining cut-points on subtests, amount of overlap between subtests, and method of scoring the total test. The primary question of interest was, given a fixed test length, how many stages and how many subtests per stage should there be to maximize measurement precision? Furthermore, how many items should there be in each subtest? Should there be more in the routing test or should there be more in the higher-stage tests? Results showed that, in general, increasing the number of stages from two to three decreased the amount of error in ability estimation. Increasing the number of subtests from three to five increased the accuracy of ability estimates as well as the efficiency of the MST designs relative to the P&P and CAT designs at most ability levels (-.75 to 2.25). Finally, at most ability levels (-.75 to 2.25), varying the number of items per stage had little effect on either the resulting accuracy of ability estimates or the relative efficiency of the MST designs to the P&P and CAT designs. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aPatsula, L N uhttp://iacat.org/content/comparison-computerized-adaptive-testing-and-multistage-testing00540nas a2200157 4500008004100000245006500041210006300106260002700169490000700196653003400203100001700237700001200254700001200266700001700278856008700295 2000 eng d00aComputer-adaptive testing: A methodology whose time has come0 aComputeradaptive testing A methodology whose time has come aChicago, IL. USAbMESA0 v6910acomputerized adaptive testing1 aLinacre, J M1 aKang, U1 aJean, E1 aLinacre, J M uhttp://iacat.org/content/computer-adaptive-testing-methodology-whose-time-has-come01594nas a2200157 4500008004100000245008200041210006900123300001100192490000700203520101400210653003401224653004001258100001601298700002401314856009801338 2000 eng d00aComputerized adaptive testing for classifying examinees into three categories0 aComputerized adaptive testing for classifying examinees into thr a713-340 v603 aThe objective of this study was to explore the possibilities for using computerized adaptive testing in situations in which examinees are to be classified into one of three categories. Testing algorithms with two different statistical computation procedures are described and evaluated. The first computation procedure is based on statistical testing and the other on statistical estimation. Item selection methods based on maximum information (MI) considering content and exposure control are considered. The measurement quality of the proposed testing algorithms is reported. The results of the study show that a reduction of at least 22% in the mean number of items can be expected in a computerized adaptive test (CAT) compared to an existing paper-and-pencil placement test. Furthermore, statistical testing is a promising alternative to statistical estimation.
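The statistical-testing procedure in the Eggen and Straetmans (2000) record above amounts to running sequential probability ratio tests against two cut points, which place an examinee below, between, or above the cuts, or continue testing. The sketch below assumes 2PL items and invented cut points, indifference width delta, and error rates; it is an illustration of the SPRT idea, not the authors' algorithm.

```python
import numpy as np

def p2pl(theta, a, b):
    return 1 / (1 + np.exp(-a * (theta - b)))

def llr(resp, a, b, cut, delta):
    """Log-likelihood ratio for theta = cut+delta versus theta = cut-delta."""
    u = np.asarray(resp, float); a = np.asarray(a, float); b = np.asarray(b, float)
    p1, p0 = p2pl(cut + delta, a, b), p2pl(cut - delta, a, b)
    return float((u * np.log(p1 / p0) + (1 - u) * np.log((1 - p1) / (1 - p0))).sum())

def classify(resp, a, b, cuts=(-0.5, 0.5), delta=0.2, alpha=0.05, beta=0.05):
    hi, lo = np.log((1 - beta) / alpha), np.log(beta / (1 - alpha))
    t1, t2 = (llr(resp, a, b, c, delta) for c in cuts)
    if t1 <= lo:              return "category 1"   # below both cuts
    if t1 >= hi and t2 <= lo: return "category 2"   # between the cuts
    if t2 >= hi:              return "category 3"   # above both cuts
    return "continue testing"

print(classify([1, 1, 0, 1], a=[1.0] * 4, b=[0.0] * 4))
```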
Finally, it is concluded that imposing constraints on the MI selection strategy does not negatively affect the quality of the testing algorithms.10acomputerized adaptive testing10aComputerized classification testing1 aEggen, Theo1 aStraetmans, G J J M uhttp://iacat.org/content/computerized-adaptive-testing-classifying-examinees-three-categories00688nas a2200169 4500008004100000020001400041245015800055210006900213300001200282490000600294653003400300100001700334700001500351700001200366700001400378856012600392 2000 eng d a1575-910500aLos tests adaptativos informatizados en la frontera del siglo XXI: Una revisión [Computerized adaptive tests at the turn of the 21st century: A review]0 aLos tests adaptativos informatizados en la frontera del siglo XX a183-2160 v210acomputerized adaptive testing1 aHontangas, P1 aPonsoda, V1 aOlea, J1 aAbad, F J uhttp://iacat.org/content/los-tests-adaptativos-informatizados-en-la-frontera-del-siglo-xxi-una-revisi%C3%B3n-computerized01195nas a2200133 4500008004100000245008300041210006900124300001200193490000700205520069300212653003400905100002000939856010200959 2000 eng d00aTaylor approximations to logistic IRT models and their use in adaptive testing0 aTaylor approximations to logistic IRT models and their use in ad a307-3430 v253 aTaylor approximation can be used to generate a linear approximation to a logistic ICC and a linear ability estimator. For a specific situation it will be shown to result in a special case of a Robbins-Monro item selection procedure for adaptive testing. The linear estimator can be used for the situation of zero and perfect scores when maximum likelihood estimation fails to come up with a finite estimate. It is also possible to use this estimator to generate starting values for maximum likelihood and weighted likelihood estimation. Approximations to the expectation and variance of the linear estimator for a sequence of Robbins-Monro item selections can be determined analytically. 10acomputerized adaptive testing1 aVeerkamp, W J J uhttp://iacat.org/content/taylor-approximations-logistic-irt-models-and-their-use-adaptive-testing00508nas a2200121 4500008004100000245009600041210006900137300000900206490000700215653003400222100002300256856010700279 1999 eng d00aAlternative methods for the detection of item preknowledge in computerized adaptive testing0 aAlternative methods for the detection of item preknowledge in co a37650 v5910acomputerized adaptive testing1 aMcLeod, Lori Davis uhttp://iacat.org/content/alternative-methods-detection-item-preknowledge-computerized-adaptive-testing01730nas a2200145 4500008004100000245005800041210005700099300001200156490000700168520126300175653003401438100001901472700001201491856008101503 1999 eng d00aa-stratified multistage computerized adaptive testing0 aastratified multistage computerized adaptive testing a211-2220 v233 aFor computerized adaptive tests (CAT) based on the three-parameter logistic model it was found that administering items with low discrimination parameter (a) values early in the test and administering those with high a values later was advantageous; the skewness of item exposure distributions was reduced while efficiency was maintained in trait level estimation. Thus, a new multistage adaptive testing approach is proposed that factors a into the item selection process. In this approach, the items in the item bank are stratified into a number of levels based on their a values. The early stages of a test use items with lower a values and later stages use items with higher a values.
At each stage, items are selected according to an optimization criterion from the corresponding level. Simulation studies were performed to compare a-stratified CATs with CATs based on the Sympson-Hetter method for controlling item exposure. Results indicated that this new strategy led to tests that were well-balanced, with respect to item exposure, and efficient. The a-stratified CATs achieved a lower average exposure rate than CATs based on Bayesian or information-based item selection and the Sympson-Hetter method. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aChang, Hua-Hua1 aYing, Z uhttp://iacat.org/content/stratified-multistage-computerized-adaptive-testing01272nas a2200145 4500008004100000245004000041210004000081260004600121300001000167520081600177653003400993100002401027700001401051856006101065 1999 eng d00aCAT for certification and licensure0 aCAT for certification and licensure aMahwah, N.J.bLawrence Erlbaum Associates a67-913 a(from the chapter) This chapter discusses implementing computerized adaptive testing (CAT) for high-stakes examinations that determine whether or not a particular candidate will be certified or licensed. The experience of several boards who have chosen to administer their licensure or certification examinations using the principles of CAT illustrates the process of moving into this mode of administration. Examples of the variety of options that can be utilized within a CAT administration are presented, the decisions that boards must make to implement CAT are discussed, and a timetable for completing the tasks that need to be accomplished is provided. In addition to the theoretical aspects of CAT, practical issues and problems are reviewed. (PsycINFO Database Record (c) 2002 APA, all rights reserved).10acomputerized adaptive testing1 aBergstrom, Betty, A1 aLunz, M E uhttp://iacat.org/content/cat-certification-and-licensure00911nas a2200145 4500008004100000245006100041210006000102300001100162490000700173520043400180653003400614100001600648700001600664856008500680 1999 eng d00aComputerized Adaptive Testing: Overview and Introduction0 aComputerized Adaptive Testing Overview and Introduction a187-940 v233 aUse of computerized adaptive testing (CAT) has increased substantially since it was first formulated in the 1970s. This paper provides an overview of CAT and introduces the contributions to this Special Issue. The elements of CAT discussed here include item selection procedures, estimation of the latent trait, item exposure, measurement precision, and item bank development. Some topics for future research are also presented. 10acomputerized adaptive testing1 aMeijer, R R1 aNering, M L uhttp://iacat.org/content/computerized-adaptive-testing-overview-and-introduction01680nas a2200145 4500008004100000245010000041210006900141300001000210490000700220520112900227653003401356100001601390700001501406856011301421 1999 eng d00aThe effect of model misspecification on classification decisions made using a computerized test0 aeffect of model misspecification on classification decisions mad a47-590 v363 aMany computerized testing algorithms require the fitting of some item response theory (IRT) model to examinees' responses to facilitate item selection, the determination of test stopping rules, and classification decisions. Some IRT models are thought to be particularly useful for small volume certification programs that wish to make the transition to computerized adaptive testing (CAT). 
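The a-stratified design in the Chang and Ying (1999) record above partitions the pool into strata by ascending discrimination and works up through them stage by stage. One common instantiation of the within-stratum optimization criterion, assumed here, is to pick the unused item whose difficulty b is closest to the current trait estimate.

```python
import numpy as np

def a_stratified_pick(theta_hat, a, b, used, stage, n_strata=4):
    """a-stratified selection: sort the pool by discrimination a, split it
    into n_strata strata, and at a given stage choose, within that stratum,
    the unused item whose difficulty b is closest to theta_hat."""
    order = np.argsort(a)                       # low-a items first
    strata = np.array_split(order, n_strata)
    stratum = strata[min(stage, n_strata - 1)]
    cand = [i for i in stratum if i not in used]
    return min(cand, key=lambda i: abs(b[i] - theta_hat))

rng = np.random.default_rng(0)
a = rng.uniform(0.4, 2.0, 100); b = rng.normal(0, 1, 100)
print(a_stratified_pick(0.25, a, b, used=set(), stage=0))
```

Reserving high-a items for later stages, when the trait estimate is more stable, is what flattens the exposure distribution while preserving efficiency.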
The 1-parameter logistic model (1-PLM) is usually assumed to require a smaller sample size than the 3-parameter logistic model (3-PLM) for item parameter calibrations. This study examined the effects of model misspecification on the precision of the decisions made using the sequential probability ratio test. For this comparison, the 1-PLM was used to estimate item parameters, even though the items' characteristics were represented by a 3-PLM. Results demonstrate that the 1-PLM produced considerably more decision errors under simulation conditions similar to a real testing environment, compared to the true model and to a fixed-form standard reference set of items. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aKalohn, J C1 aSpray, J A uhttp://iacat.org/content/effect-model-misspecification-classification-decisions-made-using-computerized-test00750nas a2200145 4500008004100000245005500041210005500096300001100151490000700162520028800169653003400457100001600491700001700507856008000524 1999 eng d00aGraphical models and computerized adaptive testing0 aGraphical models and computerized adaptive testing a223-370 v233 aConsiders computerized adaptive testing from the perspective of graphical modeling (GM). GM provides methods for making inferences about multifaceted skills and knowledge and for extracting data from complex performances. Provides examples from language-proficiency assessment. (SLD)10acomputerized adaptive testing1 aAlmond, R G1 aMislevy, R J uhttp://iacat.org/content/graphical-models-and-computerized-adaptive-testing02467nam a2200133 4500008004100000245004300041210004300084260005200127520201600179653003402195100001502229700002402244856006502268 1999 eng d00aInnovations in computerized assessment0 aInnovations in computerized assessment aMahwah, N.J.bLawrence Erlbaum Associates, Inc.3 aChapters in this book present the challenges and dilemmas faced by researchers as they created new computerized assessments, focusing on issues addressed in developing, scoring, and administering the assessments. Chapters are: (1) "Beyond Bells and Whistles; An Introduction to Computerized Assessment" (Julie B. Olson-Buchanan and Fritz Drasgow); (2) "The Development of a Computerized Selection System for Computer Programmers in a Financial Services Company" (Michael J. Zickar, Randall C. Overton, L. Rogers Taylor, and Harvey J. Harms); (3) "Development of the Computerized Adaptive Testing Version of the Armed Services Vocational Aptitude Battery" (Daniel O. Segall and Kathleen E. Moreno); (4) "CAT for Certification and Licensure" (Betty A. Bergstrom and Mary E. Lunz); (5) "Developing Computerized Adaptive Tests for School Children" (G. Gage Kingsbury and Ronald L. Houser); (6) "Development and Introduction of a Computer Adaptive Graduate Record Examinations General Test" (Craig N. Mills); (7) "Computer Assessment Using Visual Stimuli: A Test of Dermatological Skin Disorders" (Terry A. Ackerman, John Evans, Kwang-Seon Park, Claudia Tamassia, and Ronna Turner); (8) "Creating Computerized Adaptive Tests of Music Aptitude: Problems, Solutions, and Future Directions" (Walter P. Vispoel); (9) "Development of an Interactive Video Assessment: Trials and Tribulations" (Fritz Drasgow, Julie B. Olson-Buchanan, and Philip J. Moberg); (10) "Computerized Assessment of Skill for a Highly Technical Job" (Mary Ann Hanson, Walter C. Borman, Henry J. Mogilka, Carol Manning, and Jerry W. 
Hedge); (11) "Easing the Implementation of Behavioral Testing through Computerization" (Wayne A. Burroughs, Janet Murray, S. Scott Wesley, Debra R. Medina, Stacy L. Penn, Steven R. Gordon, and Michael Catello); and (12) "Blood, Sweat, and Tears: Some Final Comments on Computerized Assessment." (Fritz Drasgow and Julie B. Olson-Buchanan). Each chapter contains references. (Contains 17 tables and 21 figures.) (SLD)10acomputerized adaptive testing1 aDrasgow, F1 aOlson-Buchanan, J B uhttp://iacat.org/content/innovations-computerized-assessment01086nas a2200133 4500008004100000245007800041210006900119300001200188490000700200520059200207653003400799100002300833856009600856 1999 eng d00aMultidimensional adaptive testing with a minimum error-variance criterion0 aMultidimensional adaptive testing with a minimum errorvariance c a398-4120 v243 aAdaptive testing under a multidimensional logistic response model is addressed. An algorithm is proposed that minimizes the (asymptotic) variance of the maximum-likelihood estimator of a linear combination of abilities of interest. The criterion results in a closed-form expression that is easy to evaluate. In addition, it is shown how the algorithm can be modified if the interest is in a test with a "simple ability structure". The statistical properties of the adaptive ML estimator are demonstrated for a two-dimensional item pool with several linear combinations of the abilities. 10acomputerized adaptive testing1 avan der Linden, WJ uhttp://iacat.org/content/multidimensional-adaptive-testing-minimum-error-variance-criterion02642nas a2200133 4500008004100000245007300041210006900114300000900183490000700192520216800199653003402367100001602401856009102417 1999 eng d00aOptimal design for item calibration in computerized adaptive testing0 aOptimal design for item calibration in computerized adaptive tes a42200 v593 aItem Response Theory is the psychometric model used for standardized tests such as the Graduate Record Examination. A test-taker's response to an item is modelled as a binary response with success probability depending on parameters for both the test-taker and the item. Two popular models are the two-parameter logistic (2PL) model and the three-parameter logistic (3PL) model. For the 2PL model, the logit of the probability of a correct response equals $a_i(\theta_j - b_i)$, where $a_i$ and $b_i$ are item parameters, while $\theta_j$ is the test-taker's parameter, known as "proficiency." The 3PL model adds a nonzero left asymptote to model random response behavior by low-$\theta$ test-takers. Assigning scores to students requires accurate estimation of the $\theta$s, while accurate estimation of the $\theta$s requires accurate estimation of the item parameters. The operational implementation of Item Response Theory, particularly following the advent of computerized adaptive testing, generally involves handling these two estimation problems separately. This dissertation addresses the optimal design for item parameter estimation. Most current designs calibrate items with a sample drawn from the overall test-taking population. For 2PL models a sequential design based on the D-optimality criterion has been proposed, while no 3PL design is in the literature. In this dissertation, we design the calibration with the ultimate use of the items in mind, namely to estimate test-takers' proficiency parameters. For both the 2PL and 3PL models, this criterion leads to a locally L-optimal design criterion, named the Minimal Information Loss criterion.
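The minimum error-variance criterion in the van der Linden (1999) multidimensional record above selects the item that, once its information is added, minimizes the asymptotic variance of the ML estimator of a weighted combination of abilities, i.e. $w' I(\theta)^{-1} w$. The sketch below assumes a compensatory multidimensional 2PL and invented pool values.

```python
import numpy as np

def p_m2pl(theta, a, b):
    """Compensatory multidimensional 2PL: P = logistic(a . theta - b)."""
    return 1 / (1 + np.exp(-(np.dot(a, theta) - b)))

def crit(theta, info_so_far, a_item, b_item, w):
    """Asymptotic variance of the ML estimator of w'theta after adding
    one candidate item's information matrix contribution."""
    p = p_m2pl(theta, a_item, b_item)
    I = info_so_far + p * (1 - p) * np.outer(a_item, a_item)
    return float(w @ np.linalg.inv(I) @ w)

theta = np.array([0.2, -0.1])
w = np.array([0.5, 0.5])                 # linear combination of interest
I0 = np.eye(2) * 2.0                     # information accumulated so far (assumed)
cands = [(np.array([1.2, 0.3]), 0.0), (np.array([0.4, 1.1]), 0.2)]
best = min(cands, key=lambda ab: crit(theta, I0, ab[0], ab[1], w))
print(best)                              # item minimizing the error variance
```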
In turn, this criterion and the General Equivalence Theorem give a two-point design for the 2PL model and a three-point design for the 3PL model. A sequential implementation of this optimal design is presented. For the 2PL model, this design is almost 55% more efficient than the simple random sample approach, and 12% more efficient than the locally D-optimal design. For the 3PL model, the proposed design is 34% more efficient than the simple random sample approach. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aBuyske, S G uhttp://iacat.org/content/optimal-design-item-calibration-computerized-adaptive-testing01186nas a2200157 4500008004100000245010900041210006900150300001200219490000700231520058300238653003400821100002300855700001600878700001800894856011600912 1999 eng d00aUsing response-time constraints to control for differential speededness in computerized adaptive testing0 aUsing responsetime constraints to control for differential speed a195-2100 v233 aAn item-selection algorithm is proposed for neutralizing the differential effects of time limits on computerized adaptive test scores. The method is based on a statistical model for distributions of examinees' response times on items in a bank that is updated each time an item is administered. Predictions from the model are used as constraints in a 0-1 linear programming model for constrained adaptive testing that maximizes the accuracy of the trait estimator. The method is demonstrated empirically using an item bank from the Armed Services Vocational Aptitude Battery. 10acomputerized adaptive testing1 avan der Linden, WJ1 aScrams, D J1 aSchnipke, D L uhttp://iacat.org/content/using-response-time-constraints-control-differential-speededness-computerized-adaptive01452nas a2200133 4500008004100000245006700041210006700108300000900175490000700184520098800191653003401179100001901213856008601232 1998 eng d00aApplications of network flows to computerized adaptive testing0 aApplications of network flows to computerized adaptive testing a08550 v593 aRecently, the concept of Computerized Adaptive Testing (CAT) has been receiving ever-growing attention from the academic community. This is so because of both practical and theoretical considerations. Its practical importance lies in the advantages of CAT over the traditional (perhaps outdated) paper-and-pencil test in terms of time, accuracy, and money. The theoretical interest is sparked by its natural relationship to Item Response Theory (IRT). This dissertation offers a mathematical programming approach which creates a model that generates a CAT that takes care of many questions concerning the test, such as feasibility, accuracy, and time of testing, as well as item pool security. The CAT generated is designed to obtain the most information about a single test taker. Several methods for estimating the examinee's ability, based on the (dichotomous) responses to the items in the test, are also offered here.
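The formula garbled in the Buyske (1999) record above is just the 2PL model. Restated in clean notation, together with the Fisher information that drives both proficiency estimation and the calibration designs discussed there (nothing here goes beyond what the abstract states):

```latex
% 2PL model and its Fisher information; the calibration design
% concentrates examinees where I_i(\theta) is large.
\[
  \operatorname{logit} P_i(\theta_j) = a_i(\theta_j - b_i), \qquad
  P_i(\theta_j) = \frac{1}{1 + e^{-a_i(\theta_j - b_i)}},
\]
\[
  I_i(\theta) = a_i^2 \, P_i(\theta)\,\bigl(1 - P_i(\theta)\bigr).
\]
```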
(PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aClaudio, M J C uhttp://iacat.org/content/applications-network-flows-computerized-adaptive-testing01236nas a2200157 4500008004100000245006600041210006600107300001000173490000600183520071600189653003400905100001500939700001700954700001900971856008800990 1998 eng d00aMaintaining content validity in computerized adaptive testing0 aMaintaining content validity in computerized adaptive testing a29-410 v33 aThe authors empirically demonstrate some of the trade-offs which can occur when content balancing is imposed in computerized adaptive testing (CAT) forms or, conversely, when it is ignored. The authors contend that the content validity of a CAT form can actually change across a score scale when content balancing is ignored. However, they caution that efficiency and score precision can be severely reduced by overspecifying content restrictions in a CAT form. The results from 2 simulation studies are presented as a means of highlighting some of the trade-offs that could occur between content and statistical considerations in CAT form assembly. (PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aLuecht, RM1 aChamplain, A1 aNungester, R J uhttp://iacat.org/content/maintaining-content-validity-computerized-adaptive-testing01435nas a2200145 4500008004100000245005300041210005100094300001200145490000700157520098100164653003401145100002301179700001501202856007201217 1998 eng d00aA model for optimal constrained adaptive testing0 amodel for optimal constrained adaptive testing a259-2700 v223 aA model for constrained computerized adaptive testing is proposed in which the information in the test at the trait level ($\theta$) estimate is maximized subject to a number of possible constraints on the content of the test. At each item-selection step, a full test is assembled to have maximum information at the current $\theta$ estimate, fixing the items already administered. Then the item with maximum information is selected. All test assembly is optimal because a linear programming (LP) model is used that automatically updates to allow for the attributes of the items already administered and the new value of the $\theta$ estimator. The LP model also guarantees that each adaptive test always meets the entire set of constraints. A simulation study using a bank of 753 items from the Law School Admission Test showed that the $\theta$ estimator for adaptive tests of realistic lengths did not suffer any loss of efficiency from the presence of 433 constraints on the item selection process. 10acomputerized adaptive testing1 avan der Linden, WJ1 aReese, L M uhttp://iacat.org/content/model-optimal-constrained-adaptive-testing01294nas a2200157 4500008004100000245007500041210006900116300001000185490000700195520076100202653003400963100001800997700001401015700001701029856009001046 1998 eng d00aSimulating the use of disclosed items in computerized adaptive testing0 aSimulating the use of disclosed items in computerized adaptive t a48-680 v353 aRegular use of questions previously made available to the public (i.e., disclosed items) may provide one way to meet the requirement for large numbers of questions in a continuous testing environment, that is, an environment in which testing is offered at test-taker convenience throughout the year rather than on a few prespecified test dates. First, it must be shown that such use has effects on test scores small enough to be acceptable.
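The constrained-CAT model in the van der Linden and Reese (1998) record above assembles a full "shadow" test meeting all constraints at every step, then administers its most informative item. The toy below replaces the paper's 0-1 LP solver with brute-force enumeration over a deliberately tiny pool; pool size, the single content-quota constraint, and all parameter values are assumptions of the sketch.

```python
import numpy as np
from itertools import combinations

def info_2pl(theta, a, b):
    p = 1 / (1 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

def shadow_test_pick(theta_hat, a, b, content, used, test_len, quota):
    """Enumerate full tests containing the items already administered and
    meeting a content quota; keep the one with maximum information at
    theta_hat, then administer its most informative unused item.
    Operational systems solve this step as a 0-1 LP instead."""
    pool = [i for i in range(len(a)) if i not in used]
    need = test_len - len(used)
    best, best_info = None, -np.inf
    for combo in combinations(pool, need):
        full = list(used) + list(combo)
        counts = np.bincount([content[i] for i in full], minlength=len(quota))
        if np.any(counts < quota):
            continue                      # violates content constraints
        total = info_2pl(theta_hat, a[np.array(full)], b[np.array(full)]).sum()
        if total > best_info:
            best, best_info = combo, total
    if best is None:
        raise ValueError("no feasible shadow test")
    return max(best, key=lambda i: info_2pl(theta_hat, a[i], b[i]))

rng = np.random.default_rng(1)
a = rng.uniform(0.5, 2, 12); b = rng.normal(0, 1, 12)
content = rng.integers(0, 2, 12)          # two content areas (assumed)
print(shadow_test_pick(0.0, a, b, content, used=[0], test_len=5, quota=[2, 2]))
```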
In this study simulations are used to explore the use of disclosed items under a worst-case scenario which assumes that disclosed items are always answered correctly. Some item pool and test designs were identified in which the use of disclosed items produces effects on test scores that may be viewed as negligible.10acomputerized adaptive testing1 aStocking, M L1 aWard, W C1 aPotenza, M T uhttp://iacat.org/content/simulating-use-disclosed-items-computerized-adaptive-testing02911nas a2200133 4500008004100000245016300041210006900204300000800273490000700281520232300288653003402611100001402645856011802659 1997 eng d00aA comparison of maximum likelihood estimation and expected a posteriori estimation in computerized adaptive testing using the generalized partial credit model0 acomparison of maximum likelihood estimation and expected a poste a4530 v583 aA simulation study was conducted to investigate the application of expected a posteriori (EAP) trait estimation in computerized adaptive tests (CAT) based on the generalized partial credit model (Muraki, 1992), and to compare the performance of EAP with maximum likelihood trait estimation (MLE). The performance of EAP was evaluated under different conditions: the number of quadrature points (10, 20, and 30), and the type of prior distribution (normal, uniform, negatively skewed, and positively skewed). The relative performance of the MLE and EAP estimation methods were assessed under two distributional forms of the latent trait, one normal and the other negatively skewed. Also, both the known item parameters and estimated item parameters were employed in the simulation study. Descriptive statistics, correlations, scattergrams, accuracy indices, and audit trails were used to compare the different methods of trait estimation in CAT. The results showed that, regardless of the latent trait distribution, MLE and EAP with a normal prior, a uniform prior, or the prior that matches the latent trait distribution using either 20 or 30 quadrature points provided relatively accurate estimation in CAT based on the generalized partial credit model. However, EAP using only 10 quadrature points did not work well in the generalized partial credit CAT. Also, the study found that increasing the number of quadrature points from 20 to 30 did not increase the accuracy of EAP estimation. Therefore, it appears 20 or more quadrature points are sufficient for accurate EAP estimation. The results also showed that EAP with a negatively skewed prior and positively skewed prior performed poorly for the normal data set, and EAP with positively skewed prior did not provide accurate estimates for the negatively skewed data set. Furthermore, trait estimation in CAT using estimated item parameters produced results similar to those obtained using known item parameters. In general, when at least 20 quadrature points are used, EAP estimation with a normal prior, a uniform prior or the prior that matches the latent trait distribution appears to be a good alternative to MLE in the application of polytomous CAT based on the generalized partial credit model. 
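The EAP procedure evaluated in the Chen (1997) record above reduces to a weighted average of quadrature points under the generalized partial credit model. The sketch below uses a normal prior and the 20-point grid the record suggests is sufficient; the item parameters and the d-step parameterization (with the first step fixed at 0) are assumptions of the example.

```python
import numpy as np

def gpc_probs(theta, a, d):
    """Generalized partial credit category probabilities for one item;
    d is the vector of step parameters with d[0] fixed at 0."""
    theta = np.atleast_1d(theta)[:, None]                  # grid x 1
    steps = np.cumsum(a * (theta - np.asarray(d)[None, :]), axis=1)
    num = np.exp(steps)
    return num / num.sum(axis=1, keepdims=True)

def eap(responses, items, n_quad=20, prior_sd=1.0):
    """EAP trait estimate over n_quad quadrature points."""
    q = np.linspace(-4, 4, n_quad)
    w = np.exp(-0.5 * (q / prior_sd) ** 2)                 # normal prior weights
    like = np.ones_like(q)
    for x, (a, d) in zip(responses, items):
        like *= gpc_probs(q, a, d)[:, x]                   # category x observed
    post = w * like
    return float((q * post).sum() / post.sum())

items = [(1.0, [0.0, -0.5, 0.5]), (1.2, [0.0, 0.3, 0.8])]  # hypothetical items
print(eap([2, 1], items))
```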
(PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aChen, S-K uhttp://iacat.org/content/comparison-maximum-likelihood-estimation-and-expected-posteriori-estimation-computerized01682nam a2200145 4500008004100000245006100041210006000102260006200162520115300224653003401377100001501411700001601426700001701442856007701459 1997 eng d00aComputerized adaptive testing: From inquiry to operation0 aComputerized adaptive testing From inquiry to operation aWashington, D.C., USAbAmerican Psychological Association3 a(from the cover) This book traces the development of computerized adaptive testing (CAT) from its origins in the 1960s to its integration with the Armed Services Vocational Aptitude Battery (ASVAB) in the 1990s. A paper-and-pencil version of the battery (P&P-ASVAB) has been used by the Defense Department since the 1970s to measure the abilities of applicants for military service. The test scores are used both for initial qualification and for classification into entry-level training opportunities. /// This volume provides the developmental history of the CAT-ASVAB through its various stages in the Joint-Service arena. Although the majority of the book concerns the myriad technical issues that were identified and resolved, information is provided on various political and funding support challenges that were successfully overcome in developing, testing, and implementing the battery into one of the nation's largest testing programs. The book provides useful information to professionals in the testing community and everyone interested in personnel assessment and evaluation. (PsycINFO Database Record (c) 2004 APA, all rights reserved).10acomputerized adaptive testing1 aSands, W A1 aWaters, B K1 aMcBride, J R uhttp://iacat.org/content/computerized-adaptive-testing-inquiry-operation01795nas a2200169 4500008004100000245014100041210006900182300001200251490000700263520113700270653003401407100001401441700001301455700002101468700001401489856012201503 1997 eng d00aThe effect of population distribution and method of theta estimation on computerized adaptive testing (CAT) using the rating scale model0 aeffect of population distribution and method of theta estimation a422-4390 v573 aInvestigated the effect of population distribution on maximum likelihood estimation (MLE) and expected a posteriori estimation (EAP) in a simulation study of computerized adaptive testing (CAT) based on D. Andrich's (1978) rating scale model. Comparisons were made among MLE and EAP with a normal prior distribution and EAP with a uniform prior distribution within 2 data sets: one generated using a normal trait distribution and the other using a negatively skewed trait distribution. Descriptive statistics, correlations, scattergrams, and accuracy indices were used to compare the different methods of trait estimation. The EAP estimation with a normal prior or uniform prior yielded results similar to those obtained with MLE, even though the prior did not match the underlying trait distribution. An additional simulation study based on real data suggested that more work is needed to determine the optimal number of quadrature points for EAP in CAT based on the rating scale model. The choice between MLE and EAP for particular measurement situations is discussed. 
(PsycINFO Database Record (c) 2003 APA, all rights reserved).10acomputerized adaptive testing1 aChen, S-K1 aHou, L Y1 aFitzpatrick, S J1 aDodd, B G uhttp://iacat.org/content/effect-population-distribution-and-method-theta-estimation-computerized-adaptive-testing-cat01810nas a2200169 4500008004100000245005300041210005300094250001000147260006000157300001000217520125400227653003401481100001701515700001601532700001701548856007501565 1997 eng d00aResearch antecedents of applied adaptive testing0 aResearch antecedents of applied adaptive testing axviii aWashington D.C. USAbAmerican Psychological Association a47-573 a(from the chapter) This chapter sets the stage for the entire computerized adaptive testing Armed Services Vocational Aptitude Battery (CAT-ASVAB) development program by describing the state of the art immediately preceding its inception. By the mid-1970s, a great deal of research had been conducted that provided the technical underpinnings needed to develop adaptive tests, but little research had been done to corroborate empirically the promising results of theoretical analyses and computer simulation studies. In this chapter, the author summarizes much of the important theoretical and simulation research prior to 1977. In doing so, he describes a variety of approaches to adaptive testing, and shows that while many methods for adaptive testing had been proposed, few practical attempts had been made to implement it. Furthermore, the few instances of adaptive testing were based primarily on traditional test theory, and were developed in laboratory settings for purposes of basic research. The most promising approaches, those based on item response theory and evaluated analytically or by means of computer simulations, remained to be proven in the crucible of live testing. (PsycINFO Database Record (c) 2004 APA, all rights reserved).10acomputerized adaptive testing1 aMcBride, J R1 aWaters, B K1 aMcBride, J R uhttp://iacat.org/content/research-antecedents-applied-adaptive-testing01402nas a2200133 4500008004100000245008900041210006900130300001200199490000700211520089300218653003401111100001801145856010501163 1997 eng d00aRevising item responses in computerized adaptive tests: A comparison of three models0 aRevising item responses in computerized adaptive tests A compari a129-1420 v213 aInterest in the application of large-scale computerized adaptive testing has focused attention on issues that arise when theoretical advances are made operational. One such issue is that of the order in which examinees address questions within a test or separately timed test section. In linear testing, this order is entirely under the control of the examinee, who can look ahead at questions and return and revise answers to questions. Using simulation, this study investigated three models that permit restricted examinee control over revising previous answers in the context of adaptive testing. Even under a worst-case model of examinee revision behavior, two of the models permitting item revisions worked well in preserving test fairness and accuracy. One model studied may also preserve some cognitive processing styles developed by examinees for a linear testing environment.
10acomputerized adaptive testing1 aStocking, M L uhttp://iacat.org/content/revising-item-responses-computerized-adaptive-tests-comparison-three-models01946nas a2200133 4500008004100000245005600041210005600097260002100153520149700174653003401671100001801705700001701723856007201740 1997 eng d00aValidation of CATSIB to investigate DIF of CAT data0 aValidation of CATSIB to investigate DIF of CAT data aChicago, IL. USA3 aThis paper investigates the performance of CATSIB (a modified version of the SIBTEST computer program) to assess differential item functioning (DIF) in the context of computerized adaptive testing (CAT). One of the distinguishing features of CATSIB is its theoretically built-in regression correction to control the Type I error rates when the distributions of the reference and focal groups differ on the intended ability. This phenomenon is also called impact. The Type I error rate of CATSIB with the regression correction (WRC) was compared with that of CATSIB without the regression correction (WORC) to see if the regression correction was indeed effective. Also of interest was the power level of CATSIB after the regression correction. The subtest size was set at 25 items, and sample size, the impact level, and the amount of DIF were varied. Results show that the regression correction was very useful in controlling the Type I error: CATSIB WORC had inflated observed Type I error rates, especially when impact levels were high, whereas CATSIB WRC had observed Type I error rates very close to the nominal level of 0.05. The power rates of CATSIB WRC were impressive. As expected, power increased as the sample size increased and as the amount of DIF increased. Even for small samples with high impact rates, power rates were 64% or higher for high DIF levels. For large samples, power rates were over 90% for high DIF levels.10acomputerized adaptive testing1 aNandakumar, R1 aRoussos, L A uhttp://iacat.org/content/validation-catsib-investigate-dif-cat-data00557nas a2200121 4500008004100000245011900041210006900160260002100229653003400250653001900284100001400303856011800317 1996 eng d00aA comparison of the traditional maximum information method and the global information method in CAT item selection0 acomparison of the traditional maximum information method and the aNew York, NY USA10acomputerized adaptive testing10aitem selection1 aTang, K L uhttp://iacat.org/content/comparison-traditional-maximum-information-method-and-global-information-method-cat-item02572nas a2200133 4500008004100000245009100041210006900132300000900201490000700210520206600217653003402283100001402317856010702331 1996 eng d00aDynamic scaling: An ipsative procedure using techniques from computer adaptive testing0 aDynamic scaling An ipsative procedure using techniques from comp a58240 v563 aThe purpose of this study was to create a prototype method for scaling items using computer adaptive testing techniques and to demonstrate the method with a working model program. The method can be used to scale items, rank individuals with respect to the scaled items, and to re-scale the items with respect to the individuals' responses. When using this prototype method, the items to be scaled are part of a database that contains not only the items, but measures of how individuals respond to each item.
After completion of all presented items, the individual is assigned an overall scale value which is then compared with each item responded to, and an individual "error" term is stored with each item. After several individuals have responded to the items, the item error terms are used to revise the placement of the scaled items. This revision feature allows the natural adaptation of one general list to reflect subgroup differences, for example, differences among geographic areas or ethnic groups. It also provides easy revision and limited authoring of the scale items by the computer program administrator. This study addressed the methodology, the instrumentation needed to handle the scale-item administration, data recording, item error analysis, and scale-item database editing required by the method, and the behavior of a prototype vocabulary test in use. Analyses were made of item ordering, response profiles, item stability, reliability, and validity. Although slow, the movement of unordered words used as items in the prototype program was accurate as determined by comparison with an expert word ranking. Person scores obtained by multiple administrations of the prototype test were reliable and correlated at .94 with a commercial paper-and-pencil vocabulary test, while holding a three-to-one speed advantage in administration. Although based upon self-report data, dynamic scaling instruments like the model vocabulary test could be very useful for self-assessment, for pre…10acomputerized adaptive testing1 aBerg, S R uhttp://iacat.org/content/dynamic-scaling-ipsative-procedure-using-techniques-computer-adaptive-testing02609nas a2200133 4500008004100000245011400041210006900155300000900224490000700233520206700240653003402307100001602341856011802357 1996 eng d00aThe effect of individual differences variables on the assessment of ability for Computerized Adaptive Testing0 aeffect of individual differences variables on the assessment of a40850 v573 aComputerized Adaptive Testing (CAT) continues to gain momentum as the accepted testing modality for a growing number of certification, licensure, education, government, and human resource applications. However, the developers of these tests have for the most part failed to adequately explore the impact of individual differences such as test anxiety on the adaptive testing process. It is widely accepted that non-cognitive individual differences variables interact with the assessment of ability when using written examinations. Logic would dictate that individual differences variables would equally affect CAT. Two studies were used to explore this premise. In the first study, 507 examinees were given a test anxiety survey prior to taking a high-stakes certification exam using CAT or using a written format. All examinees had already completed their course of study, and the examination would be their last hurdle prior to being awarded certification. Highly test-anxious examinees performed worse than their low-anxious counterparts on both testing formats. The second study replicated the finding that anxiety depresses performance in CAT. It also addressed the differential effect of anxiety on within-test performance. Examinees were candidates taking their final certification examination following a four-year college program. Ability measures were calculated for each successive part of the test for 923 subjects. Within-subject performance varied depending upon test position.
Highly anxious examinees performed poorly at all points in the test, while the performance of low- and medium-anxious examinees peaked in the middle of the test. If test anxiety and performance measures were actually the same trait, then low-anxious individuals should have performed equally well throughout the test. The observed interaction of test anxiety and time on task serves as strong evidence that test anxiety has motivationally mediated as well as cognitively mediated effects. The results of the studies are discussed.10acomputerized adaptive testing1 aGershon, RC uhttp://iacat.org/content/effect-individual-differences-variables-assessment-ability-computerized-adaptive-testing01606nas a2200133 4500008004100000245009100041210006900132300001200201490000700213520109200220653003401312100001501346856011101361 1996 eng d00aMultidimensional computerized adaptive testing in a certification or licensure context0 aMultidimensional computerized adaptive testing in a certificatio a389-4040 v203 a(from the journal abstract) Multidimensional item response theory (MIRT) computerized adaptive testing, building on recent work by D. O. Segall (1996), is applied in a licensing/certification context. An example of a medical licensure test is used to demonstrate situations in which complex, integrated content must be balanced at the total test level for validity reasons, but items assigned to reportable subscore categories may be used under a MIRT adaptive paradigm to improve the reliability of the subscores. A heuristic optimization framework is outlined that generalizes to both univariate and multivariate statistical objective functions, with additional systems of constraints included to manage the content balancing or other test specifications on adaptively constructed test forms. Simulation results suggested that a multivariate treatment of the problem, although complicating somewhat the objective function used and the estimation of traits, nonetheless produces advantages from a psychometric perspective.10acomputerized adaptive testing1 aLuecht, RM uhttp://iacat.org/content/multidimensional-computerized-adaptive-testing-certification-or-licensure-context02605nas a2200133 4500008004100000245012000041210006900161300000900230490000700239520205500246653003402301100001602335856012002351 1995 eng d00aAssessment of scaled score consistency in adaptive testing from a multidimensional item response theory perspective0 aAssessment of scaled score consistency in adaptive testing from a55980 v553 aThe purpose of this study was twofold: (a) to examine whether the unidimensional adaptive testing estimates are comparable for different ability levels of examinees when the true examinee-item interaction is correctly modeled using a compensatory multidimensional item response theory (MIRT) model; and (b) to investigate the effects of adaptive testing estimation when the procedure of item selection of computerized adaptive testing (CAT) is controlled by either content balancing or selecting the most informative item in a user-specified direction at the current estimate of unidimensional ability. A series of Monte Carlo simulations was conducted in this study. Deviation from the reference composite angle was used as an index of the θ1,θ2-composite consistency across the different levels of unidimensional CAT estimates.
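The item selection rule named in the abstract above (selecting the most informative item in a user-specified direction) has a closed form under a compensatory M2PL model: the Fisher information of an item with discrimination vector a and intercept d, taken along a unit direction u, is (a·u)^2 P(1 − P). A minimal Python sketch under that assumption; function and variable names are illustrative, not taken from the study.

import numpy as np

def directional_info(a, d, theta, angle_deg):
    # Fisher information of a compensatory M2PL item at theta,
    # measured along a fixed direction in the (theta1, theta2) plane.
    u = np.array([np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))])
    p = 1.0 / (1.0 + np.exp(-(np.dot(a, theta) + d)))
    return float(np.dot(a, u) ** 2 * p * (1.0 - p))

def pick_fixed_direction(bank, theta, angle_deg):
    # Fixed-direction selection: the item maximizing information
    # along the user-specified composite direction.
    return int(np.argmax([directional_info(a, d, theta, angle_deg) for a, d in bank]))

# e.g., bank = [(np.array([1.2, 0.3]), 0.0), (np.array([0.4, 1.1]), -0.5)]
# pick_fixed_direction(bank, theta=np.zeros(2), angle_deg=45)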
In addition, the effects of the content-balancing and fixed-direction item selection procedures were compared across the different ability levels. The characteristics of item selection, test information, and the relationship between unidimensional and multidimensional models were also investigated. In addition to employing statistical analysis to examine the robustness of the CAT procedure to violations of unidimensionality, this research also included graphical analyses to present the results. The results were summarized as follows: (a) the reference angles for the no-control item selection method were disparate across the unidimensional ability groups; (b) the unidimensional CAT estimates from the content-balancing item selection method did not offer much improvement; (c) the fixed-direction item selection method did provide greater consistency for the unidimensional CAT estimates across the different levels of ability; and (d) increasing the CAT test length did not provide greater score scale consistency. Based on the results of this study, the following conclusions were drawn: (a) without any controlling…10acomputerized adaptive testing1 aFan, Miechu uhttp://iacat.org/content/assessment-scaled-score-consistency-adaptive-testing-multidimensional-item-response-theory00605nas a2200145 4500008004100000245010000041210006900141260004400210300001200254490000600266653003400272100002400306700001400330856011500344 1994 eng d00aThe equivalence of Rasch item calibrations and ability estimates across modes of administration0 aequivalence of Rasch item calibrations and ability estimates acr aNorwood, N.J. USAbAblex Publishing Co. a122-1280 v210acomputerized adaptive testing1 aBergstrom, Betty, A1 aLunz, M E uhttp://iacat.org/content/equivalence-rasch-item-calibrations-and-ability-estimates-across-modes-administration00503nas a2200121 4500008004100000245009300041210006900134300000900203490000700212653003400219100001300253856011500266 1994 eng d00aMonte Carlo simulation comparison of two-stage testing and computerized adaptive testing0 aMonte Carlo simulation comparison of twostage testing and comput a25480 v5410acomputerized adaptive testing1 aKim, H-O uhttp://iacat.org/content/monte-carlo-simulation-comparison-two-stage-testing-and-computerized-adaptive-testing00497nas a2200121 4500008004100000245009700041210006900138300001400207490000700221653003400228100001200262856010100274 1993 eng d00aAn application of Computerized Adaptive Testing to the Test of English as a Foreign Language0 aapplication of Computerized Adaptive Testing to the Test of Engl a4257-42580 v5310acomputerized adaptive testing1 aMoon, O uhttp://iacat.org/content/application-computerized-adaptive-testing-test-english-foreign-language00509nas a2200133 4500008004100000245008100041210006900122300001000191490000700201653003400208100001900242700001600261856009800277 1993 eng d00aAssessing the utility of item response models: computerized adaptive testing0 aAssessing the utility of item response models computerized adapt a21-270 v1210acomputerized adaptive testing1 aKingsbury, G G1 aHouser, R L uhttp://iacat.org/content/assessing-utility-item-response-models-computerized-adaptive-testing00470nas a2200121 4500008004100000245008000041210006900121300000900190490000700199653003400206100001500240856009300255 1993 eng d00aComparability and validity of computerized adaptive testing with the MMPI-20 aComparability and validity of computerized adaptive testing with
a37910 v5310acomputerized adaptive testing1 aRoper, B L uhttp://iacat.org/content/comparability-and-validity-computerized-adaptive-testing-mmpi-200572nas a2200121 4500008004100000245015100041210006900192300000900261490000700270653003400277100001700311856012200328 1993 eng d00aComputer adaptive testing: A comparison of four item selection strategies when used with the golden section search strategy for estimating ability0 aComputer adaptive testing A comparison of four item selection st a17720 v5410acomputerized adaptive testing1 aCarlson, R D uhttp://iacat.org/content/computer-adaptive-testing-comparison-four-item-selection-strategies-when-used-golden-section01216nas a2200157 4500008004100000245006600041210006600107300001200173490000600185520069800191653003400889100002400923700001400947700001600961856008100977 1992 eng d00aAltering the level of difficulty in computer adaptive testing0 aAltering the level of difficulty in computer adaptive testing a137-1490 v53 aExamines the effect of altering test difficulty on examinee ability measures and test length in a computer adaptive test. The 225 Ss were randomly assigned to 3 test difficulty conditions and given a variable-length computer adaptive test. Examinees in the hard, medium, and easy test conditions took a test targeted at the 50%, 60%, or 70% probability of correct response. The results show that altering the probability of a correct response does not affect estimation of examinee ability and that taking an easier computer adaptive test only slightly increases the number of items necessary to reach specified levels of precision.10acomputerized adaptive testing1 aBergstrom, Betty, A1 aLunz, M E1 aGershon, RC uhttp://iacat.org/content/altering-level-difficulty-computer-adaptive-testing00477nas a2200121 4500008004100000245008100041210006900122300000900191490000700200653003400207100002100241856009300262 1992 eng d00aThe development and evaluation of a system for computerized adaptive testing0 adevelopment and evaluation of a system for computerized adaptive a43040 v5210acomputerized adaptive testing1 aTorre Sanchez, R uhttp://iacat.org/content/development-and-evaluation-system-computerized-adaptive-testing00495nas a2200121 4500008004100000245008200041210006900123300000900192490000700201653003400208100002400242856010700266 1992 eng d00aTest anxiety and test performance under computerized adaptive testing methods0 aTest anxiety and test performance under computerized adaptive te a25180 v5210acomputerized adaptive testing1 aPowell, Zen-Hsiu, E uhttp://iacat.org/content/test-anxiety-and-test-performance-under-computerized-adaptive-testing-methods00580nas a2200121 4500008004100000245016000041210006900201300000900270490000700279653003400286100002000320856011800340 1991 eng d00aA comparison of paper-and-pencil, computer-administered, computerized feedback, and computerized adaptive testing methods for classroom achievement testing0 acomparison of paperandpencil computeradministered computerized f a17190 v5210acomputerized adaptive testing1 aKuan, Tsung Hao uhttp://iacat.org/content/comparison-paper-and-pencil-computer-administered-computerized-feedback-and-computerized00435nas a2200121 4500008004100000245006100041210006000102300001200162490000700174653003400181100001500215856008300230 1991 eng d00aInter-subtest branching in computerized adaptive testing0 aIntersubtest branching in computerized adaptive testing a140-1410 v5210acomputerized adaptive testing1 aChang, S-H
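The Bergstrom, Lunz, and Gershon abstract above targets tests at a 50%, 60%, or 70% probability of a correct response. Assuming a Rasch item bank (an assumption; the abstract does not name the model), hitting a target probability is a one-line inversion of the item characteristic curve. A sketch with invented names:

import numpy as np

def pick_by_target_p(bank_b, theta, target_p=0.6):
    # Choose the Rasch item whose success probability at theta is closest
    # to target_p; b = theta - logit(target_p) inverts the Rasch ICC.
    want_b = theta - np.log(target_p / (1.0 - target_p))
    return int(np.argmin(np.abs(np.asarray(bank_b) - want_b)))

# e.g., pick_by_target_p([-1.0, -0.2, 0.5, 1.3], theta=0.4, target_p=0.7)

Since Rasch item information peaks where P = 0.5, target_p=0.5 recovers ordinary maximum-information selection, and the easier 60% and 70% targets trade a little information for the shorter, gentler tests the study examines.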
uhttp://iacat.org/content/inter-subtest-branching-computerized-adaptive-testing00734nas a2200169 4500008004100000020000900041245012400050210006900174260008700243653003400330653001500364653001800379100001600397700001900413700001700432856011500449 1991 eng d aR-1100aPatterns of alcohol and drug use among federal offenders as assessed by the Computerized Lifestyle Screening Instrument0 aPatterns of alcohol and drug use among federal offenders as asse aOttawa, ON. CanadabResearch and Statistics Branch, Correctional Service of Canada10acomputerized adaptive testing10adrug abuse10asubstance use1 aRobinson, D1 aPorporino, F J1 aMillson, W A uhttp://iacat.org/content/patterns-alcohol-and-drug-use-among-federal-offenders-assessed-computerized-lifestyle01504nas a2200157 4500008004100000245008900041210006900130300001200199490000700211520093700218653003401155100002001189700001401209700001401223856010901237 1990 eng d00aA simulation and comparison of flexilevel and Bayesian computerized adaptive testing0 asimulation and comparison of flexilevel and Bayesian computerize a227-2390 v273 aComputerized adaptive testing (CAT) is a testing procedure that adapts an examination to an examinee's ability by administering only items of appropriate difficulty for the examinee. In this study, the authors compared Lord's flexilevel testing procedure (flexilevel CAT) with an item response theory-based CAT using Bayesian estimation of ability (Bayesian CAT). Three flexilevel CATs, which differed in test length (36, 18, and 11 items), and three Bayesian CATs were simulated; the Bayesian CATs differed from one another in the standard error of estimate (SEE) used for terminating the test (0.25, 0.10, and 0.05). Results showed that the flexilevel 36- and 18-item CATs produced ability estimates that may be considered as accurate as those of the Bayesian CAT with SEE = 0.10 and comparable to the Bayesian CAT with SEE = 0.05. The authors discuss the implications for classroom testing and for item response theory-based CAT.10acomputerized adaptive testing1 ade Ayala, R. J.1 aDodd, B G1 aKoch, W R uhttp://iacat.org/content/simulation-and-comparison-flexilevel-and-bayesian-computerized-adaptive-testing00423nas a2200133 4500008004100000020001400041245005100055210005000106300001000156490000600166653003400172100001700206856006600223 1989 eng d a1745-399200aAdaptive testing: The evolution of a good idea0 aAdaptive testing The evolution of a good idea a11-150 v810acomputerized adaptive testing1 aReckase, M D uhttp://iacat.org/content/adaptive-testing-evolution-good-idea00501nas a2200121 4500008004100000245009800041210006900139300000900208490000700217653003400224100001400258856010700272 1989 eng d00aApplication of computerized adaptive testing to the University Entrance Exam of Taiwan, R.O.C0 aApplication of computerized adaptive testing to the University E a36620 v4910acomputerized adaptive testing1 aHung, P-H uhttp://iacat.org/content/application-computerized-adaptive-testing-university-entrance-exam-taiwan-roc01759nas a2200133 4500008004100000245005400041210005100095260005500146300000800201520129200209653003401501100001701535856007301552 1989 eng d00aAn applied study on computerized adaptive testing0 aapplied study on computerized adaptive testing aGroningen, The NetherlandsbUniversity of Groningen a1853 a(from the cover) The rapid development and falling prices of powerful personal computers, in combination with new test theories, will have a large impact on psychological testing.
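The de Ayala, Dodd, and Koch abstract above hinges on one loop: administer the most informative item, update the posterior, and stop once the SEE (the posterior standard deviation) falls below a threshold such as 0.25, 0.10, or 0.05. A minimal Python sketch of the Bayesian arm only, assuming a dichotomous 2PL bank of (a, b) pairs; the flexilevel branching procedure is omitted and all names are illustrative.

import numpy as np

GRID = np.linspace(-4.0, 4.0, 81)   # quadrature grid for the posterior
PRIOR = np.exp(-0.5 * GRID**2)      # standard normal prior (unnormalized)

def p_2pl(a, b, theta):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def eap_and_see(post):
    # Posterior mean (EAP estimate) and posterior SD (the SEE).
    m = (GRID * post).sum() / post.sum()
    return m, np.sqrt(((GRID - m) ** 2 * post).sum() / post.sum())

def bayesian_cat(bank, true_theta, see_stop=0.10, seed=0):
    rng = np.random.default_rng(seed)
    post, unused = PRIOR.copy(), list(range(len(bank)))
    theta_hat, see = eap_and_see(post)
    while unused and see > see_stop:
        infos = []
        for i in unused:                      # maximum-information selection
            a, b = bank[i]
            p = p_2pl(a, b, theta_hat)
            infos.append(a * a * p * (1.0 - p))
        a, b = bank[unused.pop(int(np.argmax(infos)))]
        x = rng.random() < p_2pl(a, b, true_theta)   # simulated response
        post = post * (p_2pl(a, b, GRID) if x else 1.0 - p_2pl(a, b, GRID))
        theta_hat, see = eap_and_see(post)
    return theta_hat, see

Tightening see_stop from 0.25 to 0.05 trades test length for precision, which is exactly the comparison the simulation runs.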
One of the new possibilities is computerized adaptive testing. During the test administration each item is chosen to be appropriate for the person being tested. The test becomes tailor-made, resolving some of the problems with classical paper-and-pencil tests. In this way individual differences can be measured with higher efficiency and reliability. Scores on other meaningful variables, such as response time, can be obtained easily using computers. /// In this book a study on computerized adaptive testing is described. The study took place at Dutch Railways in an applied setting and served practical goals. Topics discussed include the construction of computerized tests, the use of response time, the choice of algorithms, and the implications of using a latent trait model. After running a number of simulations and calibrating the item banks, an experiment was carried out. In the experiment a pretest was administered to a sample of over 300 applicants, followed by an adaptive test. In addition, a survey concerning the attitudes of testees towards computerized testing formed part of the design.10acomputerized adaptive testing1 aSchoonman, W uhttp://iacat.org/content/applied-study-computerized-adaptive-testing01445nas a2200157 4500008004100000245007900041210006900120300001000189490000600199520090200205653003401107100002001141700001701161700001701178856009201195 1989 eng d00aA real-data simulation of computerized adaptive administration of the MMPI0 arealdata simulation of computerized adaptive administration of t a18-220 v13 aA real-data simulation of computerized adaptive administration of the MMPI was conducted with data obtained from two personnel-selection samples and two clinical samples. A modification of the countdown method was tested to determine the usefulness, in terms of item administration savings, of several different test administration procedures. Substantial item administration savings were achieved for all four samples, though the clinical samples required administration of more items to achieve accurate classification and/or full-scale scores than did the personnel-selection samples. The use of normative item endorsement frequencies was found to be as effective as sample-specific frequencies for the determination of item administration order. The role of computerized adaptive testing in the future of personality assessment is discussed.10acomputerized adaptive testing1 aBen-Porath, Y S1 aSlutske, W S1 aButcher, J N uhttp://iacat.org/content/real-data-simulation-computerized-adaptive-administration-mmpi00529nas a2200121 4500008004100000245010800041210006900149300000900218490000700227653003400234100002000268856011900288 1988 eng d00aComputerized adaptive testing: A comparison of the nominal response model and the three-parameter model0 aComputerized adaptive testing A comparison of the nominal respon a31480 v4810acomputerized adaptive testing1 ade Ayala, R. J. uhttp://iacat.org/content/computerized-adaptive-testing-comparison-nominal-response-model-and-three-parameter-model00616nas a2200133 4500008004100000245011200041210006900153260003800222653003400260653003800294100001500332700001700347856011800364 1987 eng d00aThe effect of item parameter estimation error on decisions made using the sequential probability ratio test0 aeffect of item parameter estimation error on decisions made usin aIowa City, IA.
USAbDTIC Document10acomputerized adaptive testing10aSequential probability ratio test1 aSpray, J A1 aReckase, M D uhttp://iacat.org/content/effect-item-parameter-estimation-error-decisions-made-using-sequential-probability-ratio01520nas a2200157 4500008004100000020001400041245008900055210006900144300001000213490000700223520095700230653003401187100001801221700002001239856010301259 1986 eng d a0013-164400aAn application of computer adaptive testing with communication handicapped examinees0 aapplication of computer adaptive testing with communication hand a23-350 v463 aThis study was conducted to evaluate a computerized adaptive testing procedure for the measurement of mathematical skills of entry-level deaf college students. The theoretical basis of the study was the Rasch model for person measurement. Sixty persons were tested using an Apple II Plus microcomputer. Ability estimates provided by the computerized procedure were compared for stability with those obtained six to eight weeks earlier from conventional (written) testing of the same subject matter. Students' attitudes toward their testing experiences also were measured. Substantial increases in measurement efficiency (by reducing test length) were realized through the adaptive testing procedure. Because the item pool used was not specifically designed for adaptive testing purposes, the psychometric quality of measurements resulting from the different testing methods was approximately equal. Attitudes toward computerized testing were favorable.10acomputerized adaptive testing1 aGarrison, W M1 aBaumgarten, B S uhttp://iacat.org/content/application-computer-adaptive-testing-communication-handicapped-examinees00645nas a2200121 4500008004100000245022600041210006900267300000900336490000700345653003400352100001900386856011800405 1985 eng d00aAdaptive self-referenced testing as a procedure for the measurement of individual change due to instruction: A comparison of the reliabilities of change estimates obtained from conventional and adaptive testing procedures0 aAdaptive selfreferenced testing as a procedure for the measureme a30570 v4510acomputerized adaptive testing1 aKingsbury, G G uhttp://iacat.org/content/adaptive-self-referenced-testing-procedure-measurement-individual-change-due-instruction01566nas a2200169 4500008004100000245013900041210006900180300001200249490000600261520091500267653003401182100001601216700001601232700001701248700001401265856011701279 1984 eng d00aRelationship between corresponding Armed Services Vocational Aptitude Battery (ASVAB) and computerized adaptive testing (CAT) subtests0 aRelationship between corresponding Armed Services Vocational Apt a155-1630 v83 aInvestigated the relationships between selected subtests from the Armed Services Vocational Aptitude Battery (ASVAB) and corresponding subtests administered as computerized adaptive tests (CATs), using 270 17-26 yr old Marine recruits as Ss. Ss were administered the ASVAB before enlisting and approximately 2 wks after entering active duty, and the CAT tests were administered to Ss approximately 24 hrs after arriving at the recruit depot. Results indicate that 3 adaptive subtests correlated as well with ASVAB as did the 2nd administration of the ASVAB, although CAT subtests contained only half the number of items. Factor analysis showed CAT subtests to load on the same factors as the corresponding ASVAB subtests, indicating that the same abilities were being measured.
It is concluded that CAT can achieve the same measurement precision as a conventional test, with half the number of items.10acomputerized adaptive testing1 aMoreno, K E1 aWetzel, C D1 aMcBride, J R1 aWeiss, DJ uhttp://iacat.org/content/relationship-between-corresponding-armed-services-vocational-aptitude-battery-asvab-and00651nas a2200205 4500008004100000020001400041245006700055210006700122300001200189490000700201653003400208653001700242653002100259100001400280700001400294700001900308700001300327700001700340856008800357 1984 eng d a1745-398400aTechnical guidelines for assessing computerized adaptive tests0 aTechnical guidelines for assessing computerized adaptive tests a347-3600 v2110acomputerized adaptive testing10aMode effects10apaper-and-pencil1 aGreen, BF1 aBock, R D1 aHumphreys, L G1 aLinn, RL1 aReckase, M D uhttp://iacat.org/content/technical-guidelines-assessing-computerized-adaptive-tests00567nas a2200121 4500008004100000245015600041210006900197300000900266490000700275653003400282100001500316856011400331 1982 eng d00aAbility measurement, test bias reduction, and psychological reactions to testing as a function of computer adaptive testing versus conventional testing0 aAbility measurement test bias reduction and psychological reacti a42330 v4210acomputerized adaptive testing1 aOrban, J A uhttp://iacat.org/content/ability-measurement-test-bias-reduction-and-psychological-reactions-testing-function
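The Spray and Reckase entry above concerns classification decisions made with the sequential probability ratio test, but the record carries no abstract; what follows is therefore only a generic Python sketch of Wald's SPRT decision rule applied to a 2PL pass/fail test, with illustrative values for theta0, theta1, alpha, and beta.

import numpy as np

def p_2pl(a, b, theta):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def sprt_classify(items, responses, theta0, theta1, alpha=0.05, beta=0.05):
    # Wald's SPRT: accumulate the log-likelihood ratio of theta1 (pass)
    # versus theta0 (fail) and stop once it crosses either bound.
    upper = np.log((1.0 - beta) / alpha)
    lower = np.log(beta / (1.0 - alpha))
    llr = 0.0
    for (a, b), x in zip(items, responses):
        p0, p1 = p_2pl(a, b, theta0), p_2pl(a, b, theta1)
        llr += np.log(p1 / p0) if x else np.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "pass"
        if llr <= lower:
            return "fail"
    return "undecided"  # pool exhausted before a bound was crossed

# e.g., sprt_classify([(1.0, 0.0), (1.2, 0.3)], [1, 0], theta0=-0.3, theta1=0.3)

Item parameter estimation error, the record's subject, would enter this sketch through misestimated (a, b) pairs shifting the accumulated log-likelihood ratio and thus the decision.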