%0 Journal Article %J Journal of Computerized Adaptive Testing %D 2020 %T Three Measures of Test Adaptation Based on Optimal Test Information %A G. Gage Kingsbury %A Steven L. Wise %B Journal of Computerized Adaptive Testing %V 8 %P 1-19 %G English %U http://iacat.org/jcat/index.php/jcat/article/view/80/37 %N 1 %R 10.7333/2002-0801001 %0 Journal Article %J Journal of Computerized Adaptive Testing %D 2019 %T Time-Efficient Adaptive Measurement of Change %A Matthew Finkelman %A Chun Wang %K adaptive measurement of change %K computerized adaptive testing %K Fisher information %K item selection %K response-time modeling %X

The adaptive measurement of change (AMC) refers to the use of computerized adaptive testing (CAT) at multiple occasions to efficiently assess a respondent’s improvement, decline, or sameness from occasion to occasion. Whereas previous AMC research focused on administering the most informative item to a respondent at each stage of testing, the current research proposes the use of Fisher information per time unit as an item selection procedure for AMC. The latter procedure incorporates not only the amount of information provided by a given item but also the expected amount of time required to complete it. In a simulation study, the use of Fisher information per time unit item selection resulted in a lower false positive rate in the majority of conditions studied, and a higher true positive rate in all conditions studied, compared to item selection via Fisher information without accounting for the expected time taken. Future directions of research are suggested.

%B Journal of Computerized Adaptive Testing %V 7 %P 15-34 %G English %U http://iacat.org/jcat/index.php/jcat/article/view/73/35 %N 2 %R 10.7333/1909-0702015 %0 Journal Article %J Journal of Educational Measurement %D 2018 %T A Top-Down Approach to Designing the Computerized Adaptive Multistage Test %A Luo, Xiao %A Kim, Doyoung %X The top-down approach to designing a multistage test is relatively understudied in the literature and underused in research and practice. This study introduced a route-based top-down design approach that directly sets design parameters at the test level and utilizes an advanced automated test assembly algorithm seeking global optimality. The design process in this approach consists of five sub-processes: (1) route mapping, (2) setting objectives, (3) setting constraints, (4) routing error control, and (5) test assembly. Results from a simulation study confirmed that the assembly, measurement, and routing results of the top-down design eclipsed those of the bottom-up design. Additionally, the top-down design approach provided unique insights into design decisions that could be used to refine the test. Despite these advantages, it is recommended that the top-down and bottom-up approaches be applied in a complementary manner in practice. %B Journal of Educational Measurement %V 55 %P 243-263 %U https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12174 %R 10.1111/jedm.12174 %0 Journal Article %J Journal of Computerized Adaptive Testing %D 2012 %T Termination Criteria in Computerized Adaptive Tests: Do Variable-Length CATs Provide Efficient and Effective Measurement? %A Babcock, B. %A Weiss, D. J. %B Journal of Computerized Adaptive Testing %V 1 %P 1-18 %G English %N 1 %R 10.7333/1212-0101001 %0 Conference Paper %B Annual Conference of the International Association for Computerized Adaptive Testing %D 2011 %T A Test Assembly Model for MST %A Angela Verschoor %A Ingrid Radtke %A Theo Eggen %K CAT %K mst %K multistage testing %K Rasch %K routing %K tif %X

This study is a short exploration of the optimization of an MST. It is extremely hard, and perhaps impossible, to chart the influence of the item pool and the test specifications on the optimization process. Simulations are very helpful in finding an acceptable MST.

%B Annual Conference of the International Association for Computerized Adaptive Testing %8 10/2011 %G eng %0 Book Section %B Elements of Adaptive Testing %D 2010 %T Testlet-Based Adaptive Mastery Testing %A Vos, H. J. %A Glas, C. A. W. %B Elements of Adaptive Testing %P 387-409 %G eng %& 20 %R 10.1007/978-0-387-85461-8 %0 Journal Article %J Papeles del Psicólogo %D 2010 %T Tests informatizados y otros nuevos tipos de tests [Computerized and other new types of tests] %A Olea, J. %A Abad, F. J. %A Barrada, J %X Recientemente se ha producido un considerable desarrollo de los tests adaptativos informatizados, en los que el test se adapta progresivamente al rendimiento del evaluando, y de otros tipos de tests: a) los test basados en modelos (se dispone de un modelo o teoría de cómo se responde a cada ítem, lo que permite predecir su dificultad), b) los tests ipsativos (el evaluado ha de elegir entre opciones que tienen parecida deseabilidad social, por lo que pueden resultar eficaces para controlar algunos sesgos de respuestas), c) los tests conductuales (miden rasgos que ordinariamente se han venido midiendo con autoinformes, mediante tareas que requieren respuestas no verbales) y d) los tests situacionales (en los que se presenta al evaluado una situación de conflicto laboral, por ejemplo, con varias posibles soluciones, y ha de elegir la que le parece la mejor descripción de lo que el haría en esa situación). El artículo comenta las características, ventajas e inconvenientes de todos ellos y muestra algunos ejemplos de tests concretos. Palabras clave: Test adaptativo informatizado, Test situacional, Test comportamental, Test ipsativo y generación automática de ítems.The paper provides a short description of some test types that are earning considerable interest in both research and applied areas. The main feature of a computerized adaptive test is that in despite of the examinees receiving different sets of items, their test scores are in the same metric and can be directly compared. Four other test types are considered: a) model-based tests (a model or theory is available to explain the item response process and this makes the prediction of item difficulties possible), b) ipsative tests (the examinee has to select one among two or more options with similar social desirability; so, these tests can help to control faking or other examinee’s response biases), c) behavioral tests (personality traits are measured from non-verbal responses rather than from self-reports), and d) situational tests (the examinee faces a conflictive situation and has to select the option that best describes what he or she will do). The paper evaluates these types of tests, comments on their pros and cons and provides some specific examples. Key words: Computerized adaptive test, Situational test, Behavioral test, Ipsative test and y automatic item generation. %B Papeles del Psicólogo %V 31 %P 94-107 %G eng %0 Book Section %B Elements of Adaptive Testing %D 2010 %T Three-Category Adaptive Classification Testing %A Theo Eggen %B Elements of Adaptive Testing %P 373-387 %G eng %& 19 %R 10.1007/978-0-387-85461-8 %0 Book Section %D 2009 %T Termination criteria in computerized adaptive tests: Variable-length CATs are not biased. %A Babcock, B. %A Weiss, D. J. %C D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. %0 Book Section %D 2009 %T Test overlap rate and item exposure rate as indicators of test security in CATs %A Barrada, J %A Olea, J. %A Ponsoda, V. %A Abad, F. J. %C D. J. 
Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. %G eng %0 Journal Article %J Psychometrika %D 2008 %T To Weight or Not to Weight? Balancing Influence of Initial Items in Adaptive Testing %A Chang, H.-H. %A Ying, Z. %X

It has been widely reported that in computerized adaptive testing some examinees may get much lower scores than they normally would if an alternative paper-and-pencil version were given. The main purpose of this investigation is to quantitatively reveal the cause of this underestimation phenomenon. The logistic models, including the 1PL, 2PL, and 3PL models, are used to demonstrate our assertions. Our analytical derivation shows that, under the maximum information item selection strategy, if an examinee fails a few items at the beginning of the test, easy but more discriminating items are likely to be administered. Such items are ineffective at moving the estimate close to the true theta unless the test is sufficiently long or a variable-length test is used. Our results also indicate that a certain weighting mechanism is necessary to make the algorithm rely less on the items administered at the beginning of the test.

%B Psychometrika %V 73 %P 441-450 %N 3 %R 10.1007/s11336-007-9047-7 %0 Journal Article %J Zeitschrift für Psychologie / Journal of Psychology %D 2008 %T Transitioning from fixed-length questionnaires to computer-adaptive versions %A Walter, O. B. %A Holling, H. %B Zeitschrift für Psychologie / Journal of Psychology %V 216(1) %P 22–28 %G eng %0 Journal Article %J Applied Psychological Measurement %D 2007 %T Test design optimization in CAT early stage with the nominal response model %A Passos, V. L. %A Berger, M. P. F. %A Tan, F. E. %K computerized adaptive testing %K nominal response model %K robust performance %K test design optimization %X The early stage of computerized adaptive testing (CAT) refers to the phase of the trait estimation during the administration of only a few items. This phase can be characterized by bias and instability of estimation. In this study, an item selection criterion is introduced in an attempt to lessen this instability: the D-optimality criterion. A polytomous unconstrained CAT simulation is carried out to evaluate this criterion's performance under different test premises. The simulation shows that the extent of early stage instability depends primarily on the quality of the item pool information and its size and secondarily on the item selection criteria. The efficiency of the D-optimality criterion is similar to the efficiency of other known item selection criteria. Yet, it often yields estimates that, at the beginning of CAT, display a more robust performance against instability. (PsycINFO Database Record (c) 2007 APA, all rights reserved) %B Applied Psychological Measurement %I Sage Publications: US %V 31 %P 213-232 %@ 0146-6216 (Print) %G eng %M 2007-06921-005 %0 Journal Article %J Applied Psychological Measurement %D 2007 %T Two-Phase Item Selection Procedure for Flexible Content Balancing in CAT %A Ying Cheng %A Chang, Hua-Hua %A Qing Yi %X

Content balancing is an important issue in the design and implementation of computerized adaptive testing (CAT). Content-balancing techniques that have been applied in fixed content balancing, where the number of items from each content area is fixed, include constrained CAT (CCAT), the modified multinomial model (MMM), modified constrained CAT (MCCAT), and others. In this article, four methods are proposed to address the flexible content-balancing issue with the a-stratification design, named STR_C. The four methods are MMM+, an extension of MMM; MCCAT+, an extension of MCCAT; the TPM method, a two-phase content-balancing method using MMM in both phases; and the TPF method, a two-phase content-balancing method using MMM in the first phase and MCCAT in the second. Simulation results show that all of the methods work well in content balancing, and TPF performs the best in item exposure control and item pool utilization while maintaining measurement precision.

%B Applied Psychological Measurement %V 31 %P 467-482 %U http://apm.sagepub.com/content/31/6/467.abstract %R 10.1177/0146621606292933 %0 Journal Article %J Applied Psychological. Measurement %D 2007 %T Two-phase item selection procedure for flexible content balancing in CAT %A Cheng, Y %A Chang, Hua-Hua %A Yi, Q. %B Applied Psychological. Measurement %V 3 %P 467–482 %G eng %0 Journal Article %J Anales de Psicología %D 2006 %T Técnicas para detectar patrones de respuesta atípicos [Aberrant patterns detection methods] %A Núñez, R. M. N. %A Pina, J. A. L. %K aberrant patterns detection %K Classical Test Theory %K generalizability theory %K Item Response %K Item Response Theory %K Mathematics %K methods %K person-fit %K Psychometrics %K psychometry %K Test Validity %K test validity analysis %K Theory %X La identificación de patrones de respuesta atípicos es de gran utilidad para la construcción de tests y de bancos de ítems con propiedades psicométricas así como para el análisis de validez de los mismos. En este trabajo de revisión se han recogido los más relevantes y novedosos métodos de ajuste de personas que se han elaborado dentro de cada uno de los principales ámbitos de trabajo de la Psicometría: el escalograma de Guttman, la Teoría Clásica de Tests (TCT), la Teoría de la Generalizabilidad (TG), la Teoría de Respuesta al Ítem (TRI), los Modelos de Respuesta al Ítem No Paramétricos (MRINP), los Modelos de Clase Latente de Orden Restringido (MCL-OR) y el Análisis de Estructura de Covarianzas (AEC).Aberrant patterns detection has a great usefulness in order to make tests and item banks with psychometric characteristics and validity analysis of tests and items. The most relevant and newest person-fit methods have been reviewed. All of them have been made in each one of main areas of Psychometry: Guttman's scalogram, Classical Test Theory (CTT), Generalizability Theory (GT), Item Response Theory (IRT), Non-parametric Response Models (NPRM), Order-Restricted Latent Class Models (OR-LCM) and Covariance Structure Analysis (CSA). %B Anales de Psicología %V 22 %P 143-154 %@ 0212-9728 %G Spanish %M 2006-07751-018 %0 Journal Article %J Applied Measurement in Education %D 2006 %T A testlet assembly design for the uniform CPA Examination %A Luecht, Richard %A Brumfield, Terry %A Breithaupt, Krista %B Applied Measurement in Education %V 19 %P 189-202 %U http://www.tandfonline.com/doi/abs/10.1207/s15324818ame1903_2 %R 10.1207/s15324818ame1903_2 %0 Journal Article %J Applied Psychological Measurement %D 2005 %T Test construction for cognitive diagnosis %A Henson, R. K. %A Douglas, J. %K (Measurement) %K Cognitive Assessment %K Item Analysis (Statistical) %K Profiles %K Test Construction %K Test Interpretation %K Test Items %X Although cognitive diagnostic models (CDMs) can be useful in the analysis and interpretation of existing tests, little has been developed to specify how one might construct a good test using aspects of the CDMs. This article discusses the derivation of a general CDM index based on Kullback-Leibler information that will serve as a measure of how informative an item is for the classification of examinees. The effectiveness of the index is examined for items calibrated using the deterministic input noisy "and" gate model (DINA) and the reparameterized unified model (RUM) by implementing a simple heuristic to construct a test from an item bank. When compared to randomly constructed tests from the same item bank, the heuristic shows significant improvement in classification rates. 
(PsycINFO Database Record (c) 2005 APA ) (journal abstract) %B Applied Psychological Measurement %V 29 %P 262-277 %G eng %0 Journal Article %J Alcoholism: Clinical & Experimental Research %D 2005 %T Toward efficient and comprehensive measurement of the alcohol problems continuum in college students: The Brief Young Adult Alcohol Consequences Questionnaire %A Kahler, C. W. %A Strong, D. R. %A Read, J. P. %A De Boeck, P. %A Wilson, M. %A Acton, G. S. %A Palfai, T. P. %A Wood, M. D. %A Mehta, P. D. %A Neale, M. C. %A Flay, B. R. %A Conklin, C. A. %A Clayton, R. R. %A Tiffany, S. T. %A Shiffman, S. %A Krueger, R. F. %A Nichol, P. E. %A Hicks, B. M. %A Markon, K. E. %A Patrick, C. J. %A Iacono, William G. %A McGue, Matt %A Langenbucher, J. W. %A Labouvie, E. %A Martin, C. S. %A Sanjuan, P. M. %A Bavly, L. %A Kirisci, L. %A Chung, T. %A Vanyukov, M. %A Dunn, M. %A Tarter, R. %A Handel, R. W. %A Ben-Porath, Y. S. %A Watt, M. %K Psychometrics %K Substance-Related Disorders %X Background: Although a number of measures of alcohol problems in college students have been studied, the psychometric development and validation of these scales have been limited, for the most part, to methods based on classical test theory. In this study, we conducted analyses based on item response theory to select a set of items for measuring the alcohol problem severity continuum in college students that balances comprehensiveness and efficiency and is free from significant gender bias., Method: We conducted Rasch model analyses of responses to the 48-item Young Adult Alcohol Consequences Questionnaire by 164 male and 176 female college students who drank on at least a weekly basis. An iterative process using item fit statistics, item severities, item discrimination parameters, model residuals, and analysis of differential item functioning by gender was used to pare the items down to those that best fit a Rasch model and that were most efficient in discriminating among levels of alcohol problems in the sample., Results: The process of iterative Rasch model analyses resulted in a final 24-item scale with the data fitting the unidimensional Rasch model very well. The scale showed excellent distributional properties, had items adequately matched to the severity of alcohol problems in the sample, covered a full range of problem severity, and appeared highly efficient in retaining all of the meaningful variance captured by the original set of 48 items., Conclusions: The use of Rasch model analyses to inform item selection produced a final scale that, in both its comprehensiveness and its efficiency, should be a useful tool for researchers studying alcohol problems in college students. To aid interpretation of raw scores, examples of the types of alcohol problems that are likely to be experienced across a range of selected scores are provided., (C)2005Research Society on AlcoholismAn important, sometimes controversial feature of all psychological phenomena is whether they are categorical or dimensional. A conceptual and psychometric framework is described for distinguishing whether the latent structure behind manifest categories (e.g., psychiatric diagnoses, attitude groups, or stages of development) is category-like or dimension-like. Being dimension-like requires (a) within-category heterogeneity and (b) between-category quantitative differences. Being category-like requires (a) within-category homogeneity and (b) between-category qualitative differences. 
The relation between this classification and abrupt versus smooth differences is discussed. Hybrid structures are possible. Being category-like is itself a matter of degree; the authors offer a formalized framework to determine this degree. Empirical applications to personality disorders, attitudes toward capital punishment, and stages of cognitive development illustrate the approach., (C) 2005 by the American Psychological AssociationThe authors conducted Rasch model ( G. Rasch, 1960) analyses of items from the Young Adult Alcohol Problems Screening Test (YAAPST; S. C. Hurlbut & K. J. Sher, 1992) to examine the relative severity and ordering of alcohol problems in 806 college students. Items appeared to measure a single dimension of alcohol problem severity, covering a broad range of the latent continuum. Items fit the Rasch model well, with less severe symptoms reliably preceding more severe symptoms in a potential progression toward increasing levels of problem severity. However, certain items did not index problem severity consistently across demographic subgroups. A shortened, alternative version of the YAAPST is proposed, and a norm table is provided that allows for a linking of total YAAPST scores to expected symptom expression., (C) 2004 by the American Psychological AssociationA didactic on latent growth curve modeling for ordinal outcomes is presented. The conceptual aspects of modeling growth with ordinal variables and the notion of threshold invariance are illustrated graphically using a hypothetical example. The ordinal growth model is described in terms of 3 nested models: (a) multivariate normality of the underlying continuous latent variables (yt) and its relationship with the observed ordinal response pattern (Yt), (b) threshold invariance over time, and (c) growth model for the continuous latent variable on a common scale. Algebraic implications of the model restrictions are derived, and practical aspects of fitting ordinal growth models are discussed with the help of an empirical example and Mx script ( M. C. Neale, S. M. Boker, G. Xie, & H. H. Maes, 1999). The necessary conditions for the identification of growth models with ordinal data and the methodological implications of the model of threshold invariance are discussed., (C) 2004 by the American Psychological AssociationRecent research points toward the viability of conceptualizing alcohol problems as arrayed along a continuum. Nevertheless, modern statistical techniques designed to scale multiple problems along a continuum (latent trait modeling; LTM) have rarely been applied to alcohol problems. This study applies LTM methods to data on 110 problems reported during in-person interviews of 1,348 middle-aged men (mean age = 43) from the general population. The results revealed a continuum of severity linking the 110 problems, ranging from heavy and abusive drinking, through tolerance and withdrawal, to serious complications of alcoholism. These results indicate that alcohol problems can be arrayed along a dimension of severity and emphasize the relevance of LTM to informing the conceptualization and assessment of alcohol problems., (C) 2004 by the American Psychological AssociationItem response theory (IRT) is supplanting classical test theory as the basis for measures development. This study demonstrated the utility of IRT for evaluating DSM-IV diagnostic criteria. 
Data on alcohol, cannabis, and cocaine symptoms from 372 adult clinical participants interviewed with the Composite International Diagnostic Interview-Expanded Substance Abuse Module (CIDI-SAM) were analyzed with Mplus ( B. Muthen & L. Muthen, 1998) and MULTILOG ( D. Thissen, 1991) software. Tolerance and legal problems criteria were dropped because of poor fit with a unidimensional model. Item response curves, test information curves, and testing of variously constrained models suggested that DSM-IV criteria in the CIDI-SAM discriminate between only impaired and less impaired cases and may not be useful to scale case severity. IRT can be used to study the construct validity of DSM-IV diagnoses and to identify diagnostic criteria with poor performance., (C) 2004 by the American Psychological AssociationThis study examined the psychometric characteristics of an index of substance use involvement using item response theory. The sample consisted of 292 men and 140 women who qualified for a Diagnostic and Statistical Manual of Mental Disorders (3rd ed., rev.; American Psychiatric Association, 1987) substance use disorder (SUD) diagnosis and 293 men and 445 women who did not qualify for a SUD diagnosis. The results indicated that men had a higher probability of endorsing substance use compared with women. The index significantly predicted health, psychiatric, and psychosocial disturbances as well as level of substance use behavior and severity of SUD after a 2-year follow-up. Finally, this index is a reliable and useful prognostic indicator of the risk for SUD and the medical and psychosocial sequelae of drug consumption., (C) 2002 by the American Psychological AssociationComparability, validity, and impact of loss of information of a computerized adaptive administration of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) were assessed in a sample of 140 Veterans Affairs hospital patients. The countdown method ( Butcher, Keller, & Bacon, 1985) was used to adaptively administer Scales L (Lie) and F (Frequency), the 10 clinical scales, and the 15 content scales. Participants completed the MMPI-2 twice, in 1 of 2 conditions: computerized conventional test-retest, or computerized conventional-computerized adaptive. Mean profiles and test-retest correlations across modalities were comparable. Correlations between MMPI-2 scales and criterion measures supported the validity of the countdown method, although some attenuation of validity was suggested for certain health-related items. Loss of information incurred with this mode of adaptive testing has minimal impact on test validity. Item and time savings were substantial., (C) 1999 by the American Psychological Association %B Alcoholism: Clinical & Experimental Research %V 29 %P 1180-1189 %G eng %0 Journal Article %J Applied Psychological Measurement %D 2005 %T Trait parameter recovery using multidimensional computerized adaptive testing in reading and mathematics %A Li, Y. H. %X Under a multidimensional item response theory (MIRT) computerized adaptive testing (CAT) testing scenario, a trait estimate (θ) in onedimension will provide clues for subsequentlyseeking a solution in other dimensions. Thisfeature may enhance the efficiency of MIRT CAT’s item selection and its scoring algorithms compared with its counterpart, the unidimensional CAT (UCAT). The present study used existing Reading and Math test data to generate simulated item parameters. 
A confirmatory item factor analysis model was applied to the data using NOHARM to produce interpretable MIRT item parameters. Results showed that MIRT CAT, conditional on the constraints, was quite capable of producing accurate estimates on both measures. Compared with UCAT, MIRT CAT slightly increased the accuracy of both trait estimates, especially for the low-level or high-level trait examinees in both measures, and reduced the rate of unused items in the item pool. Index terms: computerized adaptive testing (CAT), item response theory (IRT), dimensionality, 0-1 linear programming, constraints, item exposure, reading assessment, mathematics assessment. %B Applied Psychological Measurement %V 29 %P 3-25 %@ 0146-6216 %G eng %0 Journal Article %J Applied Psychological Measurement %D 2005 %T Trait Parameter Recovery Using Multidimensional Computerized Adaptive Testing in Reading and Mathematics %A Li, Yuan H. %A Schafer, William D. %X

Under a multidimensional item response theory (MIRT) computerized adaptive testing (CAT) testing scenario, a trait estimate (θ) in one dimension will provide clues for subsequently seeking a solution in other dimensions. This feature may enhance the efficiency of MIRT CAT’s item selection and its scoring algorithms compared with its counterpart, the unidimensional CAT (UCAT). The present study used existing Reading and Math test data to generate simulated item parameters. A confirmatory item factor analysis model was applied to the data using NOHARM to produce interpretable MIRT item parameters. Results showed that MIRT CAT, conditional on the constraints, was quite capable of producing accurate estimates on both measures. Compared with UCAT, MIRT CAT slightly increased the accuracy of both trait estimates, especially for the low-level or high-level trait examinees in both measures, and reduced the rate of unused items in the item pool.

%B Applied Psychological Measurement %V 29 %P 3-25 %U http://apm.sagepub.com/content/29/1/3.abstract %R 10.1177/0146621604270667 %0 Journal Article %J Journal of Applied Social Psychology %D 2004 %T Test difficulty and stereotype threat on the GRE General Test %A Stricker, L. J., %A Bejar, I. I. %B Journal of Applied Social Psychology %V 34(3) %P 563-597 %G eng %0 Journal Article %J Language Learning %D 2004 %T Testing vocabulary knowledge: Size, strength, and computer adaptiveness %A Laufer, B. %A Goldstein, Z. %X (from the journal abstract) In this article, we describe the development and trial of a bilingual computerized test of vocabulary size, the number of words the learner knows, and strength, a combination of four aspects of knowledge of meaning that are assumed to constitute a hierarchy of difficulty: passive recognition (easiest), active recognition, passive recall, and active recall (hardest). The participants were 435 learners of English as a second language. We investigated whether the above hierarchy was valid and which strength modality correlated best with classroom language performance. Results showed that the hypothesized hierarchy was present at all word frequency levels, that passive recall was the best predictor of classroom language performance, and that growth in vocabulary knowledge was different for the different strength modalities. (PsycINFO Database Record (c) 2004 APA, all rights reserved). %B Language Learning %V 54 %P 399-436 %8 Sep %G eng %0 Journal Article %J Annals of Internal Medicine %D 2003 %T Ten recommendations for advancing patient-centered outcomes measurement for older persons %A McHorney, C. A. %K *Health Status Indicators %K Aged %K Geriatric Assessment/*methods %K Humans %K Patient-Centered Care/*methods %K Research Support, U.S. Gov't, Non-P.H.S. %X The past 50 years have seen great progress in the measurement of patient-based outcomes for older populations. Most of the measures now used were created under the umbrella of a set of assumptions and procedures known as classical test theory. A recent alternative for health status assessment is item response theory. Item response theory is superior to classical test theory because it can eliminate test dependency and achieve more precise measurement through computerized adaptive testing. Computerized adaptive testing reduces test administration times and allows varied and precise estimates of ability. Several key challenges must be met before computerized adaptive testing becomes a productive reality. I discuss these challenges for the health assessment of older persons in the form of 10 "Ds": things we need to deliberate, debate, decide, and do. %B Annals of Internal Medicine %V 139 %P 403-409 %8 Sep 2 %G eng %M 12965966 %0 Conference Paper %B Paper presented at the Annual meeting of the National Council on Measurement in Education %D 2003 %T Test information targeting strategies for adaptive multistage testlet designs %A Luecht, RM %A Burgin, W. L. %B Paper presented at the Annual meeting of the National Council on Measurement in Education %C Chicago IL %G eng %0 Generic %D 2003 %T Tests adaptativos informatizados (Computerized adaptive testing) %A Olea, J. %A Ponsoda, V. %C Madrid: UNED Ediciones %G eng %0 Conference Paper %B Paper presented at the Annual meeting of the National Council on Measurement in Education %D 2003 %T Test-score comparability, ability estimation, and item-exposure control in computerized adaptive testing %A Chang, Hua-Hua %A Ying, Z. 
%B Paper presented at the Annual meeting of the National Council on Measurement in Education %C Chicago IL %G eng %0 Journal Article %J Zeitschrift für Differentielle und Diagnostische Psychologie %D 2003 %T Timing behavior in computerized adaptive testing: Response times for correct and incorrect answers are not related to general fluid intelligence/Zum Zeitverhalten beim computergestützten adaptiveb Testen: Antwortlatenzen bei richtigen und falschen Lösun %A Rammsayer, Thomas %A Brandler, Susanne %K Adaptive Testing %K Cognitive Ability %K Intelligence %K Perception %K Reaction Time computerized adaptive testing %X Examined the effects of general fluid intelligence on item response times for correct and false responses in computerized adaptive testing. After performing the CFT3 intelligence test, 80 individuals (aged 17-44 yrs) completed perceptual and cognitive discrimination tasks. Results show that response times were related neither to the proficiency dimension reflected by the task nor to the individual level of fluid intelligence. Furthermore, the false > correct-phenomenon as well as substantial positive correlations between item response times for false and correct responses were shown to be independent of intelligence levels. (PsycINFO Database Record (c) 2005 APA ) %B Zeitschrift für Differentielle und Diagnostische Psychologie %V 24 %P 57-63 %G eng %0 Conference Paper %B Paper presented at the Annual meeting of the National Council on Measurement in Education %D 2003 %T To stratify or not: An investigation of CAT item selection procedures under practical constraints %A Deng, H. %A Ansley, T. %B Paper presented at the Annual meeting of the National Council on Measurement in Education %C Chicago IL %G eng %0 Journal Article %J School Administrator %D 2002 %T Technology solutions for testing %A Olson, A. %X Northwest Evaluation Association in Portland, Oregon, consults with state and local educators on assessment issues. Describes several approaches in place at school districts that are using some combination of computer-based tests to measure student growth. The computerized adaptive test adjusts items based on a student's answer in "real time." On-demand testing provides almost instant scoring. (MLF) %B School Administrator %V 4 %P 20-23 %G eng %M EJ642970 %0 Book Section %D 2002 %T Test models for complex computer-based testing %A Luecht, RM %A Clauser, B. E. %C C. N. Mille,. M. T. Potenza, J. J. Fremer, and W. C. Ward (Eds.). Computer-based testing: Building the foundation for future assessments (pp. 67-88). Hillsdale NJ: Erlbaum. %G eng %0 Conference Paper %B Paper presented at the Annual Meeting of the National Council on Measurement in Education. %D 2002 %T A testlet assembly design for the uniform CPA examination %A Luecht, RM %A Brumfield, T. %A Breithaupt, K %B Paper presented at the Annual Meeting of the National Council on Measurement in Education. %C New Orleans %G eng %0 Conference Paper %B Paper presented at the annual meeting of the National Council on Measurement in Education %D 2002 %T To weight or not to weight – balancing influence of initial and later items in CAT %A Chang, Hua-Hua %A Ying, Z. %B Paper presented at the annual meeting of the National Council on Measurement in Education %C New Orleans LA %G eng %0 Journal Article %J Journal of educational computing research %D 2001 %T Test anxiety and test performance: Comparing paper-based and computer-adaptive versions of the Graduate Record Examinations (GRE) General test %A Powers, D. E. 
%B Journal of educational computing research %V 24 %P 249-273. %G eng %N 3 %0 Conference Paper %B Paper presented at the annual meeting of the American Psychological Association %D 2001 %T Testing a computerized adaptive personality inventory using simulated response data %A Simms, L. %B Paper presented at the annual meeting of the American Psychological Association %C San Francisco CA %G eng %0 Generic %D 2001 %T Testing via the Internet: A literature review and analysis of issues for Department of Defense Internet testing of the Armed Services Vocational Aptitude Battery (ASVAB) in high schools (FR-01-12) %A J. R. McBride %A Paddock, A. F. %A Wise, L. L. %A Strickland, W. J. %A B. K. Waters %C Alexandria VA: Human Resources Research Organization %G eng %0 Journal Article %J Nederlands Tijdschrift voor de Psychologie en haar Grensgebieden %D 2001 %T Toepassing van een computergestuurde adaptieve testprocedure op persoonlijkheidsdata [Application of a computerised adaptive test procedure on personality data] %A Hol, A. M. %A Vorst, H. C. M. %A Mellenbergh, G. J. %K Adaptive Testing %K Computer Applications %K Computer Assisted Testing %K Personality Measures %K Test Reliability computerized adaptive testing %X Studied the applicability of a computerized adaptive testing procedure to an existing personality questionnaire within the framework of item response theory. The procedure was applied to the scores of 1,143 male and female university students (mean age 21.8 yrs) in the Netherlands on the Neuroticism scale of the Amsterdam Biographical Questionnaire (G. J. Wilde, 1963). The graded response model (F. Samejima, 1969) was used. The quality of the adaptive test scores was measured based on their correlation with test scores for the entire item bank and on their correlation with scores on other scales from the personality test. The results indicate that computerized adaptive testing can be applied to personality scales. (PsycINFO Database Record (c) 2005 APA ) %B Nederlands Tijdschrift voor de Psychologie en haar Grensgebieden %V 56 %P 119-133 %G eng %0 Journal Article %J Journal of Educational and Behavioral Statistics %D 2000 %T Taylor approximations to logistic IRT models and their use in adaptive testing %A Veerkamp, W. J. J. %K computerized adaptive testing %X Taylor approximation can be used to generate a linear approximation to a logistic ICC and a linear ability estimator. For a specific situation it will be shown to result in a special case of a Robbins-Monro item selection procedure for adaptive testing. The linear estimator can be used for the situation of zero and perfect scores when maximum likelihood estimation fails to come up with a finite estimate. It is also possible to use this estimator to generate starting values for maximum likelihood and weighted likelihood estimation. Approximations to the expectation and variance of the linear estimator for a sequence of Robbins-Monro item selections can be determined analytically. %B Journal of Educational and Behavioral Statistics %V 25 %P 307-343 %G eng %M EJ620787 %0 Conference Paper %B Paper presented at the annual meeting of the National Council on Measurement in Educatio %D 2000 %T Test security and item exposure control for computer-based %A Kalohn, J. 
%B Paper presented at the annual meeting of the National Council on Measurement in Education %C Chicago %G eng %0 Conference Paper %B Paper presented at the National Council on Measurement in Education invited symposium: Maintaining test security in computerized programs–Implications for practice %D 2000 %T Test security and the development of computerized tests %A Guo, F. %A Way, W. D. %A Reshetar, R. %B Paper presented at the National Council on Measurement in Education invited symposium: Maintaining test security in computerized programs–Implications for practice %C New Orleans %G eng %0 Book Section %D 2000 %T Testlet response theory: An analog for the 3PL model useful in testlet-based adaptive testing %A Wainer, H., %A Bradlow, E. T. %A Du, Z. %C W. J. van der Linden and C. A. W. Glas (Eds.), Computerized Adaptive Testing: Theory and Practice (pp. 245-270). Norwell MA: Kluwer. %G eng %0 Book Section %D 2000 %T Testlet-based adaptive mastery testing %A Vos, H. J. %A Glas, C. A. W. %C W. J. van der Linden (Ed.), Computerized adaptive testing: Theory and practice (pp. 289-309). Norwell MA: Kluwer. %G eng %0 Generic %D 2000 %T Testlet-based Designs for Computer-Based Testing in a Certification and Licensure Setting %A Pitoniak, M. J. %C Jersey City, NJ: AICPA Technical Report %G eng %0 Generic %D 1999 %T Test anxiety and test performance: Comparing paper-based and computer-adaptive versions of the GRE General Test (Research Report 99-15) %A Powers, D. E. %C Princeton NJ: Educational Testing Service %G eng %0 Book Section %D 1999 %T Testing adaptatif et évaluation des processus cognitifs %A Laurier, M. %C C. Depover and B. Noël (Éds) : L’évaluation des compétences et des processus cognitifs - Modèles, pratiques et contextes. Bruxelles : De Boeck Université. %G eng %0 Generic %D 1999 %T Tests informatizados: Fundamentos y aplicaciones (Computerized testing: Fundamentals and applications) %A Olea, J. %A Ponsoda, V. %A Prieto, G., Eds. %C Madrid: Pirámide. %G eng %0 Conference Paper %B Paper presented at the annual meeting of the National Council on Measurement in Education %D 1999 %T Test-taking strategies %A Steffen, M. %B Paper presented at the annual meeting of the National Council on Measurement in Education %C Montreal, Canada %G eng %0 Conference Paper %B Paper presented at the annual meeting of the National Council on Measurement in Education %D 1999 %T Test-taking strategies in computerized adaptive testing %A Steffen, M. %A Way, W. D. %B Paper presented at the annual meeting of the National Council on Measurement in Education %C Montreal, Canada %G eng %0 Journal Article %J Educational Assessment %D 1999 %T Threats to score comparability with applications to performance assessments and computerized adaptive tests %A Kolen, M. J. %X Develops a conceptual framework that addresses score comparability for performance assessments, adaptive tests, paper-and-pencil tests, and alternate item pools for computerized tests. Outlines testing situation aspects that might threaten score comparability and describes procedures for evaluating the degree of score comparability. Suggests ways to minimize threats to comparability. (SLD) %B Educational Assessment %V 6 %P 73-96 %G eng %M EJ604330 %0 Journal Article %J Educational Assessment %D 1999 %T Threats to score comparability with applications to performance assessments and computerized adaptive tests %A Kolen, M. J.
%B Educational Assessment %V 6 %P 73-96 %G eng %0 Conference Paper %B Paper presented at the annual meeting of the National Council on Measurement in Education %D 1998 %T Test development exposure control for adaptive testing %A Parshall, C. G. %A Davey, T. %A Nering, M. L. %B Paper presented at the annual meeting of the National Council on Measurement in Education %C San Diego, CA %G eng %0 Journal Article %J Intelligence %D 1998 %T Testing word knowledge by telephone to estimate general cognitive aptitude using an adaptive test %A Legree, P. J. %A Fischl, M. A. %A Gade, P. A. %A Wilson, M. %B Intelligence %V 26 %P 91-98 %G eng %0 Generic %D 1998 %T Three response types for broadening the conception of mathematical problem solving in computerized-adaptive tests (Research Report 98-45) %A Bennett, R. E. %A Morley, M. %A Quardt, D. %C Princeton NJ: Educational Testing Service %G eng %0 Book Section %D 1997 %T Technical perspective %A J. R. McBride %C W. A. Sands, B. K. Waters, and J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 29-44). Washington, DC: American Psychological Association. %G eng %0 Book Section %B Psicometría %D 1996 %T Test adaptativos informatizados [Computerized adaptive testing] %A Olea, J. %A Ponsoda, V. %B Psicometría %I Universitas %C Madrid, UNED %G eng %0 Conference Paper %B Paper presented at the annual meeting of the Psychometric Society %D 1996 %T A Type I error rate study of a modified SIBTEST DIF procedure with potential application to computerized adaptive tests %A Roussos, L. %B Paper presented at the annual meeting of the Psychometric Society %C Alberta Canada %G eng %0 Conference Paper %B Paper presented at the Fourth Symposium de Metodología de las Ciencias del Comportamiento %D 1995 %T Tests adaptativos y autoadaptados informatizados: Efectos en la ansiedad y en la precisión de las estimaciones [SATs and CATs: Effects on anxiety and estimate precision] %A Olea, J. %A Ponsoda, V. %A Wise, S. L. %B Paper presented at the Fourth Symposium de Metodología de las Ciencias del Comportamiento %C Murcia, Spain %G Spanish %0 Journal Article %J Applied Psychological Measurement %D 1995 %T Theoretical results and item selection from multidimensional item bank in the Mokken IRT model for polytomous items %A Hemker, B. T. %A Sijtsma, K. %A Molenaar, I. W. %B Applied Psychological Measurement %V 19 %P 337–352 %G eng %0 Generic %D 1994 %T Three practical issues for modern adaptive testing item pools (Research Report 94-5) %A Stocking, M. L. %C Princeton NJ: Educational Testing Service %G eng %0 Conference Paper %B Paper presented at the annual meeting of the National Council on Measurement in Education %D 1993 %T Test targeting and precision before and after review on computer-adaptive tests %A Lunz, M. E. %A Stahl, J. A. %A Bergstrom, Betty A. %B Paper presented at the annual meeting of the National Council on Measurement in Education %C Atlanta GA %G eng %0 Conference Paper %B Richmond IN: Indiana University. (ERIC Document Reproduction Service No. ED 334910 and/or TM018223). Paper presented at the annual meeting of the American Educational Research Association %D 1992 %T Test anxiety and test performance under computerized adaptive testing methods %A Powell, Z. E. %B Richmond IN: Indiana University. (ERIC Document Reproduction Service No. ED 334910 and/or TM018223).
Paper presented at the annual meeting of the American Educational Research Association %C San Francisco CA %G eng %0 Journal Article %J Dissertation Abstracts International %D 1992 %T Test anxiety and test performance under computerized adaptive testing methods %A Powell, Zen-Hsiu E. %K computerized adaptive testing %B Dissertation Abstracts International %V 52 %P 2518 %G eng %0 Book Section %D 1990 %T Testing algorithms %A Wainer, H., %A Mislevy, R. J. %C H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 103-135). Hillsdale NJ: Erlbaum. %G eng %0 Book Section %D 1990 %T Testing algorithms %A Thissen, D. %A Mislevy, R. J. %C H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 103-135). Hillsdale NJ: Erlbaum. %G eng %0 Conference Proceedings %B annual meeting of the National Council on Measurement in Education %D 1990 %T Test-retest consistency of computer adaptive tests. %A Lunz, M. E. %A Bergstrom, Betty A. %A Gershon, R. C. %B annual meeting of the National Council on Measurement in Education %C Boston, MA USA %8 04/1990 %G eng %0 Journal Article %J Journal of Educational Measurement %D 1990 %T Toward a psychometrics for testlets %A Wainer, H., %A Lewis, C. %B Journal of Educational Measurement %V 27 %P 1-14 %G eng %0 Journal Article %J Journal of Personality Assessment %D 1989 %T Tailored interviewing: An application of item response theory for personality measurement %A Kamakura, W. A., %A Balasubramanian, S. K. %B Journal of Personality Assessment %V 53 %P 502-519 %G eng %0 Journal Article %J . Educational Measurement: Issues and Practice %D 1989 %T Testing software review: MicroCAT Version 3 %A Stone, C. A. %B . Educational Measurement: Issues and Practice %V 8 (3) %P 33-38 %G eng %0 Journal Article %J Journal of Educational Measurement %D 1989 %T Trace lines for testlets: A use of multiple-categorical-response models %A Thissen, D. %A Steinberg, L. %A Mooney, J.A. %B Journal of Educational Measurement %V 26 %P 247-260 %G eng %0 Journal Article %J Applied Psychology: An International Review %D 1987 %T Two simulated feasibility studies in computerized adaptive testing %A Stocking, M. L. %B Applied Psychology: An International Review %V 36 %P 263-277 %G eng %0 Journal Article %J Journal of Educational Measurement %D 1984 %T Technical guidelines for assessing computerized adaptive tests %A Green, B. F. %A Bock, R. D. %A Humphreys, L. G. %A Linn, R. L. %A Reckase, M. D. %K computerized adaptive testing %K Mode effects %K paper-and-pencil %B Journal of Educational Measurement %V 21 %P 347-360 %@ 1745-3984 %G eng %0 Generic %D 1984 %T Two simulated feasibility studies in computerized adaptive testing (RR-84-15) %A Stocking, M. L. %C Princeton NJ: Educational Testing Service %G eng %0 Generic %D 1983 %T Tailored testing, its theory and practice. Part I: The basic model, the normal ogive submodels, and the tailored testing algorithm (NPRDC TR-83-00) %A Urry, V. W. %A Dorans, N. J. %C San Diego CA: Navy Personnel Research and Development Center %G eng %0 Generic %D 1981 %T Tailored testing, its theory and practice. Part II: Ability and item parameter estimation, multiple ability application, and allied procedures (NPRDC TR-81) %A Urry, V. W. %C San Diego CA: Navy Personnel Research and Development Center %G eng %0 Journal Article %J Educational and Psychological Measurement %D 1977 %T TAILOR: A FORTRAN procedure for interactive tailored testing %A Cudeck, R. A. %A Cliff, N. A. %A Kehoe, J. 
%B Educational and Psychological Measurement %V 37 %P 767-769 %G eng %0 Journal Article %J Educational and Psychological Measurement %D 1977 %T TAILOR-APL: An interactive computer program for individual tailored testing %A McCormick, D. %A Cliff, N. A. %B Educational and Psychological Measurement %V 37 %P 771-774 %G eng %0 Generic %D 1977 %T Tailored testing: A spectacular success for latent trait theory (TS 77-2) %A Urry, V. W. %C Washington DC: U. S. Civil Service Commission, Personnel Research and Development Center %G eng %0 Journal Article %J Journal of Educational Measurement %D 1977 %T Tailored testing: A successful application of latent trait theory %A Urry, V. W. %B Journal of Educational Measurement %V 14 %P 181-196 %G eng %0 Journal Article %J Psychometrika %D 1977 %T A theory of consistency ordering generalizable to tailored testing %A Cliff, N. A. %B Psychometrika %P 375-399 %G eng %0 Generic %D 1977 %T A two-stage testing procedure (Memorandum 403-77) %A de Gruijter, D. N. M. %C University of Leyden, The Netherlands, Educational Research Center %G eng %0 Generic %D 1976 %T Test theory and the public interest %A Lord, F. M., %C Proceedings of the Educational Testing Service Invitational Conference %G eng %0 Conference Paper %B Paper presented at the 86th Annual Convention of the American Psychological Association. Toronto %D 1975 %T Tailored testing: Maximizing validity and utility for job selection %A Croll, P. R. %A Urry, V. W. %B Paper presented at the 86th Annual Convention of the American Psychological Association. Toronto %C Canada %G eng %0 Journal Article %J Journal of Computer-Based Instruction %D 1974 %T A tailored testing model employing the beta distribution and conditional difficulties %A Kalisch, S. J. %B Journal of Computer-Based Instruction %V 1 %P 22-28 %G eng %0 Generic %D 1974 %T A tailored testing model employing the beta distribution (unpublished manuscript) %A Kalisch, S. J. %C Florida State University, Educational Evaluation and Research Design Program %G eng %0 Conference Paper %B Paper presented at the 18th International Congress of Applied Psychology %D 1974 %T A tailored testing system for selection and allocation in the British Army %A Killcross, M. C. %B Paper presented at the 18th International Congress of Applied Psychology %C Montreal Canada %G eng %0 Journal Article %J Review of Educational Research %D 1974 %T Testing and decision-making procedures for selected individualized instruction programs %A Hambleton, R. K. %B Review of Educational Research %V 10 %P 371-400 %G eng %0 Journal Article %J Journal of Computer-Based Instruction %D 1973 %T A tailored testing model employing the beta distribution and conditional difficulties %A Kalisch, S. J. %B Journal of Computer-Based Instruction %V 1 %P 111-120 %G eng %0 Generic %D 1971 %T Tailored testing: An application of stochastic approximation (RM 71-2) %A Lord, F. M., %C Princeton NJ: Educational Testing Service %G eng %0 Journal Article %J Journal of the American Statistical Association %D 1971 %T Tailored testing, an approximation of stochastic approximation %A Lord, F. M., %B Journal of the American Statistical Association %V 66 %P 707-711 %G eng %0 Journal Article %J Educational and Psychological Measurement %D 1971 %T A theoretical study of the measurement effectiveness of flexilevel tests %A Lord, F. M., %B Educational and Psychological Measurement %V 31 %P 805-813 %G eng %0 Journal Article %J Psychometrika %D 1971 %T A theoretical study of two-stage testing %A Lord, F. 
M., %B Psychometrika %V 36 %P 227-242 %G eng