TY - CHAP T1 - Multistage Testing: Issues, Designs, and Research T2 - Elements of Adaptive Testing Y1 - 2010 A1 - Zenisky, A. L. A1 - Hambleton, R. K. A1 - Luecht, RM JF - Elements of Adaptive Testing ER - TY - JOUR T1 - Measuring global physical health in children with cerebral palsy: Illustration of a multidimensional bi-factor model and computerized adaptive testing JF - Quality of Life Research Y1 - 2009 A1 - Haley, S. M. A1 - Ni, P. A1 - Dumas, H. M. A1 - Fragala-Pinkham, M. A. A1 - Hambleton, R. K. A1 - Montpetit, K. A1 - Bilodeau, N. A1 - Gorton, G. E. A1 - Watson, K. A1 - Tucker, C. A. KW - *Computer Simulation KW - *Health Status KW - *Models, Statistical KW - Adaptation, Psychological KW - Adolescent KW - Cerebral Palsy/*physiopathology KW - Child KW - Child, Preschool KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Male KW - Massachusetts KW - Pennsylvania KW - Questionnaires KW - Young Adult AB - PURPOSE: The purposes of this study were to apply a bi-factor model for the determination of test dimensionality and a multidimensional CAT using computer simulations of real data for the assessment of a new global physical health measure for children with cerebral palsy (CP). METHODS: Parent respondents of 306 children with cerebral palsy were recruited from four pediatric rehabilitation hospitals and outpatient clinics. We compared confirmatory factor analysis results across four models: (1) one-factor unidimensional; (2) two-factor multidimensional (MIRT); (3) bi-factor MIRT with fixed slopes; and (4) bi-factor MIRT with varied slopes. We tested whether the general and content (fatigue and pain) person score estimates could discriminate across severity and types of CP, and whether score estimates from a simulated CAT were similar to estimates based on the total item bank, and whether they correlated as expected with external measures. 
RESULTS: Confirmatory factor analysis suggested separate pain and fatigue sub-factors; all 37 items were retained in the analyses. From the bi-factor MIRT model with fixed slopes, the full item bank scores discriminated across levels of severity and types of CP, and compared favorably to external instruments. CAT scores based on 10- and 15-item versions accurately captured the global physical health scores. CONCLUSIONS: The bi-factor MIRT CAT application, especially the 10- and 15-item versions, yielded accurate global physical health scores that discriminated across known severity groups and types of CP, and correlated as expected with concurrent measures. The CATs have potential for collecting complex data on the physical health of children with CP in an efficient manner. VL - 18 SN - 0962-9343 (Print); 0962-9343 (Linking) N1 - Haley, Stephen M; Ni, Pengsheng; Dumas, Helene M; Fragala-Pinkham, Maria A; Hambleton, Ronald K; Montpetit, Kathleen; Bilodeau, Nathalie; Gorton, George E; Watson, Kyle; Tucker, Carole A. K02 HD045354-01A1/HD/NICHD NIH HHS/United States; K02 HD45354-01A1/HD/NICHD NIH HHS/United States. Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't. Netherlands. Quality of life research : an international journal of quality of life aspects of treatment, care and rehabilitation. Qual Life Res. 2009 Apr;18(3):359-70. Epub 2009 Feb 17. U2 - 2692519 ER - TY - JOUR T1 - Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS) JF - Medical Care Y1 - 2007 A1 - Reeve, B. B. A1 - Hays, R. D. A1 - Bjorner, J. B. A1 - Cook, K. F. A1 - Crane, P. K. A1 - Teresi, J. A. A1 - Thissen, D. A1 - Revicki, D. A. A1 - Weiss, D. J. A1 - Hambleton, R. K. A1 - Liu, H. A1 - Gershon, R. C. A1 - Reise, S. P. A1 - Lai, J. S. A1 - Cella, D. 
KW - *Health Status KW - *Information Systems KW - *Quality of Life KW - *Self Disclosure KW - Adolescent KW - Adult KW - Aged KW - Calibration KW - Databases as Topic KW - Evaluation Studies as Topic KW - Female KW - Humans KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Psychometrics KW - Questionnaires/standards KW - United States AB - BACKGROUND: The construction and evaluation of item banks to measure unidimensional constructs of health-related quality of life (HRQOL) is a fundamental objective of the Patient-Reported Outcomes Measurement Information System (PROMIS) project. OBJECTIVES: Item banks will be used as the foundation for developing short-form instruments and enabling computerized adaptive testing. The PROMIS Steering Committee selected 5 HRQOL domains for initial focus: physical functioning, fatigue, pain, emotional distress, and social role participation. This report provides an overview of the methods used in the PROMIS item analyses and proposed calibration of item banks. ANALYSES: Analyses include evaluation of data quality (eg, logic and range checking, spread of response distribution within an item), descriptive statistics (eg, frequencies, means), item response theory model assumptions (unidimensionality, local independence, monotonicity), model fit, differential item functioning, and item calibration for banking. RECOMMENDATIONS: Summarized are key analytic issues; recommendations are provided for future evaluations of item banks in HRQOL assessment. VL - 45 SN - 0025-7079 (Print) N1 - Reeve, Bryce B; Hays, Ron D; Bjorner, Jakob B; Cook, Karon F; Crane, Paul K; Teresi, Jeanne A; Thissen, David; Revicki, Dennis A; Weiss, David J; Hambleton, Ronald K; Liu, Honghu; Gershon, Richard; Reise, Steven P; Lai, Jin-shei; Cella, David; PROMIS Cooperative Group. AG015815/AG/United States NIA. Research Support, N.I.H., Extramural. United States. Medical care. Med Care. 2007 May;45(5 Suppl 1):S22-31. 
ER - TY - JOUR T1 - Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank JF - Journal of Clinical Epidemiology Y1 - 2006 A1 - Haley, S. M. A1 - Ni, P. A1 - Hambleton, R. K. A1 - Slavin, M. D. A1 - Jette, A. M. KW - *Recovery of Function KW - Activities of Daily Living KW - Adolescent KW - Adult KW - Aged KW - Aged, 80 and over KW - Confidence Intervals KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Rehabilitation/*standards KW - Reproducibility of Results KW - Software AB - BACKGROUND AND OBJECTIVE: Measuring physical functioning (PF) within and across postacute settings is critical for monitoring outcomes of rehabilitation; however, most current instruments lack sufficient breadth and feasibility for widespread use. Computer adaptive testing (CAT), in which item selection is tailored to the individual patient, holds promise for reducing response burden, yet maintaining measurement precision. We calibrated a PF item bank via item response theory (IRT), administered items with a post hoc CAT design, and determined whether CAT would improve accuracy and precision of score estimates over random item selection. METHODS: 1,041 adults were interviewed during postacute care rehabilitation episodes in either hospital or community settings. Responses for 124 PF items were calibrated using IRT methods to create a PF item bank. We examined the accuracy and precision of CAT-based scores compared to a random selection of items. RESULTS: CAT-based scores had higher correlations with the IRT-criterion scores, especially with short tests, and resulted in narrower confidence intervals than scores based on a random selection of items; gains, as expected, were especially large for low and high performing adults. 
CONCLUSION: The CAT design may have important precision and efficiency advantages for point-of-care functional assessment in rehabilitation practice settings. VL - 59 SN - 0895-4356 (Print) N1 - Haley, Stephen M; Ni, Pengsheng; Hambleton, Ronald K; Slavin, Mary D; Jette, Alan M. K02 hd45354-01/hd/nichd; R01 hd043568/hd/nichd. Comparative Study; Research Support, N.I.H., Extramural; Research Support, U.S. Gov't, Non-P.H.S. England. Journal of clinical epidemiology. J Clin Epidemiol. 2006 Nov;59(11):1174-82. Epub 2006 Jul 11. ER - TY - JOUR T1 - Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank JF - Journal of Clinical Epidemiology Y1 - 2006 A1 - Haley, S. A1 - Ni, P. A1 - Hambleton, R. K. A1 - Slavin, M. A1 - Jette, A. VL - 59 SN - 08954356 ER - TY - JOUR T1 - Optimal and nonoptimal computer-based test designs for making pass-fail decisions JF - Applied Measurement in Education Y1 - 2006 A1 - Hambleton, R. K. A1 - Xing, D. KW - adaptive test KW - credentialing exams KW - Decision Making KW - Educational Measurement KW - multistage tests KW - optimal computer-based test designs KW - test form AB - Now that many credentialing exams are being routinely administered by computer, new computer-based test designs, along with item response theory models, are being aggressively researched to identify specific designs that can increase the decision consistency and accuracy of pass-fail decisions. The purpose of this study was to investigate the impact of optimal and nonoptimal multistage test (MST) designs, linear parallel-form test designs (LPFT), and computer adaptive test (CAT) designs on the decision consistency and accuracy of pass-fail decisions. Realistic testing situations matching those of one of the large credentialing agencies were simulated to increase the generalizability of the findings. 
The conclusions were clear: (a) With the LPFTs, matching test information functions (TIFs) to the mean of the proficiency distribution produced slightly better results than matching them to the passing score; (b) all of the test designs worked better than test construction using random selection of items, subject to content constraints only; (c) CAT performed better than the other test designs; and (d) if matching a TIF to the passing score, the MST design produced a bit better results than the LPFT design. If an argument for the MST design is to be made, it can be made on the basis of slight improvements over the LPFT design and better expected item bank utilization, candidate preference, and the potential for improved diagnostic feedback, compared with the feedback that is possible with fixed linear test forms. (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Lawrence Erlbaum: US VL - 19 SN - 0895-7347 (Print); 1532-4818 (Electronic) ER - TY - CHAP T1 - Applications of item response theory to improve health outcomes assessment: Developing item banks, linking instruments, and computer-adaptive testing T2 - Outcomes assessment in cancer Y1 - 2005 A1 - Hambleton, R. K. ED - C. C. Gotay ED - C. Snyder KW - Computer Assisted Testing KW - Health KW - Item Response Theory KW - Measurement KW - Test Construction KW - Treatment Outcomes AB - (From the chapter) The current chapter builds on Reise's introduction to the basic concepts, assumptions, popular models, and important features of IRT and discusses the applications of item response theory (IRT) modeling to health outcomes assessment. In particular, we highlight the critical role of IRT modeling in: developing an instrument to match a study's population; linking two or more instruments measuring similar constructs on a common metric; and creating item banks that provide the foundation for tailored short-form instruments or for computerized adaptive assessments. 
(PsycINFO Database Record (c) 2005 APA) JF - Outcomes assessment in cancer PB - Cambridge University Press CY - Cambridge, UK N1 - Outcomes assessment in cancer: Measures, methods, and applications. (pp. 445-464). New York, NY : Cambridge University Press. xiv, 662 pp ER - TY - ABST T1 - Computer-based test designs with optimal and non-optimal tests for making pass-fail decisions Y1 - 2004 A1 - Hambleton, R. K. A1 - Xing, D. CY - Research Report, University of Massachusetts, Amherst, MA ER - TY - CONF T1 - Detecting exposed test items in computer-based testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2004 A1 - Han, N. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Diego CA N1 - {PDF file, 1.245 MB} ER - TY - CONF T1 - Investigating the effects of selected multi-stage test design alternatives on credentialing outcomes T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2004 A1 - Zenisky, A. L. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Diego CA N1 - {PDF file, 129 KB} ER - TY - JOUR T1 - Statistics for detecting disclosed items in a CAT environment JF - Metodologia de Las Ciencias del Comportamiento. Y1 - 2004 A1 - Lu, Y. A1 - Hambleton, R. K. VL - 5 IS - 2 ER - TY - JOUR T1 - Small sample estimation in dichotomous item response models: Effect of priors based on judgmental information on the accuracy of item parameter estimates JF - Applied Psychological Measurement Y1 - 2003 A1 - Swaminathan, H. A1 - Hambleton, R. K. A1 - Sireci, S. G. A1 - Xing, D. A1 - Rizavi, S. M. AB - Large item banks with properly calibrated test items are essential for ensuring the validity of computer-based tests. 
At the same time, item calibrations with small samples are desirable to minimize the amount of pretesting and limit item exposure. Bayesian estimation procedures show considerable promise with small examinee samples. The purposes of the study were (a) to examine how prior information for Bayesian item parameter estimation can be specified and (b) to investigate the relationship between sample size and the specification of prior information on the accuracy of item parameter estimates. The results of the simulation study were clear: Estimation of item response theory (IRT) model item parameters can be improved considerably. Improvements in the one-parameter model were modest; considerable improvements with the two- and three-parameter models were observed. Both the study of different forms of priors and ways to improve the judgmental data used in forming the priors appear to be promising directions for future research. VL - 27 N1 - Sage Publications, US ER - TY - CONF T1 - Comparison of the psychometric properties of several computer-based test designs for credentialing exams T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Jodoin, M. A1 - Zenisky, A. L. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA N1 - {PDF file, 261 KB} ER - TY - CONF T1 - Impact of item quality and item bank size on the psychometric quality of computer-based credentialing exams T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Hambleton, R. K. 
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA ER - TY - CONF T1 - Impact of selected factors on the psychometric quality of credentialing examinations administered with a sequential testlet design T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Hambleton, R. K. A1 - Jodoin, M. A1 - Zenisky, A. L. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA ER - TY - CONF T1 - Impact of test design, item quality and item bank size on the psychometric properties of computer-based credentialing exams T2 - Paper presented at the meeting of National Council on Measurement in Education Y1 - 2002 A1 - Xing, D. A1 - Hambleton, R. K. JF - Paper presented at the meeting of National Council on Measurement in Education CY - New Orleans N1 - PDF file, 500 K ER - TY - ABST T1 - Impact of several computer-based testing variables on the psychometric properties of credentialing examinations (Laboratory of Psychometric and Evaluative Research Report No 393) Y1 - 2001 A1 - Xing, D. A1 - Hambleton, R. K. CY - Amherst, MA: University of Massachusetts, School of Education. ER - TY - CONF T1 - Impact of several computer-based testing variables on the psychometric properties of credentialing examinations T2 - Paper presented at the Annual Meeting of the National Council on Measurement in Education Y1 - 2001 A1 - Xing, D. A1 - Hambleton, R. K. JF - Paper presented at the Annual Meeting of the National Council on Measurement in Education CY - Seattle WA ER - TY - JOUR T1 - Emergence of item response modeling in instrument development and data analysis JF - Medical Care Y1 - 2000 A1 - Hambleton, R. K. 
KW - Computer Assisted Testing KW - Health KW - Item Response Theory KW - Measurement KW - Statistical Validity KW - computerized adaptive testing KW - Test Construction KW - Treatment Outcomes VL - 38 ER - TY - CONF T1 - A comparative study of ability estimates from computer-adaptive testing and multi-stage testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Patsula, L. N. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal Canada ER - TY - CHAP T1 - Computerized adaptive testing: Theory, applications, and standards Y1 - 1991 A1 - Hambleton, R. K. A1 - Zaal, J. N. A1 - Pieters, J. P. M. CY - R. K. Hambleton and J. N. Zaal (Eds.), Advances in educational and psychological testing: Theory and Applications (pp. 341-366). Boston: Kluwer. ER - TY - CHAP T1 - Adaptive Testing Applied to Hierarchically Structured Objectives-Based Programs Y1 - 1977 A1 - Hambleton, R. K. A1 - Eignor, D. R. CY - D. J. Weiss (Ed.), Proceedings of the 1977 Computerized Adaptive Testing Conference. Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program ER - TY - JOUR T1 - A computer simulation study of tailored testing strategies for objective-based instructional programs JF - Educational and Psychological Measurement Y1 - 1977 A1 - Spineti, J. P. A1 - Hambleton, R. K. AB - One possible way of reducing the amount of time spent testing in objective-based instructional programs would involve the implementation of a tailored testing strategy. Our purpose was to provide some additional data on the effectiveness of various tailored testing strategies for different testing situations. The three factors of a tailored testing strategy under study with various hypothetical distributions of abilities across two learning hierarchies were test length, mastery cutting score, and starting point. 
Overall, our simulation results indicate that it is possible to obtain a reduction of more than 50% in testing time without any loss in decision-making accuracy, when compared to a conventional testing procedure, by implementing a tailored testing strategy. In addition, our study of starting points revealed that it was generally best to begin testing in the middle of the learning hierarchy. Finally, we observed a 40% reduction in errors of classification as the number of items for testing each objective was increased from one to five. VL - 37 ER - TY - JOUR T1 - Testing and decision-making procedures for selected individualized instruction programs JF - Review of Educational Research Y1 - 1974 A1 - Hambleton, R. K. VL - 10 ER - TY - ABST T1 - A review of testing and decision-making procedures (Technical Bulletin No. 15) Y1 - 1973 A1 - Hambleton, R. K. CY - Iowa City IA: American College Testing Program. ER -