TY - JOUR T1 - The use of PROMIS and assessment center to deliver patient-reported outcome measures in clinical research JF - Journal of Applied Measurement Y1 - 2010 A1 - Gershon, R. C. A1 - Rothrock, N. A1 - Hanrahan, R. A1 - Bass, M. A1 - Cella, D. AB - The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed as one of the first projects funded by the NIH Roadmap for Medical Research Initiative to re-engineer the clinical research enterprise. The primary goal of PROMIS is to build item banks and short forms that measure key health outcome domains that are manifested in a variety of chronic diseases which could be used as a "common currency" across research projects. To date, item banks, short forms and computerized adaptive tests (CAT) have been developed for 13 domains with relevance to pediatric and adult subjects. To enable easy delivery of these new instruments, PROMIS built a web-based resource (Assessment Center) for administering CATs and other self-report data, tracking item and instrument development, monitoring accrual, managing data, and storing statistical analysis results. Assessment Center can also be used to deliver custom researcher developed content, and has numerous features that support both simple and complicated accrual designs (branching, multiple arms, multiple time points, etc.). This paper provides an overview of the development of the PROMIS item banks and details Assessment Center functionality. VL - 11 SN - 1529-7713 ER - TY - JOUR T1 - The future of outcomes measurement: item banking, tailored short-forms, and computerized adaptive assessment JF - Quality of Life Research Y1 - 2007 A1 - Cella, D. A1 - Gershon, R. C. A1 - Lai, J-S. A1 - Choi, S. W. AB - The use of item banks and computerized adaptive testing (CAT) begins with clear definitions of important outcomes, and references those definitions to specific questions gathered into large and well-studied pools, or “banks” of items. 
Items can be selected from the bank to form customized short scales, or can be administered in a sequence and length determined by a computer programmed for precision and clinical relevance. Although far from perfect, such item banks can form a common definition and understanding of human symptoms and functional problems such as fatigue, pain, depression, mobility, social function, sensory function, and many other health concepts that we can only measure by asking people directly. The support of the National Institutes of Health (NIH), as witnessed by its cooperative agreement with measurement experts through the NIH Roadmap Initiative known as PROMIS (www.nihpromis.org), is a big step in that direction. Our approach to item banking and CAT is practical; as focused on application as it is on science or theory. From a practical perspective, we frequently must decide whether to re-write and retest an item, add more items to fill gaps (often at the ceiling of the measure), re-test a bank after some modifications, or split up a bank into units that are more unidimensional, yet less clinically relevant or complete. These decisions are not easy, and yet they are rarely unforgiving. We encourage people to build practical tools that are capable of producing multiple short form measures and CAT administrations from common banks, and to further our understanding of these banks with various clinical populations and ages, so that with time the scores that emerge from these many activities begin to have not only a common metric and range, but a shared meaning and understanding across users. In this paper, we provide an overview of item banking and CAT, discuss our approach to item banking and its byproducts, describe testing options, discuss an example of CAT for fatigue, and discuss models for long term sustainability of an entity such as PROMIS. 
Some barriers to success include limitations in the methods themselves, controversies and disagreements across approaches, and end-user reluctance to move away from the familiar. VL - 16 SN - 0962-9343 ER - TY - JOUR T1 - The Patient-Reported Outcomes Measurement Information System (PROMIS): progress of an NIH Roadmap cooperative group during its first two years JF - Medical Care Y1 - 2007 A1 - Cella, D. A1 - Yount, S. A1 - Rothrock, N. A1 - Gershon, R. C. A1 - Cook, K. F. A1 - Reeve, B. A1 - Ader, D. A1 - Fries, J.F. A1 - Bruce, B. A1 - Rose, M. AB - The National Institutes of Health (NIH) Patient-Reported Outcomes Measurement Information System (PROMIS) Roadmap initiative (www.nihpromis.org) is a 5-year cooperative group program of research designed to develop, validate, and standardize item banks to measure patient-reported outcomes (PROs) relevant across common medical conditions. In this article, we will summarize the organization and scientific activity of the PROMIS network during its first 2 years. VL - 45 ER - TY - JOUR T1 - Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS) JF - Medical Care Y1 - 2007 A1 - Reeve, B. B. A1 - Hays, R. D. A1 - Bjorner, J. B. A1 - Cook, K. F. A1 - Crane, P. K. A1 - Teresi, J. A. A1 - Thissen, D. A1 - Revicki, D. A. A1 - Weiss, D. J. A1 - Hambleton, R. K. A1 - Liu, H. A1 - Gershon, R. C. A1 - Reise, S. P. A1 - Lai, J. S. A1 - Cella, D. 
KW - *Health Status KW - *Information Systems KW - *Quality of Life KW - *Self Disclosure KW - Adolescent KW - Adult KW - Aged KW - Calibration KW - Databases as Topic KW - Evaluation Studies as Topic KW - Female KW - Humans KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Psychometrics KW - Questionnaires/standards KW - United States AB - BACKGROUND: The construction and evaluation of item banks to measure unidimensional constructs of health-related quality of life (HRQOL) is a fundamental objective of the Patient-Reported Outcomes Measurement Information System (PROMIS) project. OBJECTIVES: Item banks will be used as the foundation for developing short-form instruments and enabling computerized adaptive testing. The PROMIS Steering Committee selected 5 HRQOL domains for initial focus: physical functioning, fatigue, pain, emotional distress, and social role participation. This report provides an overview of the methods used in the PROMIS item analyses and proposed calibration of item banks. ANALYSES: Analyses include evaluation of data quality (eg, logic and range checking, spread of response distribution within an item), descriptive statistics (eg, frequencies, means), item response theory model assumptions (unidimensionality, local independence, monotonicity), model fit, differential item functioning, and item calibration for banking. RECOMMENDATIONS: Summarized are key analytic issues; recommendations are provided for future evaluations of item banks in HRQOL assessment. VL - 45 SN - 0025-7079 (Print) N1 - Reeve, Bryce B; Hays, Ron D; Bjorner, Jakob B; Cook, Karon F; Crane, Paul K; Teresi, Jeanne A; Thissen, David; Revicki, Dennis A; Weiss, David J; Hambleton, Ronald K; Liu, Honghu; Gershon, Richard; Reise, Steven P; Lai, Jin-shei; Cella, David; PROMIS Cooperative Group. AG015815/AG/United States NIA. Research Support, N.I.H., Extramural. United States. Medical care. Med Care. 2007 May;45(5 Suppl 1):S22-31. 
ER - TY - JOUR T1 - Item banks and their potential applications to health status assessment in diverse populations JF - Medical Care Y1 - 2006 A1 - Hahn, E. A. A1 - Cella, D. A1 - Bode, R. K. A1 - Gershon, R. C. A1 - Lai, J. S. AB - In the context of an ethnically diverse, aging society, attention is increasingly turning to health-related quality of life measurement to evaluate healthcare and treatment options for chronic diseases. When evaluating and treating symptoms and concerns such as fatigue, pain, or physical function, reliable and accurate assessment is a priority. Modern psychometric methods have enabled us to move from long, static tests that provide inefficient and often inaccurate assessment of individual patients, to computerized adaptive tests (CATs) that can precisely measure individuals on health domains of interest. These modern methods, collectively referred to as item response theory (IRT), can produce calibrated "item banks" from larger pools of questions. From these banks, CATs can be conducted on individuals to produce their scores on selected domains. Item banks allow for comparison of patients across different question sets because the patient's score is expressed on a common scale. Other advantages of using item banks include flexibility in terms of the degree of precision desired; interval measurement properties under most circumstances; realistic capability for accurate individual assessment over time (using CAT); and measurement equivalence across different patient populations. This work summarizes the process used in the creation and evaluation of item banks and reviews their potential contributions and limitations regarding outcome assessment and patient care, particularly when they are applied across people of different cultural backgrounds. VL - 44 N1 - 0025-7079 (Print); Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. 
Gov't ER - TY - JOUR T1 - Computer adaptive testing JF - Journal of Applied Measurement Y1 - 2005 A1 - Gershon, R. C. KW - *Internet KW - *Models, Statistical KW - *User-Computer Interface KW - Certification KW - Health Surveys KW - Humans KW - Licensure KW - Microcomputers KW - Quality of Life AB - The creation of item response theory (IRT) and Rasch models, inexpensive accessibility to high speed desktop computers, and the growth of the Internet, has led to the creation and growth of computerized adaptive testing or CAT. This form of assessment is applicable for both high stakes tests such as certification or licensure exams, as well as health related quality of life surveys. This article discusses the historical background of CAT including its many advantages over conventional (typically paper and pencil) alternatives. The process of CAT is then described including descriptions of the specific differences of using CAT based upon 1-, 2- and 3-parameter IRT and various Rasch models. Numerous specific topics describing CAT in practice are described including: initial item selection, content balancing, test difficulty, test length and stopping rules. The article concludes with the author's reflections regarding the future of CAT. VL - 6 SN - 1529-7713 (Print) N1 - Gershon, Richard C. Review. United States. Journal of applied measurement. J Appl Meas. 2005;6(1):109-27. ER - TY - JOUR T1 - An item bank was created to improve the measurement of cancer-related fatigue JF - Journal of Clinical Epidemiology Y1 - 2005 A1 - Lai, J-S. A1 - Cella, D. A1 - Dineen, K. A1 - Bode, R. A1 - Von Roenn, J. A1 - Gershon, R. C. A1 - Shevrin, D. 
KW - Adult KW - Aged KW - Aged, 80 and over KW - Factor Analysis, Statistical KW - Fatigue/*etiology/psychology KW - Female KW - Humans KW - Male KW - Middle Aged KW - Neoplasms/*complications/psychology KW - Psychometrics KW - Questionnaires AB - OBJECTIVE: Cancer-related fatigue (CRF) is one of the most common unrelieved symptoms experienced by patients. CRF is underrecognized and undertreated due to a lack of clinically sensitive instruments that integrate easily into clinics. Modern computerized adaptive testing (CAT) can overcome these obstacles by enabling precise assessment of fatigue without requiring the administration of a large number of questions. A working item bank is essential for development of a CAT platform. The present report describes the building of an operational item bank for use in clinical settings with the ultimate goal of improving CRF identification and treatment. STUDY DESIGN AND SETTING: The sample included 301 cancer patients. Psychometric properties of items were examined by using Rasch analysis, an Item Response Theory (IRT) model. RESULTS AND CONCLUSION: The final bank includes 72 items. These 72 unidimensional items explained 57.5% of the variance, based on factor analysis results. Excellent internal consistency (alpha=0.99) and acceptable item-total correlation were found (range: 0.51-0.85). The 72 items covered a reasonable range of the fatigue continuum. No significant ceiling effects, floor effects, or gaps were found. A sample short form was created for demonstration purposes. The resulting bank is amenable to the development of a CAT platform. VL - 58 SN - 0895-4356 (Print); 0895-4356 (Linking) N1 - Lai, Jin-Shei; Cella, David; Dineen, Kelly; Bode, Rita; Von Roenn, Jamie; Gershon, Richard C; Shevrin, Daniel. England. J Clin Epidemiol. 2005 Feb;58(2):190-7. ER - TY - CHAP T1 - The ABCs of Computerized Adaptive Testing Y1 - 2004 A1 - Gershon, R. C. CY - T. M. Wood and W. Zhi (Eds.), Measurement issues and practice in physical activity. 
Champaign, IL: Human Kinetics. ER - TY - JOUR T1 - CAT administration of language placement examinations JF - Journal of Applied Measurement Y1 - 2000 A1 - Stahl, J. A1 - Bergstrom, B. A1 - Gershon, R. C. KW - *Language KW - *Software KW - Aptitude Tests/*statistics & numerical data KW - Educational Measurement/*statistics & numerical data KW - Humans KW - Psychometrics KW - Reproducibility of Results KW - Research Support, Non-U.S. Gov't AB - This article describes the development of a computerized adaptive test for Cegep de Jonquiere, a community college located in Quebec, Canada. Computerized language proficiency testing allows the simultaneous presentation of sound stimuli as the question is being presented to the test-taker. With a properly calibrated bank of items, the language proficiency test can be offered in an adaptive framework. By adapting the test to the test-taker's level of ability, an assessment can be made with significantly fewer items. We also describe our initial attempt to detect instances in which "cheating low" is occurring. In the "cheating low" situation, test-takers deliberately answer questions incorrectly, questions that they are fully capable of answering correctly had they been taking the test honestly. VL - 1 N1 - 1529-7713; Journal Article ER - TY - CONF T1 - CAT administration of language placement exams T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Stahl, J. A1 - Gershon, R. C. A1 - Bergstrom, B. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal, Canada ER - TY - JOUR T1 - The effect of individual differences variables on the assessment of ability for Computerized Adaptive Testing JF - Dissertation Abstracts International: Section B: the Sciences & Engineering Y1 - 1996 A1 - Gershon, R. C. 
KW - computerized adaptive testing AB - Computerized Adaptive Testing (CAT) continues to gain momentum as the accepted testing modality for a growing number of certification, licensure, education, government and human resource applications. However, the developers of these tests have for the most part failed to adequately explore the impact of individual differences such as test anxiety on the adaptive testing process. It is widely accepted that non-cognitive individual differences variables interact with the assessment of ability when using written examinations. Logic would dictate that individual differences variables would equally affect CAT. Two studies were used to explore this premise. In the first study, 507 examinees were given a test anxiety survey prior to taking a high stakes certification exam using CAT or using a written format. All examinees had already completed their course of study, and the examination would be their last hurdle prior to being awarded certification. High test anxious examinees performed worse than their low anxious counterparts on both testing formats. The second study replicated the finding that anxiety depresses performance in CAT. It also addressed the differential effect of anxiety on within test performance. Examinees were candidates taking their final certification examination following a four year college program. Ability measures were calculated for each successive part of the test for 923 subjects. Within subject performance varied depending upon test position. High anxious examinees performed poorly at all points in the test, while low and medium anxious examinee performance peaked in the middle of the test. If test anxiety and performance measures were actually the same trait, then low anxious individuals should have performed equally well throughout the test. 
The observed interaction of test anxiety and time on task serves as strong evidence that test anxiety has motivationally mediated as well as cognitively mediated effects. The results of the studies are di (PsycINFO Database Record (c) 2003 APA, all rights reserved). VL - 57 ER - TY - CONF T1 - Does cheating on CAT pay: Not T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1995 A1 - Gershon, R. C. A1 - Bergstrom, B. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco N1 - ERIC ED 392 844 ER - TY - BOOK T1 - CAT software system [Computer program] Y1 - 1994 A1 - Gershon, R. C. CY - Chicago IL: Computer Adaptive Technologies ER - TY - JOUR T1 - Computer adaptive testing JF - International Journal of Educational Research Y1 - 1994 A1 - Lunz, M. E. A1 - Bergstrom, Betty A. A1 - Gershon, R. C. VL - 6 ER - TY - CONF T1 - Computerized adaptive testing exploring examinee response time using hierarchical linear modeling T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1994 A1 - Bergstrom, B. A1 - Gershon, R. C. JF - Paper presented at the annual meeting of the American Educational Research Association CY - New Orleans LA N1 - (ERIC No. ED 400 286). ER - TY - JOUR T1 - Computerized adaptive testing for licensure and certification JF - CLEAR Exam Review Y1 - 1994 A1 - Bergstrom, Betty A. A1 - Gershon, R. C. VL - Winter 1994 ER - TY - JOUR T1 - Altering the level of difficulty in computer adaptive testing JF - Applied Measurement in Education Y1 - 1992 A1 - Bergstrom, Betty A. A1 - Lunz, M. E. A1 - Gershon, R. C. KW - computerized adaptive testing AB - Examines the effect of altering test difficulty on examinee ability measures and test length in a computer adaptive test. The 225 Ss were randomly assigned to 3 test difficulty conditions and given a variable length computer adaptive test. 
Examinees in the hard, medium, and easy test condition took a test targeted at the 50%, 60%, or 70% probability of correct response. The results show that altering the probability of a correct response does not affect estimation of examinee ability and that taking an easier computer adaptive test only slightly increases the number of items necessary to reach specified levels of precision. (PsycINFO Database Record (c) 2002 APA, all rights reserved). VL - 5 N1 - Lawrence Erlbaum, US ER - TY - CONF T1 - Comparison of item targeting strategies for pass/fail adaptive tests T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1992 A1 - Bergstrom, B. A1 - Gershon, R. C. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA N1 - (ERIC NO. ED 400 287). ER - TY - CONF T1 - Individual differences in computer adaptive testing: Anxiety, computer literacy, and satisfaction T2 - Paper presented at the annual meeting of the National Council on Measurement in Education. Y1 - 1991 A1 - Gershon, R. C. A1 - Bergstrom, B. JF - Paper presented at the annual meeting of the National Council on Measurement in Education. ER - TY - Generic T1 - Test-retest consistency of computer adaptive tests. T2 - annual meeting of the National Council on Measurement in Education Y1 - 1990 A1 - Lunz, M. E. A1 - Bergstrom, Betty A. A1 - Gershon, R. C. JF - annual meeting of the National Council on Measurement in Education CY - Boston, MA USA ER - TY - BOOK T1 - CAT administrator [Computer program] Y1 - 1989 A1 - Gershon, R. C. CY - Chicago: Micro Connections ER -