%0 Journal Article %J Sleep %D 2010 %T Development and validation of patient-reported outcome measures for sleep disturbance and sleep-related impairments %A Buysse, D. J. %A Yu, L. %A Moul, D. E. %A Germain, A. %A Stover, A. %A Dodds, N. E. %A Johnston, K. L. %A Shablesky-Cade, M. A. %A Pilkonis, P. A. %K *Outcome Assessment (Health Care) %K *Self Disclosure %K Adult %K Aged %K Aged, 80 and over %K Cross-Sectional Studies %K Factor Analysis, Statistical %K Female %K Humans %K Male %K Middle Aged %K Psychometrics %K Questionnaires %K Reproducibility of Results %K Sleep Disorders/*diagnosis %K Young Adult %X STUDY OBJECTIVES: To develop an archive of self-report questions assessing sleep disturbance and sleep-related impairments (SRI), to develop item banks from this archive, and to validate and calibrate the item banks using classic validation techniques and item response theory analyses in a sample of clinical and community participants. DESIGN: Cross-sectional self-report study. SETTING: Academic medical center and participant homes. PARTICIPANTS: One thousand nine hundred ninety-three adults recruited from an Internet polling sample and 259 adults recruited from medical, psychiatric, and sleep clinics. INTERVENTIONS: None. MEASUREMENTS AND RESULTS: This study was part of PROMIS (Patient-Reported Outcomes Information System), a National Institutes of Health Roadmap initiative. Self-report item banks were developed through an iterative process of literature searches, collecting and sorting items, expert content review, qualitative patient research, and pilot testing. Internal consistency, convergent validity, and exploratory and confirmatory factor analysis were examined in the resulting item banks. Factor analyses identified 2 preliminary item banks, sleep disturbance and SRI. Item response theory analyses and expert content review narrowed the item banks to 27 and 16 items, respectively. 
Validity of the item banks was supported by moderate to high correlations with existing scales and by significant differences in sleep disturbance and SRI scores between participants with and without sleep disorders. CONCLUSIONS: The PROMIS sleep disturbance and SRI item banks have excellent measurement properties and may prove to be useful for assessing general aspects of sleep and SRI with various groups of patients and interventions. %B Sleep %7 2010/06/17 %V 33 %P 781-92 %8 Jun 1 %@ 0161-8105 (Print)0161-8105 (Linking) %G eng %M 20550019 %2 2880437 %0 Journal Article %J Quality of Life Research %D 2009 %T Measuring global physical health in children with cerebral palsy: Illustration of a multidimensional bi-factor model and computerized adaptive testing %A Haley, S. M. %A Ni, P. %A Dumas, H. M. %A Fragala-Pinkham, M. A. %A Hambleton, R. K. %A Montpetit, K. %A Bilodeau, N. %A Gorton, G. E. %A Watson, K. %A Tucker, C. A. %K *Computer Simulation %K *Health Status %K *Models, Statistical %K Adaptation, Psychological %K Adolescent %K Cerebral Palsy/*physiopathology %K Child %K Child, Preschool %K Factor Analysis, Statistical %K Female %K Humans %K Male %K Massachusetts %K Pennsylvania %K Questionnaires %K Young Adult %X PURPOSE: The purposes of this study were to apply a bi-factor model for the determination of test dimensionality and a multidimensional CAT using computer simulations of real data for the assessment of a new global physical health measure for children with cerebral palsy (CP). METHODS: Parent respondents of 306 children with cerebral palsy were recruited from four pediatric rehabilitation hospitals and outpatient clinics. We compared confirmatory factor analysis results across four models: (1) one-factor unidimensional; (2) two-factor multidimensional (MIRT); (3) bi-factor MIRT with fixed slopes; and (4) bi-factor MIRT with varied slopes. 
We tested whether the general and content (fatigue and pain) person score estimates could discriminate across severity and types of CP, and whether score estimates from a simulated CAT were similar to estimates based on the total item bank, and whether they correlated as expected with external measures. RESULTS: Confirmatory factor analysis suggested separate pain and fatigue sub-factors; all 37 items were retained in the analyses. From the bi-factor MIRT model with fixed slopes, the full item bank scores discriminated across levels of severity and types of CP, and compared favorably to external instruments. CAT scores based on 10- and 15-item versions accurately captured the global physical health scores. CONCLUSIONS: The bi-factor MIRT CAT application, especially the 10- and 15-item versions, yielded accurate global physical health scores that discriminated across known severity groups and types of CP, and correlated as expected with concurrent measures. The CATs have potential for collecting complex data on the physical health of children with CP in an efficient manner. %B Quality of Life Research %7 2009/02/18 %V 18 %P 359-370 %8 Apr %@ 0962-9343 (Print)0962-9343 (Linking) %G eng %M 19221892 %2 2692519 %0 Journal Article %J Archives of Physical Medicine and Rehabilitation %D 2008 %T Computerized adaptive testing for follow-up after discharge from inpatient rehabilitation: II. Participation outcomes %A Haley, S. M. %A Gandek, B. %A Siebens, H. %A Black-Schaffer, R. M. %A Sinclair, S. J. %A Tao, W. %A Coster, W. J. %A Ni, P. %A Jette, A. M. 
%K *Activities of Daily Living %K *Adaptation, Physiological %K *Computer Systems %K *Questionnaires %K Adult %K Aged %K Aged, 80 and over %K Chi-Square Distribution %K Factor Analysis, Statistical %K Female %K Humans %K Longitudinal Studies %K Male %K Middle Aged %K Outcome Assessment (Health Care)/*methods %K Patient Discharge %K Prospective Studies %K Rehabilitation/*standards %K Subacute Care/*standards %X OBJECTIVES: To measure participation outcomes with a computerized adaptive test (CAT) and compare CAT and traditional fixed-length surveys in terms of score agreement, respondent burden, discriminant validity, and responsiveness. DESIGN: Longitudinal, prospective cohort study of patients interviewed approximately 2 weeks after discharge from inpatient rehabilitation and 3 months later. SETTING: Follow-up interviews conducted in patient's home setting. PARTICIPANTS: Adults (N=94) with diagnoses of neurologic, orthopedic, or medically complex conditions. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Participation domains of mobility, domestic life, and community, social, & civic life, measured using a CAT version of the Participation Measure for Postacute Care (PM-PAC-CAT) and a 53-item fixed-length survey (PM-PAC-53). RESULTS: The PM-PAC-CAT showed substantial agreement with PM-PAC-53 scores (intraclass correlation coefficient, model 3,1, .71-.81). On average, the PM-PAC-CAT was completed in 42% of the time and with only 48% of the items as compared with the PM-PAC-53. Both formats discriminated across functional severity groups. The PM-PAC-CAT had modest reductions in sensitivity and responsiveness to patient-reported change over a 3-month interval as compared with the PM-PAC-53. CONCLUSIONS: Although continued evaluation is warranted, accurate estimates of participation status and responsiveness to change for group-level analyses can be obtained from CAT administrations, with a sizeable reduction in respondent burden. 
%B Archives of Physical Medicine and Rehabilitation %7 2008/01/30 %V 89 %P 275-283 %8 Feb %@ 1532-821X (Electronic)0003-9993 (Linking) %G eng %M 18226651 %2 2666330 %0 Journal Article %J Quality of Life Research %D 2007 %T Developing tailored instruments: item banking and computerized adaptive assessment %A Bjorner, J. B. %A Chang, C-H. %A Thissen, D. %A Reeve, B. B. %K *Health Status %K *Health Status Indicators %K *Mental Health %K *Outcome Assessment (Health Care) %K *Quality of Life %K *Questionnaires %K *Software %K Algorithms %K Factor Analysis, Statistical %K Humans %K Models, Statistical %K Psychometrics %X Item banks and Computerized Adaptive Testing (CAT) have the potential to greatly improve the assessment of health outcomes. This review describes the unique features of item banks and CAT and discusses how to develop item banks. In CAT, a computer selects the items from an item bank that are most relevant for and informative about the particular respondent; thus optimizing test relevance and precision. Item response theory (IRT) provides the foundation for selecting the items that are most informative for the particular respondent and for scoring responses on a common metric. The development of an item bank is a multi-stage process that requires a clear definition of the construct to be measured, good items, a careful psychometric analysis of the items, and a clear specification of the final CAT. The psychometric analysis needs to evaluate the assumptions of the IRT model such as unidimensionality and local independence; that the items function the same way in different subgroups of the population; and that there is an adequate fit between the data and the chosen item response models. Also, interpretation guidelines need to be established to help the clinical application of the assessment. 
Although medical research can draw upon expertise from educational testing in the development of item banks and CAT, the medical field also encounters unique opportunities and challenges. %B Quality of Life Research %7 2007/05/29 %V 16 %P 95-108 %@ 0962-9343 (Print) %G eng %M 17530450 %0 Journal Article %J Journal of Clinical Epidemiology %D 2006 %T Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank %A Haley, S. M. %A Ni, P. %A Hambleton, R. K. %A Slavin, M. D. %A Jette, A. M. %K *Recovery of Function %K Activities of Daily Living %K Adolescent %K Adult %K Aged %K Aged, 80 and over %K Confidence Intervals %K Factor Analysis, Statistical %K Female %K Humans %K Male %K Middle Aged %K Outcome Assessment (Health Care)/*methods %K Rehabilitation/*standards %K Reproducibility of Results %K Software %X BACKGROUND AND OBJECTIVE: Measuring physical functioning (PF) within and across postacute settings is critical for monitoring outcomes of rehabilitation; however, most current instruments lack sufficient breadth and feasibility for widespread use. Computer adaptive testing (CAT), in which item selection is tailored to the individual patient, holds promise for reducing response burden, yet maintaining measurement precision. We calibrated a PF item bank via item response theory (IRT), administered items with a post hoc CAT design, and determined whether CAT would improve accuracy and precision of score estimates over random item selection. METHODS: 1,041 adults were interviewed during postacute care rehabilitation episodes in either hospital or community settings. Responses for 124 PF items were calibrated using IRT methods to create a PF item bank. We examined the accuracy and precision of CAT-based scores compared to a random selection of items. 
RESULTS: CAT-based scores had higher correlations with the IRT-criterion scores, especially with short tests, and resulted in narrower confidence intervals than scores based on a random selection of items; gains, as expected, were especially large for low and high performing adults. CONCLUSION: The CAT design may have important precision and efficiency advantages for point-of-care functional assessment in rehabilitation practice settings. %B Journal of Clinical Epidemiology %7 2006/10/10 %V 59 %P 1174-82 %8 Nov %@ 0895-4356 (Print) %G eng %M 17027428 %0 Journal Article %J Archives of Physical Medicine and Rehabilitation %D 2006 %T Computerized adaptive testing for follow-up after discharge from inpatient rehabilitation: I. Activity outcomes %A Haley, S. M. %A Siebens, H. %A Coster, W. J. %A Tao, W. %A Black-Schaffer, R. M. %A Gandek, B. %A Sinclair, S. J. %A Ni, P. %K *Activities of Daily Living %K *Adaptation, Physiological %K *Computer Systems %K *Questionnaires %K Adult %K Aged %K Aged, 80 and over %K Chi-Square Distribution %K Factor Analysis, Statistical %K Female %K Humans %K Longitudinal Studies %K Male %K Middle Aged %K Outcome Assessment (Health Care)/*methods %K Patient Discharge %K Prospective Studies %K Rehabilitation/*standards %K Subacute Care/*standards %X OBJECTIVE: To examine score agreement, precision, validity, efficiency, and responsiveness of a computerized adaptive testing (CAT) version of the Activity Measure for Post-Acute Care (AM-PAC-CAT) in a prospective, 3-month follow-up sample of inpatient rehabilitation patients recently discharged home. DESIGN: Longitudinal, prospective 1-group cohort study of patients followed approximately 2 weeks after hospital discharge and then 3 months after the initial home visit. SETTING: Follow-up visits conducted in patients' home setting. PARTICIPANTS: Ninety-four adults who were recently discharged from inpatient rehabilitation, with diagnoses of neurologic, orthopedic, and medically complex conditions. 
INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Summary scores from AM-PAC-CAT, including 3 activity domains of movement and physical, personal care and instrumental, and applied cognition were compared with scores from a traditional fixed-length version of the AM-PAC with 66 items (AM-PAC-66). RESULTS: AM-PAC-CAT scores were in good agreement (intraclass correlation coefficient model 3,1 range, .77-.86) with scores from the AM-PAC-66. On average, the CAT programs required 43% of the time and 33% of the items compared with the AM-PAC-66. Both formats discriminated across functional severity groups. The standardized response mean (SRM) was greater for the movement and physical fixed form than the CAT; the effect size and SRM of the 2 other AM-PAC domains showed similar sensitivity between CAT and fixed formats. Using patients' own report as an anchor-based measure of change, the CAT and fixed length formats were comparable in responsiveness to patient-reported change over a 3-month interval. CONCLUSIONS: Accurate estimates for functional activity group-level changes can be obtained from CAT administrations, with a considerable reduction in administration time. %B Archives of Physical Medicine and Rehabilitation %7 2006/08/01 %V 87 %P 1033-42 %8 Aug %@ 0003-9993 (Print) %G eng %M 16876547 %0 Journal Article %J Medical Care %D 2006 %T Overview of quantitative measurement methods. Equivalence, invariance, and differential item functioning in health applications %A Teresi, J. A. %K *Cross-Cultural Comparison %K Data Interpretation, Statistical %K Factor Analysis, Statistical %K Guidelines as Topic %K Humans %K Models, Statistical %K Psychometrics/*methods %K Statistics as Topic/*methods %K Statistics, Nonparametric %X BACKGROUND: Reviewed in this article are issues relating to the study of invariance and differential item functioning (DIF). 
The aim of factor analyses and DIF, in the context of invariance testing, is the examination of group differences in item response conditional on an estimate of disability. Discussed are parameters and statistics that are not invariant and cannot be compared validly in crosscultural studies with varying distributions of disability in contrast to those that can be compared (if the model assumptions are met) because they are produced by models such as linear and nonlinear regression. OBJECTIVES: The purpose of this overview is to provide an integrated approach to the quantitative methods used in this special issue to examine measurement equivalence. The methods include classical test theory (CTT), factor analytic, and parametric and nonparametric approaches to DIF detection. Also included in the quantitative section is a discussion of item banking and computerized adaptive testing (CAT). METHODS: Factorial invariance and the articles discussing this topic are introduced. A brief overview of the DIF methods presented in the quantitative section of the special issue is provided together with a discussion of ways in which DIF analyses and examination of invariance using factor models may be complementary. CONCLUSIONS: Although factor analytic and DIF detection methods share features, they provide unique information and can be viewed as complementary in informing about measurement equivalence. %B Medical Care %7 2006/10/25 %V 44 %P S39-49 %8 Nov %@ 0025-7079 (Print)0025-7079 (Linking) %G eng %M 17060834 %0 Journal Article %J Journal of Clinical Epidemiology %D 2006 %T Simulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function %A Hart, D. L. %A Cook, K. F. %A Mioduski, J. E. %A Teal, C. R. %A Crane, P. K. 
%K *Computer Simulation %K *Range of Motion, Articular %K Activities of Daily Living %K Adult %K Aged %K Aged, 80 and over %K Factor Analysis, Statistical %K Female %K Humans %K Male %K Middle Aged %K Prospective Studies %K Reproducibility of Results %K Research Support, N.I.H., Extramural %K Research Support, U.S. Gov't, Non-P.H.S. %K Shoulder Dislocation/*physiopathology/psychology/rehabilitation %K Shoulder Pain/*physiopathology/psychology/rehabilitation %K Shoulder/*physiopathology %K Sickness Impact Profile %K Treatment Outcome %X BACKGROUND AND OBJECTIVE: To test unidimensionality and local independence of a set of shoulder functional status (SFS) items, develop a computerized adaptive test (CAT) of the items using a rating scale item response theory model (RSM), and compare discriminant validity of measures generated using all items (theta(IRT)) and measures generated using the simulated CAT (theta(CAT)). STUDY DESIGN AND SETTING: We performed a secondary analysis of data collected prospectively during rehabilitation of 400 patients with shoulder impairments who completed 60 SFS items. RESULTS: Factor analytic techniques supported that the 42 SFS items formed a unidimensional scale and were locally independent. Except for five items, which were deleted, the RSM fit the data well. The remaining 37 SFS items were used to generate the CAT. On average, 6 items were needed to estimate precise measures of function using the SFS CAT, compared with all 37 SFS items. The theta(IRT) and theta(CAT) measures were highly correlated (r = .96) and resulted in similar classifications of patients. CONCLUSION: The simulated SFS CAT was efficient and produced precise, clinically relevant measures of functional status with good discriminating ability. %B Journal of Clinical Epidemiology %V 59 %P 290-8 %G eng %M 16488360 %0 Journal Article %J Journal of Clinical Epidemiology %D 2005 %T An item bank was created to improve the measurement of cancer-related fatigue %A Lai, J-S. 
%A Cella, D. %A Dineen, K. %A Bode, R. %A Von Roenn, J. %A Gershon, R. C. %A Shevrin, D. %K Adult %K Aged %K Aged, 80 and over %K Factor Analysis, Statistical %K Fatigue/*etiology/psychology %K Female %K Humans %K Male %K Middle Aged %K Neoplasms/*complications/psychology %K Psychometrics %K Questionnaires %X OBJECTIVE: Cancer-related fatigue (CRF) is one of the most common unrelieved symptoms experienced by patients. CRF is underrecognized and undertreated due to a lack of clinically sensitive instruments that integrate easily into clinics. Modern computerized adaptive testing (CAT) can overcome these obstacles by enabling precise assessment of fatigue without requiring the administration of a large number of questions. A working item bank is essential for development of a CAT platform. The present report describes the building of an operational item bank for use in clinical settings with the ultimate goal of improving CRF identification and treatment. STUDY DESIGN AND SETTING: The sample included 301 cancer patients. Psychometric properties of items were examined by using Rasch analysis, an Item Response Theory (IRT) model. RESULTS AND CONCLUSION: The final bank includes 72 items. These 72 unidimensional items explained 57.5% of the variance, based on factor analysis results. Excellent internal consistency (alpha=0.99) and acceptable item-total correlation were found (range: 0.51-0.85). The 72 items covered a reasonable range of the fatigue continuum. No significant ceiling effects, floor effects, or gaps were found. A sample short form was created for demonstration purposes. The resulting bank is amenable to the development of a CAT platform. %B Journal of Clinical Epidemiology %7 2005/02/01 %V 58 %P 190-7 %8 Feb %@ 0895-4356 (Print)0895-4356 (Linking) %G eng %9 Multicenter Study %M 15680754 %0 Journal Article %J Medical Care %D 2004 %T Activity outcome measurement for postacute care %A Haley, S. M. %A Coster, W. J. %A Andres, P. L. %A Ludlow, L. H. %A Ni, P. 
%A Bond, T. L. %A Sinclair, S. J. %A Jette, A. M. %K *Self Efficacy %K *Sickness Impact Profile %K Activities of Daily Living/*classification/psychology %K Adult %K Aftercare/*standards/statistics & numerical data %K Aged %K Boston %K Cognition/physiology %K Disability Evaluation %K Factor Analysis, Statistical %K Female %K Human %K Male %K Middle Aged %K Movement/physiology %K Outcome Assessment (Health Care)/*methods/statistics & numerical data %K Psychometrics %K Questionnaires/standards %K Rehabilitation/*standards/statistics & numerical data %K Reproducibility of Results %K Sensitivity and Specificity %K Support, U.S. Gov't, Non-P.H.S. %K Support, U.S. Gov't, P.H.S. %X BACKGROUND: Efforts to evaluate the effectiveness of a broad range of postacute care services have been hindered by the lack of conceptually sound and comprehensive measures of outcomes. It is critical to determine a common underlying structure before employing current methods of item equating across outcome instruments for future item banking and computer-adaptive testing applications. OBJECTIVE: To investigate the factor structure, reliability, and scale properties of items underlying the Activity domains of the International Classification of Functioning, Disability and Health (ICF) for use in postacute care outcome measurement. METHODS: We developed a 41-item Activity Measure for Postacute Care (AM-PAC) that assessed an individual's execution of discrete daily tasks in his or her own environment across major content domains as defined by the ICF. We evaluated the reliability and discriminant validity of the prototype AM-PAC in 477 individuals in active rehabilitation programs across 4 rehabilitation settings using factor analyses, tests of item scaling, internal consistency reliability analyses, Rasch item response theory modeling, residual component analysis, and modified parallel analysis. 
RESULTS: Results from an initial exploratory factor analysis produced 3 distinct, interpretable factors that accounted for 72% of the variance: Applied Cognition (44%), Personal Care & Instrumental Activities (19%), and Physical & Movement Activities (9%); these 3 activity factors were verified by a confirmatory factor analysis. Scaling assumptions were met for each factor in the total sample and across diagnostic groups. Internal consistency reliability was high for the total sample (Cronbach alpha = 0.92 to 0.94), and for specific diagnostic groups (Cronbach alpha = 0.90 to 0.95). Rasch scaling, residual factor, differential item functioning, and modified parallel analyses supported the unidimensionality and goodness of fit of each unique activity domain. CONCLUSIONS: This 3-factor model of the AM-PAC can form the conceptual basis for common-item equating and computer-adaptive applications, leading to a comprehensive system of outcome instruments for postacute care settings. %B Medical Care %V 42 %P I49-I61 %G eng %M 14707755 %0 Journal Article %J BMC Psychiatry %D 2004 %T Computerized adaptive measurement of depression: A simulation study %A Gardner, W. %A Shear, K. %A Kelleher, K. J. %A Pajer, K. A. %A Mammen, O. %A Buysse, D. %A Frank, E. %K *Computer Simulation %K Adult %K Algorithms %K Area Under Curve %K Comparative Study %K Depressive Disorder/*diagnosis/epidemiology/psychology %K Diagnosis, Computer-Assisted/*methods/statistics & numerical data %K Factor Analysis, Statistical %K Female %K Humans %K Internet %K Male %K Mass Screening/methods %K Patient Selection %K Personality Inventory/*statistics & numerical data %K Pilot Projects %K Prevalence %K Psychiatric Status Rating Scales/*statistics & numerical data %K Psychometrics %K Research Support, Non-U.S. Gov't %K Research Support, U.S. Gov't, P.H.S. 
%K Severity of Illness Index %K Software %X BACKGROUND: Efficient, accurate instruments for measuring depression are increasingly important in clinical practice. We developed a computerized adaptive version of the Beck Depression Inventory (BDI). We examined its efficiency and its usefulness in identifying Major Depressive Episodes (MDE) and in measuring depression severity. METHODS: Subjects were 744 participants in research studies in which each subject completed both the BDI and the SCID. In addition, 285 patients completed the Hamilton Depression Rating Scale. RESULTS: The adaptive BDI had an AUC as an indicator of a SCID diagnosis of MDE of 88%, equivalent to the full BDI. The adaptive BDI asked fewer questions than the full BDI (5.6 versus 21 items). The adaptive latent depression score correlated r = .92 with the BDI total score and the latent depression score correlated more highly with the Hamilton (r = .74) than the BDI total score did (r = .70). CONCLUSIONS: Adaptive testing for depression may provide greatly increased efficiency without loss of accuracy in identifying MDE or in measuring depression severity. %B BMC Psychiatry %V 4 %P 13-23 %G eng %M 15132755 %0 Journal Article %J Medical Care %D 2004 %T Refining the conceptual basis for rehabilitation outcome measurement: personal care and instrumental activities domain %A Coster, W. J. %A Haley, S. M. %A Andres, P. L. %A Ludlow, L. H. %A Bond, T. L. %A Ni, P. S. %K *Self Efficacy %K *Sickness Impact Profile %K Activities of Daily Living/*classification/psychology %K Adult %K Aged %K Aged, 80 and over %K Disability Evaluation %K Factor Analysis, Statistical %K Female %K Humans %K Male %K Middle Aged %K Outcome Assessment (Health Care)/*methods/statistics & numerical data %K Questionnaires/*standards %K Recovery of Function/physiology %K Rehabilitation/*standards/statistics & numerical data %K Reproducibility of Results %K Research Support, U.S. Gov't, Non-P.H.S. %K Research Support, U.S. Gov't, P.H.S. 
%K Sensitivity and Specificity %X BACKGROUND: Rehabilitation outcome measures routinely include content on performance of daily activities; however, the conceptual basis for item selection is rarely specified. These instruments differ significantly in format, number, and specificity of daily activity items and in the measurement dimensions and type of scale used to specify levels of performance. We propose that a requirement for upper limb and hand skills underlies many activities of daily living (ADL) and instrumental activities of daily living (IADL) items in current instruments, and that items selected based on this definition can be placed along a single functional continuum. OBJECTIVE: To examine the dimensional structure and content coverage of a Personal Care and Instrumental Activities item set and to examine the comparability of items from existing instruments and a set of new items as measures of this domain. METHODS: Participants (N = 477) from 3 different disability groups and 4 settings representing the continuum of postacute rehabilitation care were administered the newly developed Activity Measure for Post-Acute Care (AM-PAC), the SF-8, and an additional setting-specific measure: FIM (in-patient rehabilitation); MDS (skilled nursing facility); MDS-PAC (postacute settings); OASIS (home care); or PF-10 (outpatient clinic). Rasch (partial-credit model) analyses were conducted on a set of 62 items covering the Personal Care and Instrumental domain to examine item fit, item functioning, and category difficulty estimates and unidimensionality. RESULTS: After removing 6 misfitting items, the remaining 56 items fit acceptably along the hypothesized continuum. Analyses yielded different difficulty estimates for the maximum score (eg, "Independent performance") for items with comparable content from different instruments. Items showed little differential item functioning across age, diagnosis, or severity groups, and 92% of the participants fit the model. 
CONCLUSIONS: ADL and IADL items from existing rehabilitation outcomes instruments that depend on skilled upper limb and hand use can be located along a single continuum, along with the new personal care and instrumental items of the AM-PAC addressing gaps in content. Results support the validity of the proposed definition of the Personal Care and Instrumental Activities dimension of function as a guide for future development of rehabilitation outcome instruments, such as linked, setting-specific short forms and computerized adaptive testing approaches. %B Medical Care %V 42 %P I62-I72 %8 Jan %G eng %M 14707756 %0 Journal Article %J Archives of Physical Medicine and Rehabilitation %D 2004 %T Score comparability of short forms and computerized adaptive testing: Simulation study with the activity measure for post-acute care %A Haley, S. M. %A Coster, W. J. %A Andres, P. L. %A Kosinski, M. %A Ni, P. %K Boston %K Factor Analysis, Statistical %K Humans %K Outcome Assessment (Health Care)/*methods %K Prospective Studies %K Questionnaires/standards %K Rehabilitation/*standards %K Subacute Care/*standards %X OBJECTIVE: To compare simulated short-form and computerized adaptive testing (CAT) scores to scores obtained from complete item sets for each of the 3 domains of the Activity Measure for Post-Acute Care (AM-PAC). DESIGN: Prospective study. SETTING: Six postacute health care networks in the greater Boston metropolitan area, including inpatient acute rehabilitation, transitional care units, home care, and outpatient services. PARTICIPANTS: A convenience sample of 485 adult volunteers who were receiving skilled rehabilitation services. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Inpatient and community-based short forms and CAT applications were developed for each of 3 activity domains (physical & mobility, personal care & instrumental, applied cognition) using item pools constructed from new items and items from existing postacute care instruments. 
RESULTS: Simulated CAT scores correlated highly with score estimates from the total item pool in each domain (4- and 6-item CAT r range, .90-.95; 10-item CAT r range, .96-.98). Scores on the 10-item short forms constructed for inpatient and community settings also provided good estimates of the AM-PAC item pool scores for the physical & movement and personal care & instrumental domains, but were less consistent in the applied cognition domain. Confidence intervals around individual scores were greater in the short forms than for the CATs. CONCLUSIONS: Accurate scoring estimates for AM-PAC domains can be obtained with either the setting-specific short forms or the CATs. The strong relationship between CAT and item pool scores can be attributed to the CAT's ability to select specific items to match individual responses. The CAT may have additional advantages over short forms in practicality, efficiency, and the potential for providing more precise scoring estimates for individuals. %B Archives of Physical Medicine and Rehabilitation %7 2004/04/15 %V 85 %P 661-6 %8 Apr %@ 0003-9993 (Print) %G eng %M 15083444 %0 Journal Article %J Quality of Life Research %D 2003 %T Calibration of an item pool for assessing the burden of headaches: an application of item response theory to the Headache Impact Test (HIT) %A Bjorner, J. B. %A Kosinski, M. %A Ware, J. E., Jr. %K *Cost of Illness %K *Decision Support Techniques %K *Sickness Impact Profile %K Adolescent %K Adult %K Aged %K Comparative Study %K Disability Evaluation %K Factor Analysis, Statistical %K Headache/*psychology %K Health Surveys %K Human %K Longitudinal Studies %K Middle Aged %K Migraine/psychology %K Models, Psychological %K Psychometrics/*methods %K Quality of Life/*psychology %K Software %K Support, Non-U.S. Gov't %X BACKGROUND: Measurement of headache impact is important in clinical trials, case detection, and the clinical monitoring of patients. 
Computerized adaptive testing (CAT) of headache impact has potential advantages over traditional fixed-length tests in terms of precision, relevance, real-time quality control and flexibility. OBJECTIVE: To develop an item pool that can be used for a computerized adaptive test of headache impact. METHODS: We analyzed responses to four well-known tests of headache impact from a population-based sample of recent headache sufferers (n = 1016). We used confirmatory factor analysis for categorical data and analyses based on item response theory (IRT). RESULTS: In factor analyses, we found very high correlations between the factors hypothesized by the original test constructers, both within and between the original questionnaires. These results suggest that a single score of headache impact is sufficient. We established a pool of 47 items which fitted the generalized partial credit IRT model. By simulating a computerized adaptive health test we showed that an adaptive test of only five items had a very high concordance with the score based on all items and that different worst-case item selection scenarios did not lead to bias. CONCLUSION: We have established a headache impact item pool that can be used in CAT of headache impact. %B Quality of Life Research %V 12 %P 913-933 %G eng %M 14661767 %0 Journal Article %J Quality of Life Research %D 2003 %T The feasibility of applying item response theory to measures of migraine impact: a re-analysis of three clinical studies %A Bjorner, J. B. %A Kosinski, M. %A Ware, J. E., Jr. %K *Sickness Impact Profile %K Adolescent %K Adult %K Aged %K Comparative Study %K Cost of Illness %K Factor Analysis, Statistical %K Feasibility Studies %K Female %K Human %K Male %K Middle Aged %K Migraine/*psychology %K Models, Psychological %K Psychometrics/instrumentation/*methods %K Quality of Life/*psychology %K Questionnaires %K Support, Non-U.S. 
Gov't %X BACKGROUND: Item response theory (IRT) is a powerful framework for analyzing multiitem scales and is central to the implementation of computerized adaptive testing. OBJECTIVES: To explain the use of IRT to examine measurement properties and to apply IRT to a questionnaire for measuring migraine impact--the Migraine Specific Questionnaire (MSQ). METHODS: Data from three clinical studies that employed the MSQ-version 1 were analyzed by confirmatory factor analysis for categorical data and by IRT modeling. RESULTS: Confirmatory factor analyses showed very high correlations between the factors hypothesized by the original test constructers. Further, high item loadings on one common factor suggest that migraine impact may be adequately assessed by only one score. IRT analyses of the MSQ were feasible and provided several suggestions as to how to improve the items and in particular the response choices. Out of 15 items, 13 showed adequate fit to the IRT model. In general, IRT scores were strongly associated with the scores proposed by the original test developers and with the total item sum score. Analysis of response consistency showed that more than 90% of the patients answered consistently according to a unidimensional IRT model. For the remaining patients, scores on the dimension of emotional function were less strongly related to the overall IRT scores that mainly reflected role limitations. Such response patterns can be detected easily using response consistency indices. Analysis of test precision across score levels revealed that the MSQ was most precise at one standard deviation worse than the mean impact level for migraine patients that are not in treatment. Thus, gains in test precision can be achieved by developing items aimed at less severe levels of migraine impact. CONCLUSIONS: IRT proved useful for analyzing the MSQ. 
The approach warrants further testing in a more comprehensive item pool for headache impact that would enable computerized adaptive testing. %B Quality of Life Research %V 12 %P 887-902 %G eng %M 14661765 %0 Journal Article %J Medical Care %D 2002 %T Multidimensional adaptive testing for mental health problems in primary care %A Gardner, W. %A Kelleher, K. J. %A Pajer, K. A. %K Adolescent %K Child %K Child Behavior Disorders/*diagnosis %K Child Health Services/*organization & administration %K Factor Analysis, Statistical %K Female %K Humans %K Linear Models %K Male %K Mass Screening/*methods %K Parents %K Primary Health Care/*organization & administration %X OBJECTIVES: Efficient and accurate instruments for assessing child psychopathology are increasingly important in clinical practice and research. For example, screening in primary care settings can identify children and adolescents with disorders that may otherwise go undetected. However, primary care offices are notorious for the brevity of visits and screening must not burden patients or staff with long questionnaires. One solution is to shorten assessment instruments, but dropping questions typically makes an instrument less accurate. An alternative is adaptive testing, in which a computer selects the items to be asked of a patient based on the patient's previous responses. This research used a simulation to test a child mental health screen based on this technology. RESEARCH DESIGN: Using half of a large sample of data, a computerized version was developed of the Pediatric Symptom Checklist (PSC), a parental-report psychosocial problem screen. With the unused data, a simulation was conducted to determine whether the Adaptive PSC can reproduce the results of the full PSC with greater efficiency. SUBJECTS: PSCs were completed by parents on 21,150 children seen in a national sample of primary care practices. 
RESULTS: Four latent psychosocial problem dimensions were identified through factor analysis: internalizing problems, externalizing problems, attention problems, and school problems. A simulated adaptive test measuring these traits asked an average of 11.6 questions per patient, and asked five or fewer questions for 49% of the sample. There was high agreement between the adaptive test and the full (35-item) PSC: only 1.3% of screening decisions were discordant (kappa = 0.93). This agreement was higher than that obtained using a comparable length (12-item) short-form PSC (3.2% of decisions discordant; kappa = 0.84). CONCLUSIONS: Multidimensional adaptive testing may be an accurate and efficient technology for screening for mental health problems in primary care settings. %B Medical Care %7 2002/09/10 %V 40 %P 812-23 %8 Sep %@ 0025-7079 (Print)0025-7079 (Linking) %G eng %M 12218771