%0 Journal Article
%J Journal of Educational Measurement
%D 2020
%T Item Calibration Methods With Multiple Subscale Multistage Testing
%A Wang, Chun
%A Chen, Ping
%A Jiang, Shengyu
%K EM
%K marginal maximum likelihood
%K missing data
%K multistage testing
%X Many large-scale educational surveys have moved from linear form designs to multistage testing (MST) designs. One advantage of MST is that it can provide more accurate latent trait (θ) estimates using fewer items than linear tests require. However, MST generates incomplete response data by design; hence, questions remain as to how to calibrate items using the incomplete data from an MST design. Further complications arise when a test contains multiple correlated subscales and items from different subscales need to be calibrated on their respective score-reporting metrics. The current calibration-per-subscale method produces biased item parameters, and no method is available to resolve this challenge. Drawing on missing data principles, we show that when all items are calibrated together, Rubin's ignorability assumption is satisfied, so traditional single-group calibration suffices. When calibrating items per subscale, we propose a simple modification to the current calibration-per-subscale method that reinstates the missing-at-random assumption and thereby corrects the estimation bias that would otherwise exist. Three mainstream calibration methods are discussed in the context of MST: marginal maximum likelihood estimation, the expectation maximization method, and fixed parameter calibration. An extensive simulation study is conducted, and a real data example from NAEP is analyzed, to provide supporting empirical evidence.
%B Journal of Educational Measurement
%V 57
%P 3-28
%U https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12241
%R 10.1110/jedm.12241