dc.description.abstract | To address the lack of attention to construct shift in item response theory (IRT) vertical scaling, a multigroup bifactor model was proposed to model the common dimension for all grades and the grade-specific dimensions. The estimation accuracy of the bifactor model was evaluated through a simulation study in which the percentage of common items, sample size, and degree of construct shift were manipulated. In addition, the unidimensional IRT (UIRT) model, which ignores construct shift, was estimated to represent current practice. It was found that (a) bifactor models were well recovered overall, though the grade-specific dimensions were not as well recovered as the general dimension; (b) item discrimination parameters were overestimated in UIRT models due to the effect of construct shift; (c) the person parameters of UIRT models were less accurately estimated than those of bifactor models; (d) group mean parameter estimates from UIRT models were less accurate than those of bifactor models; and (e) a large effect due to construct shift was found for the group mean parameter estimates of UIRT models. A real data analysis illustrated how bifactor models can be applied to vertical scaling problems involving construct shift. General procedures for testing practice were recommended and discussed. | es_ES |