How Often Is the Misfit of Item Response Theory Models Practically Significant?
Abstract
Standard 3.9 of the Standards for Educational and Psychological Testing (1999) demands evidence of model fit when item response theory (IRT) models are applied to data from tests. Hambleton and Han (2005) and Sinharay (2005) recommended assessing the practical significance of misfit of IRT models, but few examples of such assessment can be found in the literature on IRT model fit. In this article, the practical significance of misfit of IRT models was assessed using data from several tests that employ IRT models to report scores. The IRT model did not fit any data set considered in this article. However, the practical significance of the misfit varied across the data sets.