“No Aid, No Violation/YB”
1. In this equating design, the test takers were randomly assigned to take the old and new forms of the test. Kim and Kolen (2010) pointed out that population invariance was used to ensure that the scale was fixed across all test forms. However, the common-item nonequivalent groups design would be more appropriate, since the non-repeaters and repeaters differ in ability. The common (anchor) items would indicate how the non-repeaters perform in comparison with the repeaters (Kolen and Brennan, 2004).
2. According to Livingston (2004), unsmoothed equipercentile equating (USEE) is generally used when the sample size is large. Kim and Kolen (2010) indicated that non-repeaters are more able than repeaters, which may create irregularities in the data. They also pointed out that the unsmoothed method was most appropriate for the study because it avoids any influence of smoothing (which is used to remove irregularities) on population invariance. For example, smoothing could produce lower standard errors than the USEE (Kolen and Brennan, 2004). As a result, the unsmoothed equipercentile equating method was used.
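To make the unsmoothed method concrete, the following is a minimal sketch of equipercentile equating without smoothing. It is an illustration only, not the procedure used by Kim and Kolen (2010): the function names and the toy frequencies are hypothetical, scores are assumed to be integers starting at 0, and every score point on the second form is assumed to have a nonzero frequency.

```python
import numpy as np

def percentile_ranks(freq):
    """Percentile rank at each integer score: 100 * [F(x - 1) + f(x) / 2]."""
    f = np.asarray(freq, dtype=float)
    f = f / f.sum()                                   # relative frequencies
    cum_below = np.concatenate(([0.0], np.cumsum(f)[:-1]))
    return 100.0 * (cum_below + f / 2.0)

def equipercentile_equate(freq_x, freq_y):
    """Form Y equivalents of each integer score on form X (no smoothing).

    Assumption: all form Y frequencies are positive, so the percentile
    rank function of Y is strictly increasing and can be inverted by
    linear interpolation.
    """
    p_x = percentile_ranks(freq_x)
    g = np.asarray(freq_y, dtype=float)
    g = g / g.sum()
    # Each integer score y is treated as uniform on [y - 0.5, y + 0.5],
    # which makes the percentile rank function piecewise linear with
    # knots at the interval boundaries.
    y_knots = np.arange(len(g) + 1) - 0.5
    q_knots = 100.0 * np.concatenate(([0.0], np.cumsum(g)))
    return np.interp(p_x, q_knots, y_knots)

# Hypothetical frequencies for two four-point forms (scores 0 to 3).
equated = equipercentile_equate([4, 3, 2, 1], [1, 2, 3, 4])
print(np.round(equated, 2))
```

With these made-up frequencies, form X scores 0 through 3 map to roughly 1.00, 2.33, 3.00, and 3.38 on form Y. Because no smoothing is applied, any irregularity in the observed frequencies carries straight through to the equated scores.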
3. The equally weighted root expected mean square difference (ewREMSD) was used in the study to give equal weight to all score points and to examine the impact of the subgroups on the test takers' success or failure designations. The standardized root squared difference (RSD) was used to determine the equating difference between the subgroups: the RSD index compares the equating functions for the non-repeaters and repeaters with that for the total group. The RWSD compared the subgroups by pairing them and assigning an index to them in order to detect population dependence of equating for the groups. The purpose of the standardized difference that matters (DTM) was to evaluate the magnitude of the RMSD, RSD, REMSD, and ewREMSD (Kim & Kolen, 2010).
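As a rough illustration of how such indices are computed, the sketch below uses the commonly cited forms of the RMSD, REMSD, and ewREMSD: squared differences between each subgroup's equating function and the total-group function, weighted across subgroups and divided by the standard deviation of the reference form. The function names, argument names, and the half-score-unit DTM are assumptions made for this example, not details taken from Kim and Kolen (2010).

```python
import numpy as np

def invariance_indices(e_total, e_sub, w_sub, p_x, sd_y):
    """Return (RMSD per score point, REMSD, ewREMSD).

    e_total : equated scores from the total group, one per raw score
    e_sub   : equated scores from each subgroup, shape (groups, scores)
    w_sub   : subgroup weights (for example, relative subgroup sizes)
    p_x     : total-group relative frequency at each raw score
    sd_y    : standard deviation of scores on the reference form
    """
    e_total = np.asarray(e_total, dtype=float)
    e_sub = np.asarray(e_sub, dtype=float)
    w = np.asarray(w_sub, dtype=float)
    w = w / w.sum()
    p = np.asarray(p_x, dtype=float)
    p = p / p.sum()

    sq_diff = (e_sub - e_total) ** 2        # subgroup vs. total, squared
    msd = w @ sq_diff                       # weighted over subgroups
    rmsd = np.sqrt(msd) / sd_y              # one value per score point
    remsd = np.sqrt(p @ msd) / sd_y         # score points weighted by p_x
    ew_remsd = np.sqrt(msd.mean()) / sd_y   # score points weighted equally
    return rmsd, remsd, ew_remsd

def exceeds_dtm(index, sd_y, dtm_raw=0.5):
    """Flag values above a standardized DTM.

    Assumption: the DTM is half a raw score unit, standardized by sd_y.
    """
    return np.asarray(index) > dtm_raw / sd_y

# Hypothetical two-subgroup example (e.g. non-repeaters and repeaters).
rmsd, remsd, ew_remsd = invariance_indices(
    e_total=[0, 1, 2],
    e_sub=[[1, 2, 3], [-1, 0, 1]],
    w_sub=[0.5, 0.5],
    p_x=[0.2, 0.5, 0.3],
    sd_y=2.0,
)
print(rmsd, remsd, ew_remsd, exceeds_dtm(rmsd, 2.0))
```

The only difference between the REMSD and the ewREMSD in this sketch is the weighting of score points: the first uses the total-group score distribution, while the second treats every score point alike, which matters when pass/fail cut scores fall where few test takers score.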
4. According to Kolen and Brennan (2004), the region of the score distribution over which equating precision and reliability are estimated should cover a range of six standard deviation units, from -3 SD to +3 SD. This will enable psychometricians to represent scores using the normal curve.
5. The non-equivalent groups with anchor test (NEAT) design would be used to equate multiple forms of the MCAT. Under the NEAT design, different test forms are administered to different populations. The new and reference forms will include a set of common (anchor) items, which will serve as the link for equating the new form (Hou, Wang, & Vispoel, 2008). Chained equating (CE) and poststratification equating (PSE) are two widely used equating methods under the NEAT design (Kolen & Brennan, 2004; Hou et al., 2008). Since there is a moderate to large ability difference between the groups (repeaters and non-repeaters), both will be used to test the equating conditions. These methods will be examined using the...