Research Article  |   September 2010
Cross-Regional Validation of the School Version of the Assessment of Motor and Process Skills
Author Affiliations
  • Michaela Munkholm, RegOT, MSc, is PhD Candidate, Department of Community Medicine and Rehabilitation, Division of Occupational Therapy, Umeå University, SE-901 87 Umeå, Sweden; michaela.munkholm@occupther.umu.se
  • Brett Berg, OTR, is Data Manager, AMPS Project International, Fort Collins, CO
  • Britta Löfgren, RegOT, PhD, is Assistant Professor, Department of Community Medicine and Rehabilitation, Division of Occupational Therapy, Umeå University, Umeå, Sweden
  • Anne G. Fisher, OT, ScD, is Professor, Department of Community Medicine and Rehabilitation, Division of Occupational Therapy, Umeå University, Umeå, Sweden
Article Information
Education of OTs and OTAs / Neurologic Conditions / Pediatric Evaluation and Intervention / Rehabilitation, Participation, and Disability / School-Based Practice / Work and Industry / Childhood and Youth
American Journal of Occupational Therapy, September/October 2010, Vol. 64, 768-775. doi:10.5014/ajot.2010.09041
Abstract

OBJECTIVE. The objective was to determine whether the School Version of the Assessment of Motor and Process Skills (School AMPS) is valid when used to evaluate students in different world regions.

METHOD. Participants were 984 students, ages 3–13 yr, from North America, Australia and New Zealand, United Kingdom, and the Nordic countries, matched for age and diagnoses. We used FACETS many-faceted Rasch analyses to generate item difficulty calibrations by region and evaluate for significant differential item functioning (DIF) and differential test functioning (DTF).

RESULTS. Four School AMPS items demonstrated DIF but resulted in no DTF.

CONCLUSION. This study provided support for occupational therapists using the School AMPS to evaluate students’ quality of schoolwork task performances across regions because the School AMPS measures are free of bias associated with world region.

The purpose of this study was to evaluate whether the School Version of the Assessment of Motor and Process Skills (School AMPS; Fisher, Bryze, Hume, & Griswold, 2007) is valid for evaluating students in different world regions. More specifically, our plan was to evaluate for cross-regional differential item functioning (DIF) as a means of examining the validity of the School AMPS scales. If the School AMPS is free of DIF among world regions, the school motor and school process skill item hierarchies will be the same (i.e., skill item calibrations will remain stable), such that easy items are easier for students in all regions and harder items are harder for students in all regions. By contrast, DIF occurs when groups (e.g., students in different world regions) who are supposed to be comparable on the construct being measured exhibit differences in skill item calibrations (Wilson, 2005).
Presence of DIF can result in students from one world region being placed at a disadvantage when their School AMPS results are compared with the School AMPS results of students from other regions. Therefore, it is also important to determine whether DIF leads to differential test functioning (DTF). DTF is important because the interpretation of a student’s measure and any subsequent decisions are made on the basis of the performance on the whole test, not individual items (Boomsma, van Duijn, & Snijders, 2001; Pae & Park, 2006; Wright & Stone, 1999). When such decisions are made, therefore, it is important that they be based on valid interpretations of the results of the School AMPS evaluation.
The School AMPS (Fisher et al., 2007) is an internationally used occupational therapy assessment developed in response to a need for a valid and clinically useful tool for measuring the quality of schoolwork task performance as it is observed within the natural classroom setting. When the School AMPS is administered, the occupational therapist observes the student perform at least two schoolwork tasks and later scores the student’s observed quality of schoolwork task performance on the 16 school motor skills and 20 school process skills (Figure 1). These skills are observable, goal-directed actions carried out one by one when performing a schoolwork task. When actions are linked together, they result in a chain of actions that constitutes the schoolwork task performance (Fisher, 2006).
Figure 1.
Occupational performance skills included in the School Version of the Assessment of Motor and Process Skills (skill items).
More specifically, the school motor and school process skills are the smallest observable units of occupational performance (performance skills), not underlying body functions (Fisher, 2006). Each School AMPS skill (item) is rated in terms of any observed increase in physical effort, decrease in efficiency, decrease in safety, and frequency of assistance provided in relation to that action. For example, if a student is instructed by the teacher and then observed to fill in short answers in a workbook, he or she is scored on the School AMPS task WR–3, Short answer (numbers or words). As the student is writing, the occupational therapist observes the student’s quality of performance when he or she is reaching for, grasping, choosing, and lifting a pencil; gathering the pencil to the paper; and then initiating writing a word. When scoring the school motor skill Reaches, the occupational therapist would then consider the standardized scoring examples for Reaches in the School AMPS manual (Fisher et al., 2007) and score Reaches = 4 if the student readily reached for the pencil without evidence of any increased physical effort or delay. The occupational therapist would score Reaches = 3 if he or she questioned whether the student demonstrated increased effort when reaching for the pencil. The occupational therapist would score Reaches = 2 if he or she clearly observed increased effort or inefficient use of time when reaching for the pencil. Finally, the occupational therapist would score Reaches = 1 if the student attempted to reach but was unable to secure the pencil by himself or herself, needed assistance, or was at imminent risk of falling when reaching for the pencil.
The 26 schoolwork tasks included in the School AMPS manual range from simple to complex and are divided into five categories: pen-and-pencil writing tasks, drawing and coloring tasks, cutting and pasting tasks, computer writing tasks, and math manipulative tasks. These tasks are among the tasks most commonly performed in preschool and elementary school settings. The tasks are defined broadly enough to allow for teacher-specified variations in how the task is to be performed. The School AMPS items, therefore, are thought to be free of influence from culture or geographic region. If this assumption is true, students from any world region can be observed without bias when performing School AMPS schoolwork tasks, despite cultural variations. That is, DIF and, in turn, DTF should not be present.
Preliminary evaluation for presence of DIF in the school motor and school process skill items and tasks among three major world regions (North America, Europe, and Australasia) using many-faceted Rasch (MFR; Fisher, 1993; Linacre, 1993) analyses revealed that three school motor items (Walks, Endures, and Paces), no school process items, and none of the schoolwork tasks demonstrated DIF (Fisher et al., 2007). Our plan was to extend these preliminary results by (1) including all world regions with sufficient data in the School AMPS database (i.e., n ≥ 200; Linacre, 1994; Tennant & Pallant, 2007; Zumbo, 1999); (2) separating Europe into three regions: United Kingdom, other European countries, and the Nordic countries; and (3) analyzing for DTF among all these regions. More specifically, our goal was to contribute to validation of the cross-regional usefulness of the School AMPS by answering the following research questions:
  1. Do the school motor and school process skill item difficulty calibrations differ among world regions?

  2. Do the school motor and school process skill item difficulty calibrations for each included region differ from those of a combined sample that includes all regions?

  3. If DIF is present, does it affect the students’ quality of schoolwork performance measures (i.e., is there evidence of DTF)?

Method
Participants
After receiving human subjects approval from Umeå University’s research ethics committee, we selected participants from the sample of all students ages 3–15 yr in the School AMPS database. We selected (1) students with educational (e.g., reading disorder), medical (e.g., autism, cerebral palsy), or occupational therapy (e.g., sensory integrative disorder) diagnoses; (2) students at risk (i.e., students without an identified disability but whose teachers had expressed concerns related to behavioral problems or risk for academic failure; Heward, 2000); and (3) typically developing students with no known diagnoses or behavior problems. In total, 4,043 students met the inclusion criteria. All students were evaluated by raters who had been trained to administer and score the School AMPS in a valid and reliable manner according to the procedures outlined in the School AMPS manual (Fisher et al., 2007). Students were divided into groups by world region: North America (NA), Australia and New Zealand (ANZ), Asia, South America, United Kingdom (UK), other European countries, and Nordic countries.
Because insufficient data from Asia (n = 52), South America (n = 40), and other European countries (n = 147) were available, data from those regions were excluded. Students from the four remaining regions were then matched for age and diagnoses; the person who performed the matching was blind to the student’s school motor and school process quality of schoolwork performance measures. The final sample consisted of 984 students (246 students from each region) ranging in age from 3 to 13 yr (Table 1). Gender was not considered in this study because previous research has shown that no DIF is associated with gender in the School AMPS (Fisher et al., 2007).
Table 1.
Description of the Sample Characteristics by World Region

| Characteristic | North America (n = 246) | United Kingdom (n = 246) | Nordic Countries (n = 246) | Australia and New Zealand (n = 246) | Total (N = 984) |
| --- | --- | --- | --- | --- | --- |
| Gender | | | | | |
| Boys | 173 | 170 | 171 | 174 | 688 |
| Girls | 73 | 76 | 75 | 72 | 296 |
| Diagnosis groups | | | | | |
| Typically developing | 37 | 37 | 37 | 37 | 148 |
| At risk | 17 | 17 | 17 | 17 | 68 |
| Mild | 46 | 46 | 46 | 46 | 184 |
| Developmental disabilities or neurologic disorders | 110 | 110 | 110 | 110 | 440 |
| Cognitive or psychiatric diagnoses | 9 | 9 | 9 | 9 | 36 |
| Other or unknown diagnoses or multiple diagnoses | 27 | 27 | 27 | 27 | 108 |

Note. Typically developing = students without disabilities; at risk = students at risk for academic or developmental delays, determined on the basis of teachers’ concerns; mild = mild educational disabilities (i.e., developmental coordination disorder, attention deficits, speech and language disorders). Mean age (SD) = 6.86 (2.3) years.
Instrumentation
Because the School AMPS is designed to be used to evaluate the quality of the student’s schoolwork task performance in the natural classroom setting, the administration of the School AMPS begins with the occupational therapist’s interview of the teacher to gain an understanding of the student, his or her performance context (classroom environment, daily school routines), and which schoolwork tasks the teacher identifies as problematic or as presenting a challenge for the student. After the interview, the occupational therapist unobtrusively observes the student in his or her natural classroom environment, during normal classroom routines, as he or she completes a minimum of two schoolwork tasks that have been assigned by the teacher. During the observation, the occupational therapist takes observational notes that he or she later uses to score the quality of the student’s schoolwork task performance. The School AMPS administration has been described in more detail elsewhere (Fisher et al., 2007; Munkholm & Fisher, 2008).
After observing the student, the occupational therapist separately scores the quality of observed performance by using the detailed scoring criteria for the 16 school motor and 20 school process skill items; a 4-point rating scale (4 = competent, 3 = questionable, 2 = ineffective, 1 = deficit) is used. When scoring is completed, the occupational therapist enters the 16 school motor and 20 school process skill item raw scores for each task observed into the School AMPS computer-scoring program (Three Star Press, 2005). The School AMPS computer-scoring program is used to implement MFR analyses (Fisher, 1993; Linacre, 1993), which convert the student’s raw scores into two linear quality of schoolwork performance measures—one for school motor quality of performance and one for school process quality of performance. During this conversion of raw scores into the linearized measures, the computer-scoring program adjusts the final quality of schoolwork performance estimates by accounting for the difficulty of the school motor and school process skill items, the challenge of the tasks performed, and the severity of the rater who scored the performance. That is, the MFR analyses simultaneously consider four facets (items, tasks, students, and raters) in the estimation of each student’s quality of schoolwork performance measures.
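As a hedged sketch, the four-facet adjustment described above can be written in the general many-facet rating scale form. The symbols are illustrative rather than taken from the School AMPS manual, and the orientation (sign) of each facet term is an assumption of this sketch.

```latex
\ln\!\left(\frac{P_{nitjk}}{P_{nitj(k-1)}}\right) = B_n - D_i - T_t - C_j - F_k
```

Here $B_n$ is the quality of performance of student $n$, $D_i$ is the difficulty of skill item $i$, $T_t$ is the challenge of task $t$, $C_j$ is the severity of rater $j$, and $F_k$ is the difficulty of being rated in category $k$ rather than category $k-1$. Because School AMPS item calibrations are reported so that higher values indicate easier items, a reported calibration would correspond to $-D_i$ in this notation.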
Previous reliability and validity studies have shown evidence of internal scale validity, person response validity, and rater reliability in the School AMPS (Atchison, Fisher, & Bryze, 1998; Fisher, Bryze, & Atchison, 2000; Fisher et al., 2007). The School AMPS skill items and tasks have been shown to be sensitive enough to differentiate between groups such that (1) the school motor and school process quality of schoolwork performance measures of typically developing students increase with age and (2) the mean school motor and school process quality of schoolwork performance measures for typically developing students are significantly higher than the mean quality of schoolwork performance measures for students with disabilities (at risk, mild, developmental/neurological, cognitive/psychological, and other multiple). Additional research has further confirmed that the mean school motor and school process quality of schoolwork performance measures are able to differentiate between typically developing students and students at risk (Fisher & Duran, 2004) and between typically developing students and students with mild disabilities (Munkholm & Fisher, 2008).
Procedures and Data Analyses
For this study, all School AMPS item calibration values were generated using an MFR computer software program (FACETS; Linacre, 2009). We performed two analyses (one for school motor and one for school process); in each of these analyses, the task challenges and rater severities were anchored at preestablished values, on the basis of the current School AMPS computer-scoring program (Three Star Press, 2005). More specifically, FACETS converts the ordinal school motor and school process raw item scores into linear (equal-interval) quality of schoolwork performance measures (one for school motor and one for school process quality of schoolwork performance) and, at the same time, computes item calibration values that define the hierarchical order of item difficulties. The school motor and school process quality of schoolwork performance measures and item difficulties are expressed in logits (log-odds probability units). Higher quality of schoolwork performance measures reflect more able students, and higher item calibration values represent easier skill items. See Fisher (1993) for a more detailed explanation of MFR analyses.
When DIF analyses are specified in FACETS, item difficulties are estimated for the entire group (total sample) as well as for each group, which in this case is each world region. Because the item difficulty calibrations for each group are positioned along the same linear scale, and the mean of each group’s item difficulty calibrations is set at zero, we can compare the item difficulty calibrations for each group directly by calculating the logit differences between regions. Also included in the results are t tests, which can be used to evaluate whether item difficulty calibrations differ statistically between regions.
The results of these t tests, however, are not recommended as the sole criteria for determining the presence of DIF because large sample sizes are typically associated with small standard errors (SEs), which increases the risk of excessive power and of overidentifying significant differences on the basis of p values alone (Wilson, 2005). The use of effect sizes, therefore, has become more common. Standards for important effect sizes are lacking, but recommended values are typically 0.40–0.60 logit (Conrad, Dennis, Bezruczko, Funk, & Riley, 2007; Draba, 1977; Linacre, 1994; Tristán, 2006). The strictest values were developed by I. Paek (personal communication, January 7, 2009), who transformed Rasch-modeled logit differences into effect sizes on the basis of the Educational Testing Service (ETS) method for classifying DIF (Wilson, 2005). Paek found that a logit difference of <0.426 is negligible, 0.426–0.638 is slight to moderate, and >0.638 is moderate to severe. It remains common practice, however, to consider values <0.50 logit as evidence of no DIF (Draba, 1977; Tennant & Pallant, 2007; Tristán, 2006), because “measures based on item calibration[s] with random deviations up to 0.50 logit are ‘for all practical purposes free from bias’” (Linacre, 1994, p. 328). Conrad et al. (2007) used a more liberal 0.60 logit criterion, based on Norman, Sloan, and Wyrwich (2003), who recommended the use of half of a standard deviation as an indicator of a clinically important difference.
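The effect-size bands attributed to Paek above can be expressed as a small lookup. This is a minimal sketch; the function name `classify_dif` is hypothetical and only restates the cutoffs quoted in the text.

```python
def classify_dif(logit_difference: float) -> str:
    """Classify a logit difference using the cutoffs attributed to Paek in the text."""
    size = abs(logit_difference)  # direction of the difference does not matter
    if size < 0.426:
        return "negligible"
    if size <= 0.638:
        return "slight to moderate"
    return "moderate to severe"


# Illustrative differences (not taken from the study results).
for diff in (0.30, 0.50, 0.62, -0.70):
    print(diff, classify_dif(diff))
```

Under these bands, a difference of 0.50 or 0.60 logit falls in the slight-to-moderate range, which is one way to see why the criteria cited above differ in strictness.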
Perhaps the most comprehensive analysis to determine a critical value for an important effect was implemented by Tristán (2006). He developed a method for evaluating for significant DIF on the basis of the use of normalized SEs, where [equation image not available]. Tristán found that when SEs were normalized, the minimum possible SE is 0.20 logit. With SE values of 0.20 logit, a difference in item calibration values of 0.55 logit would be required for statistical significance. That is, the statistical test for a significant difference in item calibration values is $t = (d_1 - d_2) / \sqrt{SE_1^2 + SE_2^2}$ (Wright & Stone, 1979), where $d_1$ and $d_2$ are the two item calibration values being compared and $SE_1$ and $SE_2$ are their standard errors. Thus, 0.55 logit calibration difference $/ \sqrt{0.20^2 + 0.20^2} \approx 1.94$, p = .05.
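The arithmetic behind the 0.55 logit value can be checked with a short sketch of the standardized difference between two calibrations, which is the difference divided by the square root of the sum of the squared SEs. The function names are hypothetical, and the 1.96 critical value is the usual two-tailed normal value, assumed here rather than quoted from the source.

```python
import math


def dif_t_statistic(calibration_a, calibration_b, se_a, se_b):
    """Standardized difference: (d_a - d_b) / sqrt(se_a^2 + se_b^2)."""
    return (calibration_a - calibration_b) / math.sqrt(se_a**2 + se_b**2)


def critical_difference(se_a, se_b, critical_value=1.96):
    """Smallest calibration difference that reaches the critical value."""
    return critical_value * math.sqrt(se_a**2 + se_b**2)


# With both SEs at the 0.20 logit minimum mentioned in the text:
print(round(dif_t_statistic(0.55, 0.0, 0.20, 0.20), 2))  # 1.94
print(round(critical_difference(0.20, 0.20), 2))          # 0.55
```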
We therefore chose to set our criteria for the presence of significant DIF on the basis of a logit difference of at least ±0.55 logit between regions. Thus, when the skill item hierarchies remain stable (i.e., the difference between regions is >−0.55 and <0.55 logit), there is no statistical DIF. If, however, the school motor and school process item difficulty calibrations differ statistically, there is a need to further investigate whether the identified DIF has a substantively meaningful impact on the measurement system. To determine whether any identified DIF resulted in a substantive impact, our plan was to evaluate for DTF. This is done by plotting student quality of schoolwork performance measures on the basis of the item difficulty calibrations for one region against the student quality of schoolwork performance measures on the basis of the item difficulty calibrations for another region and then evaluating whether the paired quality of schoolwork performance measures fall within a 95% confidence interval, indicating no evidence of DTF (Wright & Stone, 1979).
Results
The school motor and school process item calibration values for the entire (combined) sample and for each world region are listed in Table 2. Items with logit differences not falling within the acceptable range of −0.55 to 0.55 logit were Walks, Moves, and Endures on the school motor scale and Navigates on the school process scale. Logit differences in item difficulty calibrations between regions for the school motor items ranged from 0.62 to –0.57 logit (SE = 0.06–0.10 logit), and for the school process items, the differences in item difficulty calibrations between regions ranged from 0.46 to –0.56 logit (SE = 0.05–0.06 logit).
Table 2.
School Motor and School Process Skill Item Difficulty Calibration Measures (Logits)

| Skill Item | World Regions (Combined) | North America | United Kingdom | Nordic Countries | Australia and New Zealand |
| --- | --- | --- | --- | --- | --- |
| School Motor Skill Items | | | | | |
| Endures | 2.11 | 2.17 | 2.37 | 2.18 | 1.81 |
| Lifts | 1.56 | 1.38 | 1.56 | 1.55 | 1.72 |
| Transports | 1.29 | 1.44 | 1.26 | 1.35 | 1.16 |
| Reaches | 1.25 | 1.39 | 1.07 | 1.42 | 1.20 |
| Bends | 1.25 | 1.31 | 1.23 | 1.49 | 1.03 |
| Moves | 0.96 | 1.27 | 0.67 | 1.24 | 0.82 |
| Walks | 0.82 | 1.15 | 0.88 | 0.85 | 0.53 |
| Stabilizes | 0.14 | 0.12 | 0.13 | 0.37 | −0.06 |
| Aligns | −0.67 | −0.52 | −0.78 | −0.77 | −0.61 |
| Grips | −0.71 | −0.90 | −0.56 | −0.76 | −0.62 |
| Flows | −1.02 | −1.04 | −0.94 | −1.31 | −0.82 |
| Coordinates | −1.07 | −1.10 | −0.98 | −1.30 | −0.91 |
| Calibrates | −1.33 | −1.41 | −1.34 | −1.43 | −1.13 |
| Manipulates | −1.39 | −1.41 | −1.36 | −1.50 | −1.29 |
| Positions | −1.56 | −1.57 | −1.55 | −1.57 | −1.55 |
| Paces | −1.64 | −1.93 | −1.55 | −1.54 | −1.55 |
| School Process Skill Items | | | | | |
| Chooses | 1.41 | 1.62 | 1.47 | 1.23 | 1.35 |
| Searches/Locates | 1.10 | 1.14 | 1.13 | 1.19 | 0.95 |
| Uses | 1.02 | 1.08 | 0.90 | 1.05 | 1.04 |
| Gathers | 0.94 | 0.86 | 1.04 | 0.88 | 0.97 |
| Inquires | 0.87 | 0.74 | 0.96 | 0.94 | 0.85 |
| Adjusts | 0.84 | 0.59 | 0.62 | 1.08 | 1.12 |
| Sequences | 0.82 | 0.92 | 0.84 | 0.71 | 0.82 |
| Navigates | 0.67 | 0.82 | 0.36 | 0.92 | 0.63 |
| Terminates | −0.06 | −0.06 | 0.00 | −0.16 | −0.03 |
| Heeds | −0.08 | 0.00 | −0.03 | 0.02 | −0.33 |
| Restores | −0.12 | −0.16 | −0.02 | −0.34 | −0.01 |
| Organizes | −0.18 | −0.24 | −0.20 | −0.15 | −0.12 |
| Handles | −0.30 | −0.22 | −0.32 | −0.45 | −0.23 |
| Initiates | −0.58 | −0.47 | −0.50 | −0.73 | −0.63 |
| Paces | −0.58 | −0.66 | −0.54 | −0.57 | −0.55 |
| Continues | −0.65 | −0.82 | −0.64 | −0.54 | −0.62 |
| Attends | −0.77 | −0.90 | −0.69 | −0.73 | −0.77 |
| Notices/Responds | −0.77 | −0.69 | −0.80 | −0.85 | −0.77 |
| Benefits | −1.44 | −1.44 | −1.35 | −1.41 | −1.57 |
| Accommodates | −2.11 | −1.98 | −2.06 | −2.19 | −2.20 |

Note. Item difficulty calibration values are listed in hierarchical order on the basis of the combined values. The higher the value is, the easier the item is.
The school motor item Walks (diminished skill walking during schoolwork task performance) was easier for students in NA (1.15 logits) than for students in ANZ (0.53 logit). Moves (diminished skill pushing or pulling task objects during schoolwork task performance) was harder for students in the UK (0.67 logit) than for those in NA or the Nordic countries (1.27 and 1.24 logits, respectively). Endures (obvious evidence of physical fatigue during schoolwork task performance) was easier for students in the UK (2.37 logits) than for students in ANZ (1.81 logits). Finally, the school process item Navigates (obviously bumping into obstacles when moving through space during schoolwork task performance) was harder for students in the UK (0.36 logit) than for students in the Nordic countries (0.92 logit).
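The pairwise comparison behind these findings can be sketched as follows. The calibration values are copied from Table 2 for six items only (the four flagged items plus Lifts and Adjusts for contrast); the names `CALIBRATIONS` and `flagged_pairs` are hypothetical, and the sketch applies only the ±0.55 logit criterion, not the accompanying t tests.

```python
from itertools import combinations

DIF_CRITERION = 0.55  # logit criterion adopted in the text

# Regional item calibrations (logits) from Table 2, in fixed region order.
REGIONS = ("NA", "UK", "Nordic", "ANZ")
CALIBRATIONS = {
    "Endures":   (2.17, 2.37, 2.18, 1.81),
    "Lifts":     (1.38, 1.56, 1.55, 1.72),
    "Moves":     (1.27, 0.67, 1.24, 0.82),
    "Walks":     (1.15, 0.88, 0.85, 0.53),
    "Adjusts":   (0.59, 0.62, 1.08, 1.12),
    "Navigates": (0.82, 0.36, 0.92, 0.63),
}


def flagged_pairs(values):
    """Return (region_a, region_b, difference) for pairs at or beyond the criterion."""
    pairs = []
    for (region_a, a), (region_b, b) in combinations(zip(REGIONS, values), 2):
        diff = round(a - b, 2)  # round to the two decimals reported in the table
        if abs(diff) >= DIF_CRITERION:
            pairs.append((region_a, region_b, diff))
    return pairs


dif_items = {item: flagged_pairs(vals) for item, vals in CALIBRATIONS.items()}
dif_items = {item: pairs for item, pairs in dif_items.items() if pairs}
for item, pairs in dif_items.items():
    print(item, pairs)
```

Of the six items included, only Endures, Moves, Walks, and Navigates have at least one regional pair that differs by 0.55 logit or more.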
When we further investigated for possible sources of DIF, we could not find any systematic pattern related to age, gender, diagnosis, task, rater, or region that could explain the DIF. Moreover, when the item calibration values for each region were compared with the item calibration values for the entire sample (all four regions combined), we found that none of the school motor and school process items displayed DIF. The differences in item difficulty calibrations ranged from 0.33 to –0.30 logit (SE = 0.06–0.12 logit) on the school motor scale and from 0.28 to –0.31 logit (SE = 0.05–0.07 logit) on the school process scale.
Finally, to evaluate for DTF, we plotted the students’ quality of schoolwork performance measures based on the item difficulty calibration values for one region against their measures based on the item difficulty calibration values for each of the other three regions. This strategy resulted in 12 comparisons, 6 for school motor and 6 for school process quality of performance measures. The results revealed that the student school motor and school process quality of performance measures for all four regions were similar (see Figure 2 for the most extreme example). Differences ranged from –0.16 to 0.16 logit. We concluded, therefore, that any significant DIF had no impact on the School AMPS measurement system for students from these four regions. That is, the School AMPS is free of substantive DTF among NA, ANZ, the UK, and the Nordic countries.
Figure 2.
Scatterplot of the school motor quality of performance measures based on Australia and New Zealand (ANZ) item calibration values compared with the school motor quality of performance measures based on North America (NA) item calibration values.
Note. The dashed lines represent 95% confidence interval control lines based on the average school motor standard error (mean = 0.29, standard deviation = 0.04).
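The control-line check can be illustrated with a simplified sketch. It assumes that two measures of the same student agree when their difference is no larger than 1.96 times the joint standard error, and it uses the average school motor SE from the Figure 2 note for both measures. The function name and the example measure of 1.00 logit are hypothetical; the actual control lines may have been constructed differently.

```python
import math


def within_control_lines(measure_a, measure_b, se_a, se_b, z=1.96):
    """True when two measures differ by no more than z times their joint SE."""
    return abs(measure_a - measure_b) <= z * math.sqrt(se_a**2 + se_b**2)


mean_se = 0.29             # average school motor SE from the Figure 2 note
largest_difference = 0.16  # largest difference reported in the text

# Half-width of the 95% band when both measures carry the average SE.
half_width = 1.96 * math.sqrt(2 * mean_se**2)
print(round(half_width, 2))  # 0.8

# Hypothetical student measured at 1.00 logit under one set of calibrations.
print(within_control_lines(1.00, 1.00 + largest_difference, mean_se, mean_se))  # True
```

Under these assumptions, the largest reported difference of 0.16 logit is well inside a band of roughly ±0.80 logit.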
Discussion
The purpose of this study was to evaluate whether the School AMPS (Fisher et al., 2007) is valid for use to evaluate students in different world regions. In accordance with the MFR model of the School AMPS (Linacre, 1993), we proposed that easy items should be easier for students in all regions, and harder items should be harder for students in all regions. Thus, we hypothesized that the school motor and school process skill items would retain their hierarchical order of difficulty regardless of region where the students were evaluated. However, when each region was compared with the other three, we found the hierarchical order of difficulty for the school motor items Walks, Moves, and Endures, and the school process item Navigates to be unstable (logit difference ≤−0.55 or ≥0.55 logit). Yet, when the school motor and school process skill item difficulty calibrations for each region were compared with the reference values for the combined item difficulty calibrations of all four regions, all school motor and school process items remained free of DIF. We considered the latter comparison to be most important, because the School AMPS computer-scoring program (Three Star Press, 2005) is based on total sample calibrations. Consequently, when the occupational therapist is evaluating a student and enters the School AMPS skill item raw scores into the School AMPS computer-scoring program, the student’s quality of schoolwork performance measures are not affected by the world region in which he or she lives. Nevertheless, we investigated further for possible DTF and found no disruption of the measurement system despite minimal DIF among some school motor and school process skills. Considered together, our results add to already existing evidence that supports cross-regional use and comparisons of School AMPS results among NA, ANZ, the UK, and the Nordic countries.
Clinical Implications
One aspect of validity is related to test fairness and the meaningful interpretation of assessment data (Downing, 2003; Perrone, 2006). The results of this study provide support for occupational therapists using the School AMPS in NA, ANZ, the UK, and the Nordic countries in that the School AMPS measures are free of regional bias. Despite minimal DIF, it is clear that the School AMPS item calibration differences are not enough to affect the interpretation of the School AMPS test results of students from these four regions.
Limitations
A limitation of this study was that we were unable to identify the source of the DIF. In part, this limitation was because only the students’ age, gender, diagnosis, tasks performed, rater, and world region of residence are stored in the School AMPS database. Thus, we were unable to investigate whether any other variables might have affected our findings. An additional limitation of this study was that we had enough data only from regions representative of Western cultures. Thus, it remains unknown whether our results can be generalized to non-Western cultures where the School AMPS is being used (e.g., Asia).
Recommendations for Future Research
The presence of minimal DIF but no DTF among world regions, together with the absence of DIF when the combined international item calibration values served as the primary reference values, supports the use of the School AMPS in future cross-cultural research. A need remains, however, to further examine for DIF among regions not included in this study (e.g., Asia, South America). Of particular interest would be future research that examines for DIF between Asian item calibration values and the item calibration values for Western regions. Another recommendation for future research would be to determine whether differences exist in normative school motor and school process quality of performance means by region.
Acknowledgment
This study was supported by funding from the Swedish Research Council (Vetenskapsrådet).
References
Atchison, B. T., Fisher, A. G., & Bryze, K. (1998). Rater reliability and internal scale and person response validity of the School Assessment of Motor and Process Skills. American Journal of Occupational Therapy, 52, 843–850.
Boomsma, A., van Duijn, M. A. J., & Snijders, T. A. B. (Eds.). (2001). Essays on item response theory. New York: Springer-Verlag.
Conrad, K. J., Dennis, M. L., Bezruczko, N., Funk, R. R., & Riley, B. B. (2007). Substance use disorder symptoms: Evidence of differential item functioning by age. Journal of Applied Measurement, 8, 373–387.
Downing, S. M. (2003). Validity: On the meaningful interpretation of assessment data. Medical Education, 37, 830–837. doi: 10.1046/j.1365-2923.2003.01594.x
Draba, R. E. (1977). The identification and interpretation of item bias. MESA Memorandum No. 25. Retrieved April 8, 2010, from www.rasch.org/memo25.htm
Fisher, A. G. (1993). The assessment of IADL motor skills: An application of many-faceted Rasch analysis. American Journal of Occupational Therapy, 47, 319–329.
Fisher, A. G. (2006). Performance skills: Definitions and evaluation in the context of the Occupational Therapy Practice Framework. In H. Pendleton & W. Schultz-Krohn (Eds.), Pedretti’s occupational therapy: Practice skills for physical dysfunction (6th ed., pp. 372–402). St. Louis, MO: Mosby.
Fisher, A. G., Bryze, K., & Atchison, B. T. (2000). Naturalistic assessment of functional performance in school-settings: Reliability and validity of the School AMPS scales. Journal of Outcome Measurement, 4, 491–512.
Fisher, A. G., Bryze, K., Hume, V., & Griswold, L. A. (2007). School AMPS: School Version of the Assessment of Motor and Process Skills (2nd ed.). Fort Collins, CO: Three Star Press.
Fisher, A. G., & Duran, G. A. (2004). Schoolwork task performance of students at risk of delays. Scandinavian Journal of Occupational Therapy, 11, 191–198. doi: 10.1080/11038120410003664
Heward, W. L. (2000). Exceptional children: An introduction to special education (6th ed.). Upper Saddle River, NJ: Merrill.
Linacre, J. M. (1993). Many-faceted Rasch measurement (2nd ed.). Chicago: MESA Press.
Linacre, J. M. (1994). Sample size and item calibration (or person measure) stability. Rasch Measurement Transactions, 7, 328.
Linacre, J. M. (2009). FACETS: Rasch measurement computer program (Version 3.65.0) [Computer software]. Chicago: Winsteps.com
Munkholm, M., & Fisher, A. G. (2008). Differences in schoolwork performance between typically-developing students and students with mild disabilities. OTJR: Occupation, Participation and Health, 28, 121–132. doi: 10.3928/15394492-20080601-06
Norman, G. R., Sloan, J. A., & Wyrwich, K. W. (2003). Interpretation of changes in health-related quality of life: The remarkable universality of half a standard deviation. Medical Care, 41, 582–592. doi: 10.1097/00005650-200305000-00004
Pae, T., & Park, G. P. (2006). Examining the relationship between differential item functioning and differential test functioning. Language Testing, 23, 475–496. doi: 10.1191/0265532206lt338oa
Perrone, M. (2006). Differential item functioning and item bias: Critical considerations in test fairness. Teachers College, Columbia University Working Papers in TESOL and Applied Linguistics, 6, 1–3.
Tennant, A., & Pallant, J. F. (2007). DIF matters: A practical approach to test if differential item functioning makes a difference. Rasch Measurement Transactions, 20, 1082–1084.
Three Star Press. (2005). School AMPS 2005 computer-scoring software. Fort Collins, CO: Author.
Tristán, A. (2006). An adjustment for sample size in DIF analysis. Rasch Measurement Transactions, 20, 1070–1071.
Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum.
Wright, B., & Stone, M. (1979). Best test design: Rasch measurement. Chicago: MESA Press.
Wright, B., & Stone, M. (1999). Measurement essentials (2nd ed.). Wilmington, DE: Wide Range.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, Ontario, Canada: Directorate of Human Resources Research and Evaluation, Department of National Defense.
Figure 1.
Occupational performance skills included in the School Version of the Assessment of Motor and Process Skills (skill items).
Figure 2.
Scatterplot of the school motor quality of performance measures based on Australia and New Zealand (ANZ) item calibration values compared with the school motor quality of performance measures based on North America (NA) item calibration values.
Note. The dashed lines represent 95% confidence interval control lines based on the average school motor standard error (mean = 0.29, standard deviation = 0.04).
Table 1.
Description of the Sample Characteristics by World Region

| Characteristic | North America (n = 246) | United Kingdom (n = 246) | Nordic Countries (n = 246) | Australia and New Zealand (n = 246) | Total (N = 984) |
| --- | --- | --- | --- | --- | --- |
| Gender | | | | | |
| Boys | 173 | 170 | 171 | 174 | 688 |
| Girls | 73 | 76 | 75 | 72 | 296 |
| Diagnosis groups | | | | | |
| Typically developing | 37 | 37 | 37 | 37 | 148 |
| At risk | 17 | 17 | 17 | 17 | 68 |
| Mild | 46 | 46 | 46 | 46 | 184 |
| Developmental disabilities or neurologic disorders | 110 | 110 | 110 | 110 | 440 |
| Cognitive or psychiatric diagnoses | 9 | 9 | 9 | 9 | 36 |
| Other or unknown diagnoses or multiple diagnoses | 27 | 27 | 27 | 27 | 108 |

Note. Typically developing = students without disabilities; at risk = students at risk for academic or developmental delays, determined on the basis of teachers’ concerns; mild = mild educational disabilities (i.e., developmental coordination disorder, attention deficits, speech and language disorders). Mean age (SD) = 6.86 (2.3) years.
Table 2.
School Motor and School Process Skill Item Difficulty Calibration Measures (Logits)

| Skill Item | World Regions (Combined) | North America | United Kingdom | Nordic Countries | Australia and New Zealand |
| --- | --- | --- | --- | --- | --- |
| School Motor Skill Items | | | | | |
| Endures | 2.11 | 2.17 | 2.37 | 2.18 | 1.81 |
| Lifts | 1.56 | 1.38 | 1.56 | 1.55 | 1.72 |
| Transports | 1.29 | 1.44 | 1.26 | 1.35 | 1.16 |
| Reaches | 1.25 | 1.39 | 1.07 | 1.42 | 1.20 |
| Bends | 1.25 | 1.31 | 1.23 | 1.49 | 1.03 |
| Moves | 0.96 | 1.27 | 0.67 | 1.24 | 0.82 |
| Walks | 0.82 | 1.15 | 0.88 | 0.85 | 0.53 |
| Stabilizes | 0.14 | 0.12 | 0.13 | 0.37 | −0.06 |
| Aligns | −0.67 | −0.52 | −0.78 | −0.77 | −0.61 |
| Grips | −0.71 | −0.90 | −0.56 | −0.76 | −0.62 |
| Flows | −1.02 | −1.04 | −0.94 | −1.31 | −0.82 |
| Coordinates | −1.07 | −1.10 | −0.98 | −1.30 | −0.91 |
| Calibrates | −1.33 | −1.41 | −1.34 | −1.43 | −1.13 |
| Manipulates | −1.39 | −1.41 | −1.36 | −1.50 | −1.29 |
| Positions | −1.56 | −1.57 | −1.55 | −1.57 | −1.55 |
| Paces | −1.64 | −1.93 | −1.55 | −1.54 | −1.55 |
| School Process Skill Items | | | | | |
| Chooses | 1.41 | 1.62 | 1.47 | 1.23 | 1.35 |
| Searches/Locates | 1.10 | 1.14 | 1.13 | 1.19 | 0.95 |
| Uses | 1.02 | 1.08 | 0.90 | 1.05 | 1.04 |
| Gathers | 0.94 | 0.86 | 1.04 | 0.88 | 0.97 |
| Inquires | 0.87 | 0.74 | 0.96 | 0.94 | 0.85 |
| Adjusts | 0.84 | 0.59 | 0.62 | 1.08 | 1.12 |
| Sequences | 0.82 | 0.92 | 0.84 | 0.71 | 0.82 |
| Navigates | 0.67 | 0.82 | 0.36 | 0.92 | 0.63 |
| Terminates | −0.06 | −0.06 | 0.00 | −0.16 | −0.03 |
| Heeds | −0.08 | 0.00 | −0.03 | 0.02 | −0.33 |
| Restores | −0.12 | −0.16 | −0.02 | −0.34 | −0.01 |
| Organizes | −0.18 | −0.24 | −0.20 | −0.15 | −0.12 |
| Handles | −0.30 | −0.22 | −0.32 | −0.45 | −0.23 |
| Initiates | −0.58 | −0.47 | −0.50 | −0.73 | −0.63 |
| Paces | −0.58 | −0.66 | −0.54 | −0.57 | −0.55 |
| Continues | −0.65 | −0.82 | −0.64 | −0.54 | −0.62 |
| Attends | −0.77 | −0.90 | −0.69 | −0.73 | −0.77 |
| Notices/Responds | −0.77 | −0.69 | −0.80 | −0.85 | −0.77 |
| Benefits | −1.44 | −1.44 | −1.35 | −1.41 | −1.57 |
| Accommodates | −2.11 | −1.98 | −2.06 | −2.19 | −2.20 |

Note. Item difficulty calibration values are listed in hierarchical order on the basis of the combined values. The higher the value is, the easier the item is.