Brief Report  |   March 2010
Assessment of Driving Performance Using a Simulator Protocol: Validity and Reproducibility
Author Affiliations
  • Michel Bédard, PhD, is Director, Centre for Research on Safe Driving, Lakehead University, Thunder Bay, Ontario; Northern Ontario School of Medicine, Thunder Bay, Ontario; St. Joseph’s Care Group, Thunder Bay, Ontario; and Public Health Program, Lakehead University, Room MS-2004, 955 Oliver Road, Thunder Bay, Ontario P7B 5E1, Canada; mbedard@lakeheadu.ca
  • Marie Parkkari, MSc, was Graduate Student, Department of Psychology, Centre for Research on Safe Driving, Lakehead University, Thunder Bay, Ontario, at the time of the study
  • Bruce Weaver, MSc, is Research Associate, Centre for Research on Safe Driving, Lakehead University, Thunder Bay, Ontario, and Northern Ontario School of Medicine, Thunder Bay
  • Julie Riendeau, MA, is Research Assistant, Centre for Research on Safe Driving, Lakehead University, Thunder Bay, Ontario
  • Mike Dahlquist was an Undergraduate Student, Centre for Research on Safe Driving, Lakehead University, Thunder Bay, Ontario
American Journal of Occupational Therapy, March/April 2010, Vol. 64, 336–340. doi:10.5014/ajot.64.2.336
Abstract

We examined the validity and reproducibility of simulator-based driving evaluations. In Study 1, we examined correlations among Trails A and B, demerit points for simulated drives, and simulator-recorded errors. With one exception, correlations ranged from .44 (p = .103) to .83 (p = .001). In Study 2, we examined correlations among Trail Making Test Part A, Useful Field of View, and demerit points for simulated drives; correlations ranged from .50 to .82 (all ps < .001). The correlation between demerit points for on-road and simulated drives was .74 (p = .035). We examined reproducibility of simulator assessments using the playback function; intraclass correlation coefficients ranged from .73 to .87 (all ps < .001). These results suggest that simulators could be used to facilitate the evaluation of fitness to drive.

Driving simulators offer many advantages over on-road tests. First, they provide a safe environment for the driver and the evaluator. Second, they enable presentation of situations that would not be available on the road or that would be too risky. Third, they make it possible to test everyone under the same conditions, regardless of weather or location. Yet, we are unlikely to see the adoption of simulators for clinical evaluations until their psychometric properties are well documented and found to be adequate.
One important and frequently studied psychometric property is validity. Although the evidence has suggested that simulators are not perfect surrogates for the on-road setting, research has made a strong case for the claim that people’s behavior on a simulator is similar to their behavior on the road (e.g., Bella, 2008; Godley, Triggs, & Fildes, 2002; Lee, Cameron, & Lee, 2004; Törnros, 1998; Yan, Abdel-Aty, Radwan, Wang, & Chilakapati, 2008). Even self-reports of driving behavior, which are often difficult to assess, can be predicted from simulated driving (Reimer, D’Ambrosio, Coughlin, Kafrissen, & Biederman, 2006).
Full acceptance of simulators, however, will rely on their value as a clinical tool, and data of this nature also support their use. As would be expected, simulated driving performance is related to neuropsychological test scores such as the Trail Making Test Part A (Trails A; Wald & Liu, 2001). More important, simulators can distinguish between safe and unsafe drivers (Lee, Lee, Cameron, & Li-Tsang, 2003; Lew et al., 2005; Patomella, Tham, & Kottorp, 2006) and can be used to predict who has a greater risk of future crash involvement (Lee & Lee, 2005). In one study, the use of simulators by occupational therapists led to better agreement with medical staff than that obtained with neuropsychologists (Carroz, Comte, Nicolo, Dériaz, & Vuadens, 2008).
These results are encouraging, but further replication and extension of the findings, especially regarding the reproducibility of simulator-based evaluations, would strengthen confidence in simulators as clinical tools. In the studies reported here, we set out to expand the type of evidence supporting the validity of simulators as clinical tools. Specifically, we replicated the relationship between simulator data and neuropsychological and on-road data. We also examined the reproducibility of the simulator assessment by the same rater (intrarater reliability) and by a second independent rater (interrater reliability). Intrarater reliability is demonstrated when the same rater consistently obtains the same results when assessing the same people (assuming no important change took place between assessments); interrater reliability, which is especially important, occurs when two independent raters obtain similar results. We demonstrated adequate intra- and interrater reliability.
General Method
Participant Recruitment
Participants were recruited through various means. Undergraduate students were typically recruited through first-year psychology classes. Other drivers were recruited through advertising on campus, through newspapers, at seniors’ centers, and from previous studies we have conducted. Further details on the samples are provided in the appropriate sections. The studies were approved by the university research ethics board, and all participants provided informed consent.
Testing
The neuropsychological testing was done with paper-and-pencil tests or using computers to assess attention and perception processes. We used the Trails A and B (Corrigan & Hinkeldey, 1987) and the Useful Field of View® (UFOV; Ball & Owsley, 1993) tests. Our on-road evaluation consists of a standardized city circuit evaluated by a trained driving instructor. The outcome measure is the number of demerit points accumulated (higher is worse; see Bédard et al., 2008, for more details). The simulated drive was programmed on a STISIM model 400 (Systems Technology Incorporated, Hawthorne, CA) using a 135° field of view (on three computer monitors) to approximate the actual on-road circuit. Two measures were recorded during the simulated drive: (1) a research assistant recorded the number of demerit points (using the same instrument that was used for the on-road evaluation), and (2) the simulator automatically recorded the number of driving mistakes. Mistakes included center-line crossing, road edge excursion, failure to stop at a stop sign or red light, speeding (>2.5 mph over the speed limit), illegal turns, off-road crashes, and vehicle collisions. The sum of frequency counts for these measures is reported as the total simulator-recorded errors.
Method and Results
Study 1
Method.
Study 1 used a convenience sample (N = 15) recruited from the introductory psychology classes. All participants completed the Trails A (15 items) and B (25 items) and the driving simulation protocol, for which we computed the demerit points and the simulator-recorded errors.
Results.
Our sample consisted of 8 men (53%) and 7 women (47%). The mean age was 20.41 (SD = 2.20). Descriptive data on Trails A and B, demerit points, and simulator-recorded errors are presented in Table 1. The correlations among the variables are presented in Table 2. With the exception of the correlation between the Trails B and simulator-recorded errors (r = .10), the correlations were in the moderate to strong range, although not all reached statistical significance given the small sample size.
Table 1.
Descriptive Statistics for the Variables Recorded in Study 1 (N = 15)

Variable                     Minimum   Maximum   Mean (SD)
Time (s) for Trails A        7.0       27.0      17.30 (5.79)
Time (s) for Trails B        36.5      143.0     68.13 (27.28)
Demerit points               15        315       82.00 (80.37)
Simulator-recorded errors    1         58        17.93 (14.29)

Note. SD = standard deviation; Trails A = Trail Making Test Part A; Trails B = Trail Making Test Part B.
Table 2.
Pearson Correlation Coefficients Between the Variables From Study 1 (N = 15)

Variable                     Trails A r (p)   Trails B r (p)   Demerit Points r (p)
Trails A
Trails B                     .48 (.068)
Demerit points               .74 (.002)       .44 (.103)
Simulator-recorded errors    .60 (.018)       .10 (.717)       .83 (.001)

Note. Trails A = Trail Making Test Part A; Trails B = Trail Making Test Part B.
Study 2
Method.
This sample (n = 50) was recruited specifically to obtain a cross-section of ages. All participants completed the Trails A (15 items) and the UFOV. Thirty-eight participants were able to complete the simulated drive, for which we report the demerit points (the drive was terminated prematurely for the remaining participants because of discomfort). A subsample of participants aged ≥65 years (n = 8) also completed the on-road evaluation, which was rated by a different evaluator licensed to conduct ministry-approved on-road evaluations. Data from all 38 participants who completed the simulated drive were used to examine the reproducibility of the simulator evaluations. Specifically, we examined whether the same evaluator (Parkkari) and a second independent rater could reliably score the demerit points using the playback function of the simulator (the simulated run was played only once, and at least 1 mo elapsed between the live evaluation and the playback evaluations).
To quantify reproducibility, we used several approaches. We first examined correlations between ratings using Pearson r. However, when systematic differences exist between raters (e.g., one rater rates consistently higher or lower than the other), the Pearson correlation coefficient will be unaffected by the systematic difference and may provide an inflated indication of reproducibility. Therefore, in addition to the correlation coefficient, we calculated the intraclass correlation coefficient (ICC). The ICC is more sensitive to systematic differences between raters than Pearson correlation coefficients; ICC values of ≥.7 generally denote acceptable reproducibility (Bédard, Martin, Krueger, & Brazil, 2000). We used the approach based on a random model because our raters were considered a random sample of all possible raters. We also compared the means between raters to identify systematic differences. The assessment protocol we used provides a total demerit points score and scores on five subscales; we also examined reproducibility for those subscales to obtain a comprehensive picture.
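The difference between the two coefficients can be illustrated with a short computation. The sketch below is not our analysis code and uses simulated numbers, not study data; it implements the two-way random, single-measure, absolute-agreement ICC (often labeled ICC(2,1) after Shrout and Fleiss), and the function names are ours.

```python
# Illustration: a constant offset between two raters leaves Pearson r
# essentially untouched but lowers an absolute-agreement ICC.
import random
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def icc_2_1(r1, r2):
    """Two-way random, single-measure, absolute-agreement ICC."""
    n, k = len(r1), 2
    rows = list(zip(r1, r2))
    grand = mean(r1 + r2)
    row_means = [mean(row) for row in rows]
    col_means = [mean(r1), mean(r2)]
    bms = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)  # targets MS
    jms = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)  # raters MS
    sse = sum((rows[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    ems = sse / ((n - 1) * (k - 1))                               # residual MS
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

random.seed(1)
rater1 = [random.gauss(100, 30) for _ in range(38)]     # e.g., demerit points
rater2 = [s + 15 + random.gauss(0, 5) for s in rater1]  # +15 systematic offset

r = pearson_r(rater1, rater2)
icc = icc_2_1(rater1, rater2)
print(f"Pearson r = {r:.2f}, ICC = {icc:.2f}")  # r stays high; ICC is lower
```

Because the offset inflates the between-raters mean square, the ICC drops below the Pearson r even though the two raters rank the drives almost identically.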
Results.
A total of 21 men (42%) and 29 women (58%) participated. The participants’ ages ranged from 18 to 83. Descriptive statistics on the variables studied are presented in Table 3 (all analyses including simulator data are based on the sample of 38 participants who completed the drive). We present the correlations among the variables in Table 4. As we observed in Study 1, most correlations were in the moderate to strong range. The subset of 8 older participants (2 men and 6 women) for whom we also collected on-road evaluation results ranged in age from 67 to 81 with a mean age of 67.4 (SD = 5.3). The correlation between the on-road and simulator-based demerit points was .74 (p = .035).
Table 3.
Descriptive Statistics for the Variables Recorded in Study 2 (N = 50 for Trails A and UFOV; n = 38 for Demerit Points)

Variable          Minimum   Maximum   Mean (SD)
Trails A          7.90      55.00     23.19 (11.68)
UFOV Subtest 1    17        70        22 (13)
UFOV Subtest 2    17        500       120 (157)
UFOV Subtest 3    23        500       225 (182)
Demerit points    30        235       114 (49)

Note. UFOV = Useful Field of View test; Trails A = Trail Making Test Part A; SD = standard deviation.
Table 4.
Pearson Correlation Coefficients Between the Variables From Study 2 (N = 50 for Trails A and UFOV; n = 38 for Demerit Points)

Variable          1     2     3     4
1. Trails A
2. UFOV-1         .51
3. UFOV-2         .80   .59
4. UFOV-3         .81   .50   .82
Demerit points    .57   .61   .67   .72

Note. Trails A = Trail Making Test Part A; UFOV = Useful Field of View test. All ps < .001.
Regarding the reproducibility data, three main correlations were of interest. For the test–retest situation (Rater 1, initial live evaluation vs. playback), the correlation coefficient was .83 (p = .001). The correlation between the initial evaluation by Rater 1 and the playback evaluation by Rater 2 was .79 (p = .001). Finally, the correlation between the playback results for Raters 1 and 2 was .87 (p = .001). The corresponding ICCs were .76 (95% confidence interval [CI] = 0.58–0.87), .73 (95% CI = 0.53–0.85), and .87 (95% CI = 0.77–0.93). The lower values obtained with the ICC for the first two situations suggest a systematic difference between ratings. The t tests performed substantiated this observation (see Figure 1; the solid lines represent the regression line, the dashed lines represent perfect agreement).
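The t tests referred to above are standard paired t tests on the two sets of ratings of the same drives. The following sketch (made-up numbers, not our data) shows the computation:

```python
# Paired t test for a systematic difference between two raters' scores
# of the same drives (illustrative numbers only; df = n - 1).
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom."""
    d = [a - b for a, b in zip(x, y)]  # per-drive rating differences
    n = len(d)
    return mean(d) / (stdev(d) / sqrt(n)), n - 1

live = [110, 95, 130, 80, 120]        # hypothetical live ratings
playback = [120, 104, 138, 95, 128]   # playback ratings, systematically higher
t, df = paired_t(live, playback)
print(f"t({df}) = {t:.2f}")           # a large |t| flags a systematic offset
```

A significant t indicates that one rating occasion runs consistently higher or lower than the other, which is exactly the pattern that depresses the ICC relative to the Pearson r.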
Figure 1.
These scattergrams depict the relationship between ratings of the simulated drives.
Note. ICC = intraclass correlation coefficient.
The examination of reproducibility at the subscale level provides further information. These data suggest that a greater level of discrepancy exists at the subscale level than across total scores. The ICCs ranged from .32 to .83; only five reached the .7 threshold. Moreover, the t tests provided evidence of discrepancies between ratings (Table 5).
Table 5.
Reproducibility Statistics

Subscale and Comparison                        Pearson r (p)   ICC (95% CI)      Mean Difference   t (p)
1. Starting, stopping, and backing
  Rater 1 (live) vs. Rater 1 (playback)        .69 (<.001)     .59 (0.34-0.77)   −11.3             5.81 (<.001)
  Rater 1 (live) vs. Rater 2 (playback)        .63 (<.001)     .58 (0.32-0.76)   −6.3              3.50 (.001)
  Rater 1 (playback) vs. Rater 2 (playback)    .84 (<.001)     .83 (0.70-0.91)   5.0               3.45 (.001)
2. Signal violations, right-of-way, and inattention
  Rater 1 (live) vs. Rater 1 (playback)        .32 (.051)      .32 (0.00-0.58)   −1.7              1.07 (.293)
  Rater 1 (live) vs. Rater 2 (playback)        .66 (<.001)     .65 (0.42-0.80)   −2.1              1.69 (.099)
  Rater 1 (playback) vs. Rater 2 (playback)    .66 (<.001)     .65 (0.42-0.80)   −0.4              0.32 (.750)
3. Moving in the roadway
  Rater 1 (live) vs. Rater 1 (playback)        .66 (<.001)     .62 (0.38-0.78)   0.8               0.27 (.786)
  Rater 1 (live) vs. Rater 2 (playback)        .55 (<.001)     .55 (0.28-0.74)   4.1               1.48 (.148)
  Rater 1 (playback) vs. Rater 2 (playback)    .82 (<.001)     .80 (0.65-0.89)   3.3               1.51 (.141)
4. Passing and speed
  Rater 1 (live) vs. Rater 1 (playback)        .65 (<.001)     .63 (0.39-0.79)   −13.2             4.02 (<.001)
  Rater 1 (live) vs. Rater 2 (playback)        .78 (<.001)     .73 (0.54-0.85)   −3.7              1.22 (.230)
  Rater 1 (playback) vs. Rater 2 (playback)    .72 (<.001)     .71 (0.51-0.84)   9.5               2.78 (.009)
5. Turning
  Rater 1 (live) vs. Rater 1 (playback)        .64 (<.001)     .60 (0.35-0.77)   −20.9             6.99 (<.001)
  Rater 1 (live) vs. Rater 2 (playback)        .45 (.051)      .41 (0.11-0.65)   −22.0             5.91 (<.001)
  Rater 1 (playback) vs. Rater 2 (playback)    .71 (<.001)     .71 (0.50-0.84)   −1.0              0.35 (.730)

Note. ICC = intraclass correlation coefficient; CI = confidence interval.
Discussion
The studies reported here replicate prior findings (e.g., Lee et al., 2004; Wald & Liu, 2001) and add to the literature in several ways. First, the relationship between the simulator data and neuropsychological tests was as expected. Specifically, a moderate to strong relationship exists between the performance assessed on the simulator and neuropsychological tests that are known to predict safe driving and crashes. Second, we demonstrated that assessment of driving performance on the simulator (based on demerit points) is related to the number of errors recorded by the simulator. Third, we showed that demerit points recorded during an on-road evaluation are related to demerit points recorded during a simulated drive. Finally, the number of demerit points reported can be reproduced by the same evaluator, or a different one, using the playback function of the simulator.
Our results, combined with those of others (e.g., Carroz et al., 2008), have several clinical implications. First, the results increase our confidence in the simulator as a tool that approximates driving performance as indexed with traditional approaches such as neuropsychological testing and on-road evaluations. This finding suggests, at the very least, that simulator data, whether based on an evaluator’s rating or simulator-recorded measures, can be added to other approaches to provide a comprehensive picture of someone’s driving performance. Note, however, that simulator-recorded errors do not at this time weight the severity of the errors and that preventive behaviors (e.g., scanning properly at intersections) are not captured.
The demonstration that we can reproduce the results of simulator evaluations using the playback function also has clinical implications. First, it shows that a rater does not have to be present during the actual drive to provide an accurate assessment. The test drive could be completed by the participant alone, possibly enhancing the ecological validity of the situation, and the recorded drive could be evaluated at a later time, even in a different location if it can be forwarded electronically. Second, the reproducibility of the evaluation by a second, independent rater confirms that the approach meets another important psychometric property required for the clinical determination of driving performance. Although we spent minimal time training the raters, the results confirm that independent observers can reach convergent results by observing the replay of a simulated drive. This finding is especially important in light of the variable agreement obtained when we examined the subscales: although the overall results are indicative of good agreement, better concordance at the subscale level could improve agreement further. Future work on interrater reproducibility should therefore include better calibration of evaluators through the development of standardized training and evaluation protocols.
The findings also raise questions regarding the on-road evaluation as the “gold standard.” It is reasonable to postulate that complex simulated protocols could replace on-road evaluations. This proposition has many advantages. First, the safety inherent in the simulated environment provides advantages from an injury prevention perspective and from a comprehensive testing point of view. Both participants and raters are safer in a simulated setting than on the road. Moreover, the simulator allows testing in difficult situations (e.g., low visibility) and where emergency or avoidance maneuvers may be required. This environment would further enhance the results of driver testing by documenting performance in challenging situations, and some researchers have suggested that such situations should be included in typical comprehensive testing for older drivers (Bieliauskas, 2005).
A second advantage relates to the logistical and economical aspects of driver evaluation. On-road testing is potentially more difficult to set up and may, over time, represent a more costly approach. With increasingly affordable simulators becoming available and the rising costs of fuel, car purchase and maintenance, and insurance premiums, simulators may represent a cost-effective approach to on-road driver testing.
Replacing the on-road evaluation with simulated drives also has disadvantages. The issue of ecological validity will remain at the forefront no matter how realistic simulators become. Moreover, the inability of some participants to complete simulated drives because of discomfort raises concerns, and our study was no exception: 25% of participants did not complete the simulated drive. It is unclear whether this problem can be fully resolved, but further research will certainly help minimize the proportion of people who cannot successfully complete a simulated drive.
Our study adds to many encouraging reports available in the literature. Although we were limited to a convenience sample and had evaluators with only limited training, we believe that our data support the use of simulators for clinical assessment of drivers. Further research to elaborate on this potential, especially with clinical populations typically assessed by occupational therapists, would assist in ensuring that our findings can be generalized to different populations of drivers. More structured training and testing protocols are needed and are a must if we are to translate the research into the clinical arena.
Acknowledgments
We thank all the participants for their invaluable contribution to this work. Funding was provided through research grants from AUTO21-Network of Centres of Excellence, CanDRIVE (a New Emerging Team funded by the Canadian Institutes of Health Research, Institute of Aging), and the Thunder Bay Foundation. Michel Bédard is a Canada Research Chair in Aging and Health (www.chairs.gc.ca); he acknowledges the support of the program.
References
Ball, K., & Owsley, C. (1993). The Useful Field of View test: A new technique for evaluating age-related declines in visual function. Journal of the American Optometric Association, 64, 71–79.
Bédard, M., Martin, N. J., Krueger, P., & Brazil, K. (2000). Assessing reproducibility of data obtained with instruments based on continuous measurements. Experimental Aging Research, 26, 353–365.
Bédard, M., Porter, M. M., Marshall, S., Isherwood, I., Riendeau, J., Weaver, B., et al. (2008). The combination of two training approaches to improve older adults’ driving safety. Traffic Injury Prevention, 9, 70–76.
Bella, F. (2008). Driving simulator for speed research on two-lane rural roads. Accident Analysis and Prevention, 40, 1078–1087.
Bieliauskas, L. A. (2005). Neuropsychological assessment of geriatric driving competence. Brain Injury, 19, 221–226.
Carroz, A., Comte, P. A., Nicolo, D., Dériaz, O., & Vuadens, P. (2008). Intérêt du simulateur de conduite pour la reprise de la conduite automobile en situation de handicap [Relevance of a driving simulator in the assessment of handicapped individuals]. Annales de Readaptation et de Medecine Physique, 51, 358–365.
Corrigan, J. D., & Hinkeldey, N. S. (1987). Relationships between parts A and B of the Trail Making Test. Journal of Clinical Psychology, 43, 402–409.
Godley, S. T., Triggs, T. J., & Fildes, B. (2002). Driving simulator validation for speed research. Accident Analysis and Prevention, 34, 589–600.
Lee, H. C., Cameron, D., & Lee, A. H. (2004). Assessing the driving performance of older adult drivers: On-road versus simulated driving. Accident Analysis and Prevention, 35, 797–803.
Lee, H. C., & Lee, A. H. (2005). Identifying older drivers at risk of traffic violations by using a driving simulator: A 3-year longitudinal study. American Journal of Occupational Therapy, 59, 97–100.
Lee, H. C., Lee, A. H., Cameron, D., & Li-Tsang, C. (2003). Using a driving simulator to identify older drivers at inflated risk of motor vehicle crashes. Journal of Safety Research, 34, 453–459.
Lew, H. L., Poole, J. H., Lee, E. H., Jaffe, D. L., Huang, H. C., & Brodd, E. (2005). Predictive validity of driving-simulator assessments following traumatic brain injury: A preliminary study. Brain Injury, 19, 177–188.
Patomella, A.-H., Tham, K., & Kottorp, A. (2006). P-drive: Assessment of driving performance after stroke. Journal of Rehabilitation Medicine, 38, 273–279.
Reimer, B., D’Ambrosio, L. A., Coughlin, J. E., Kafrissen, M. E., & Biederman, J. (2006). Using self-reported data to assess the validity of driving simulation data. Behavior Research Methods, 38, 314–324.
Törnros, J. (1998). Driving behavior in a real and a simulated road tunnel—A validation study. Accident Analysis and Prevention, 30, 497–503.
Wald, J., & Liu, L. (2001). Psychometric properties of the driVR: A virtual reality driving assessment. Studies in Health Technology and Informatics, 81, 564–566.
Yan, X., Abdel-Aty, M., Radwan, E., Wang, X., & Chilakapati, P. (2008). Validating a driving simulator using surrogate safety measures. Accident Analysis and Prevention, 40, 274–288.
Figure 1.
These scattergrams depict the relationship between ratings of the simulated drives.
Note. ICC = intraclass correlation coefficient.
Table 1.
Descriptive Statistics for the Variables Recorded in Study 1 (N = 15)

Variable                  | Minimum | Maximum | Mean (SD)
Time (s) for Trails A     | 7.0     | 27.0    | 17.30 (5.79)
Time (s) for Trails B     | 36.5    | 143.0   | 68.13 (27.28)
Demerit points            | 15      | 315     | 82.00 (80.37)
Simulator-recorded errors | 1       | 58      | 17.93 (14.29)

Note. SD = standard deviation; Trails A = Trail Making Test Part A; Trails B = Trail Making Test Part B.
Table 2.
Pearson Correlation Coefficients Between the Variables From Study 1 (N = 15)

Variable                  | Trails A r (p) | Trails B r (p) | Demerit Points r (p)
Trails A                  | —              |                |
Trails B                  | .48 (.068)     | —              |
Demerit points            | .74 (.002)     | .44 (.103)     | —
Simulator-recorded errors | .60 (.018)     | .10 (.717)     | .83 (.001)

Note. Trails A = Trail Making Test Part A; Trails B = Trail Making Test Part B.
Table 3.
Descriptive Statistics for the Variables Recorded in Study 2 (N = 50 for Trails A and UFOV; n = 38 for Demerit Points)

Variable       | Minimum | Maximum | Mean (SD)
Trails A       | 7.90    | 55.00   | 23.19 (11.68)
UFOV Subtest 1 | 17      | 70      | 22 (13)
UFOV Subtest 2 | 17      | 500     | 120 (157)
UFOV Subtest 3 | 23      | 500     | 225 (182)
Demerit points | 30      | 235     | 114 (49)

Note. UFOV = Useful Field of View test; Trails A = Trail Making Test Part A; SD = standard deviation.
Table 4.
Pearson Correlation Coefficients Between the Variables From Study 2 (N = 50 for Trails A and UFOV; N = 38 for Demerit Points)

Variable          | 1   | 2   | 3   | 4
1. Trails A       | —   |     |     |
2. UFOV-1         | .51 | —   |     |
3. UFOV-2         | .80 | .59 | —   |
4. UFOV-3         | .81 | .50 | .82 | —
5. Demerit points | .57 | .61 | .67 | .72

Note. Trails A = Trail Making Test Part A; UFOV = Useful Field of View test. All ps < .001.
Table 5.
Reproducibility Statistics

Subscale                                   | Pearson r (p) | ICC (95% CI)    | Mean Difference | t-Test Result (p)
1. Starting, stopping, and backing
 Rater 1 (live) vs. Rater 1 (playback)     | .69 (<.001)   | .59 (0.34–0.77) | −11.3           | 5.81 (<.001)
 Rater 1 (live) vs. Rater 2 (playback)     | .63 (<.001)   | .58 (0.32–0.76) | −6.3            | 3.50 (.001)
 Rater 1 (playback) vs. Rater 2 (playback) | .84 (<.001)   | .83 (0.70–0.91) | 5.0             | 3.45 (.001)
2. Signal violations, right-of-way, and inattention
 Rater 1 (live) vs. Rater 1 (playback)     | .32 (.051)    | .32 (0.00–0.58) | −1.7            | 1.07 (.293)
 Rater 1 (live) vs. Rater 2 (playback)     | .66 (<.001)   | .65 (0.42–0.80) | −2.1            | 1.69 (.099)
 Rater 1 (playback) vs. Rater 2 (playback) | .66 (<.001)   | .65 (0.42–0.80) | −0.4            | 0.32 (.750)
3. Moving in the roadway
 Rater 1 (live) vs. Rater 1 (playback)     | .66 (<.001)   | .62 (0.38–0.78) | 0.8             | 0.27 (.786)
 Rater 1 (live) vs. Rater 2 (playback)     | .55 (<.001)   | .55 (0.28–0.74) | 4.1             | 1.48 (.148)
 Rater 1 (playback) vs. Rater 2 (playback) | .82 (<.001)   | .80 (0.65–0.89) | 3.3             | 1.51 (.141)
4. Passing and speed
 Rater 1 (live) vs. Rater 1 (playback)     | .65 (<.001)   | .63 (0.39–0.79) | −13.2           | 4.02 (<.001)
 Rater 1 (live) vs. Rater 2 (playback)     | .78 (<.001)   | .73 (0.54–0.85) | −3.7            | 1.22 (.230)
 Rater 1 (playback) vs. Rater 2 (playback) | .72 (<.001)   | .71 (0.51–0.84) | 9.5             | 2.78 (.009)
5. Turning
 Rater 1 (live) vs. Rater 1 (playback)     | .64 (<.001)   | .60 (0.35–0.77) | −20.9           | 6.99 (<.001)
 Rater 1 (live) vs. Rater 2 (playback)     | .45 (.051)    | .41 (0.11–0.65) | −22.0           | 5.91 (<.001)
 Rater 1 (playback) vs. Rater 2 (playback) | .71 (<.001)   | .71 (0.50–0.84) | −1.0            | 0.35 (.730)

Note. ICC = intraclass correlation coefficient; CI = confidence interval.