Paddy O'Donnell &
Stephen W. Draper
GIST (Glasgow Interactive Systems cenTre)
Department of Psychology
University of Glasgow
Glasgow G12 8QQ U.K.
email: steve@psy.gla.ac.uk
Keywords: time, system delay, user strategies
More recently Teal & Rudnicky (1992) offered another view: instead of focussing on annoyance and how it depends on duration or unpredictability, they suggested that delays can change user strategies, i.e. the procedures users select for a task. Users do not necessarily sit passively, if impatiently, waiting for the interface to accept the next command, but organise their behaviour around delays, trying to anticipate the machine's behaviour. Teal & Rudnicky reported two experiments whose results they interpreted as showing that users had a choice of three strategies for the task, and that the strategy selected depended simply and directly on the magnitude of machine delay relative to an individual's typical reaction time. We set out to replicate and extend Teal & Rudnicky's findings, but encountered more mixed results suggesting that other issues also affect a user's choice of strategy.
Three user strategies (i.e. methods) for this task appear. The first is the "automatic" strategy suitable for negligible delays, where the user enters numbers as fast as they can copy them, without checking the screen for the prompt. The third is the "monitoring" strategy suitable for very long delays, where the user enters a number, reads the next one, looks at the screen until the prompt appears, then enters the next number. Between these is the "pacing" strategy, where the user does not use the perceptual cue of the prompt but instead estimates the delay needed before typing the next number. This can be faster for some delay lengths because, while avoiding (most) errors from premature typing, it saves moving the head and eyes to look at the screen for the prompt, and saves the time taken to react to seeing the prompt i.e. the perceptual and other processing causing a response latency of at least that of simple reaction time. If users are good enough at estimating they can in principle cut this wasted time gap to zero.
The variables used to show changes in user strategy are the initial keystroke latencies and anticipation errors. Subsequent keystroke latencies and other keying errors are also measured.
Initial Keystroke Latency (IKL): The time that elapses between when
the computer signals it is ready for the next input and when the user actually
strikes the first number key of the number being entered (i.e. the user
delay).
Anticipation Errors (AE): When the user attempts to enter a keystroke
prior to the system becoming available for the next user input.
Subsequent Keystroke Latency: The inter-keystroke times for all user
inputs other than the initial keystrokes.
Keying Errors: All performance errors other than anticipation errors
(e.g. transcription errors).
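The timing measures defined above can be sketched in code. This is a minimal illustration, not the software used in either study: the Trial record, its field names, and the assumption that key press times are logged in seconds and in order are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    """Hypothetical log record for one number entry (times in seconds)."""
    ready_time: float        # when the computer signalled it was ready
    key_times: List[float]   # every key press for this number, in order


def is_anticipation_error(trial: Trial) -> bool:
    """AE: at least one key press came before the system was ready."""
    return any(t < trial.ready_time for t in trial.key_times)


def initial_keystroke_latency(trial: Trial) -> Optional[float]:
    """IKL: ready signal to the first key press made once the system is ready."""
    accepted = [t for t in trial.key_times if t >= trial.ready_time]
    return accepted[0] - trial.ready_time if accepted else None


def subsequent_keystroke_latencies(trial: Trial) -> List[float]:
    """Inter-keystroke times for all accepted presses after the initial one."""
    accepted = [t for t in trial.key_times if t >= trial.ready_time]
    return [later - earlier for earlier, later in zip(accepted, accepted[1:])]
```

For example, a trial with ready_time 1.0 and key presses at 1.4, 1.6 and 1.9 has an IKL of about 0.4 s and no anticipation error, whereas a first key press at 0.9 would count as an anticipation error.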
Teal & Rudnicky argued that in their two experiments, users moved between
the strategies as a function of machine delay, selecting the strategy which is
most efficient. Their results supported the appearance of the three different
strategies and showed the approximate machine delay ranges in which they
appeared:
Automatic strategy: 0.00 - 0.625 secs
Pacing strategy: 0.625 - 2.25 secs
Monitoring strategy: > 2.25 secs.
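As a reading aid, the approximate ranges above can be written as a lookup. This is only a sketch: the function and constant names are hypothetical, and because the listed ranges share their end points, assigning a boundary value to the lower band is an arbitrary choice made here.

```python
AUTOMATIC_MAX_S = 0.625  # upper end of the automatic range, in seconds
PACING_MAX_S = 2.25      # upper end of the pacing range, in seconds


def expected_strategy(machine_delay_s: float) -> str:
    """Return the strategy band that a given machine delay falls into."""
    if machine_delay_s < 0:
        raise ValueError("machine delay cannot be negative")
    if machine_delay_s <= AUTOMATIC_MAX_S:
        return "automatic"
    if machine_delay_s <= PACING_MAX_S:
        return "pacing"
    return "monitoring"
```

These thresholds are group-level approximations; they say nothing about where an individual subject switches.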
In fact, as they argue, the optimum switchover point should (and apparently does) depend on the individual subject's speed of reaction. It is only worth pacing when the machine delay is longer than the user's delay: in particular, than the user's natural time between pressing <return> on the previous number group and pressing the first digit of the next number group. (This time is presumably consumed by reading the next number, deciding on the motor action, and actually pressing the key; less any time saved by overlap e.g. reading the next number while keying the previous group.) This user characteristic k can be measured as the subject's modal IKL on a block with zero machine delay i.e. their reaction time in this task when machine delays are removed.
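A modal value for a continuous measure such as IKL requires some form of binning. The sketch below shows one way k might be estimated from a zero-delay block; the 50 ms bin width, the tie-breaking rule, and the function name are assumptions made for illustration, not details reported by Teal & Rudnicky or used here.

```python
from collections import Counter
from typing import Sequence


def modal_ikl(ikls_s: Sequence[float], bin_ms: int = 50) -> float:
    """Estimate the modal IKL (in seconds) as the centre of the fullest bin.

    Latencies are rounded to whole milliseconds before binning so that
    values on a bin edge are placed consistently. Ties go to the
    shortest-latency bin (an arbitrary choice).
    """
    if not ikls_s:
        raise ValueError("need at least one IKL")
    counts = Counter(round(x * 1000) // bin_ms for x in ikls_s)
    top = max(counts.values())
    modal_bin = min(b for b, c in counts.items() if c == top)
    return (modal_bin * bin_ms + bin_ms / 2) / 1000
```

With k estimated in this way, a machine delay d can be expressed relative to the individual as d - k, which is the form used for the delay regions reported below.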
As mentioned, each block consisted of 40 trials (one 3 digit number per trial) with a constant machine delay. The first 15 trials were discarded to allow users to settle on a strategy (they had no other way of discovering the delay within a block), and the last 25 trials were used as data for analysis. This was based on Teal & Rudnicky's experience that 15 trials was entirely adequate for stabilisation. When subjects made an anticipation error, the associated IKL was removed from the totals used for calculating means and standard deviations for IKLs. (These were rare enough for this not to threaten sample size.)
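The per-block data reduction just described can be sketched as follows. The function name and the parallel-list input format are hypothetical, and computing the AE rate over the 25 analysed trials (rather than all 40) is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, Optional, Sequence

SETTLING_TRIALS = 15  # initial trials discarded in each block


def summarise_block(ikls: Sequence[float],
                    anticipation_errors: Sequence[bool]) -> Dict[str, Optional[float]]:
    """Summarise one block from parallel per-trial lists."""
    if len(ikls) != len(anticipation_errors):
        raise ValueError("inputs must have one entry per trial")
    analysed = list(zip(ikls, anticipation_errors))[SETTLING_TRIALS:]
    if not analysed:
        raise ValueError("no trials left after discarding the settling trials")
    # IKLs from trials with an anticipation error are left out of the totals.
    kept = [ikl for ikl, ae in analysed if not ae]
    return {
        "n_analysed": len(analysed),
        "ae_rate": sum(1 for _, ae in analysed if ae) / len(analysed),
        "mean_ikl": mean(kept) if kept else None,
        "sd_ikl": stdev(kept) if len(kept) >= 2 else None,
    }
```

For a 40-trial block this leaves 25 analysed trials, with the mean and standard deviation taken over however many of those were free of anticipation errors.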
Figure 1 shows the modal IKLs and the AE rates obtained in each delay condition, comparing our results with Teal & Rudnicky's. The graph is marked for the interpretation of which strategy dominated in each band of delays. The peak was thought to correspond to a transition region with no uniform strategy between automatic and pacing. Our data is clearly not very similar to theirs, although the supposed pacing region does correspond quite well.
Figure 1: Our results plotted with those from Teal & Rudnicky (1992).
Statistically, evidence was sought for distinct regions, beginning with a one-way ANOVA comparing all 12 delay conditions with each other. Unlike Teal & Rudnicky, we failed to find any significant differences for the IKL measure, although we did for AEs (F[11, 121] = 2.68, p < 0.05). A Tukey HSD was therefore carried out for the AE rate to find any significant differences between delay regions. This produced two regions -- region 1 [k-0.25, k+0.25] and region 2 [k+0.25, k+2.25] at p < 0.05.
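For readers who want to see how an F ratio of this form is computed, the sketch below implements a one-way ANOVA with subjects as a repeated factor. Whether this matches the exact procedure used is an assumption: the degrees of freedom (11, 121) are what such a design gives for 12 conditions and 12 subjects, since (12 - 1) x (12 - 1) = 121, but the analysis software is not described here. The function name and input layout are hypothetical.

```python
from typing import List, Tuple


def repeated_measures_f(scores: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated-measures ANOVA.

    scores[i][j] is the measure (e.g. AE rate) for subject i in condition j.
    Returns (F, df_conditions, df_error). Raises ZeroDivisionError if the
    error sum of squares is exactly zero.
    """
    n_subj = len(scores)
    n_cond = len(scores[0]) if scores else 0
    if n_subj < 2 or n_cond < 2 or any(len(row) != n_cond for row in scores):
        raise ValueError("need a full subjects x conditions table (at least 2 x 2)")
    grand = sum(sum(row) for row in scores) / (n_subj * n_cond)
    cond_means = [sum(row[j] for row in scores) / n_subj for j in range(n_cond)]
    subj_means = [sum(row) / n_cond for row in scores]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n_subj * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = n_cond * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj  # residual after removing both effects
    df_cond = n_cond - 1
    df_error = (n_cond - 1) * (n_subj - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error
```

The sketch stops at the F ratio; obtaining a p value or running Tukey's HSD would need a statistics package.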
For IKLs, Tukey's HSD could not be carried out because the sample sizes were unequal (owing to the exclusion of entries on which an anticipation error was made), so a planned comparisons procedure was used instead. All delay conditions were compared separately and then grouped, which identified the significant regions. For IKL two regions were produced -- region 1 [k-0.25, k+1.00] and region 2 [k+1.25, k+2.50] at p < 0.05. The next region identified for the anticipation error rate is k+0.25 to k+2.25. Within this region, initial keystroke latencies and initial keystroke latency standard deviations have overlapping regions, [k+1.25, k+2.50] and [k+1.00, k+1.75] respectively. In this region of overlap all the variables remain relatively low and stable, showing the use of a pacing strategy.
Why the differences, and what does this say about the main issue of user selection of alternative strategies as machine delays vary? The first point is that when we recreated the experiment it became introspectively obvious to all who tried it out that there are indeed three strategies. This is much clearer from introspection than from reading accounts of the experiment. It is much less clear, however, how and when users select or switch a strategy. One issue then is the attempt to use the traditional measures of time and error to describe and examine these strategies. It does not seem possible to ask users to report their strategies systematically on each trial, as this would take a long time compared to the task, and since the task is meant to be highly practiced, continuous verbal reporting is very likely to interfere with it.
The next point is that, even on the Teal & Rudnicky account, there is not in fact always a stable fixed strategy. Their interpretation of the large peak of errors and time is that this is the transition region where users oscillate between automatic and pacing strategies, and subjects' verbal reports back this up. In fact strategies may seldom stay constant throughout a block, let alone across all subjects for a given block. For instance, whenever an error is committed the subject is more or less bound to look at the screen while they correct the error, and will probably continue to look at the screen frequently for the next trial or two: but this in effect means they use the monitoring strategy during the recovery period, and will probably show markedly slower IKL times for a few trials. A promising direction for future work, then, is to develop an analysis that can identify and isolate such sequences within blocks.
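One simple form such an analysis could take is to separate the trials that immediately follow an error from the rest of the block. The sketch below is only an illustration of the idea: the window of two trials is taken from the "next trial or two" suggested above, and the function name and input format are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_recovery_trials(ikls: Sequence[float],
                          errors: Sequence[bool],
                          window: int = 2) -> Tuple[List[float], List[float]]:
    """Split a block's IKLs into (steady, recovery) lists.

    A trial counts as 'recovery' if it is one of the `window` error-free
    trials that follow an error. Error trials themselves go in neither list.
    """
    steady: List[float] = []
    recovery: List[float] = []
    remaining = 0  # recovery trials still expected after the last error
    for ikl, had_error in zip(ikls, errors):
        if had_error:
            remaining = window
        elif remaining > 0:
            recovery.append(ikl)
            remaining -= 1
        else:
            steady.append(ikl)
    return steady, recovery
```

Comparing the means of the two lists would give a first indication of whether post-error trials are slower, as suggested above.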
Errors are particularly important in the pacing strategy, where subjects are estimating how long to pause before typing without checking that the prompt has appeared. If subjects pause too long they waste time and make the strategy less worthwhile, but if they pause too little they commit an error. Errors incur time penalties, which both motivate the subject to be more cautious in future and distort the summary measures. Caution may simply make subjects increase their estimate of the pause duration needed, but it may also make them switch strategy (to monitoring), either for a few trials or permanently.
This issue was probably more important to us than to the original studies, because our setup imposed a greater penalty on users for each error. In Teal & Rudnicky's experiment, subjects were instructed to avoid errors but did not have to correct an incorrect number once it had been entered. In our replication the computer would not move on to the next number until the previous number had been entered correctly (which corresponds better to normal use). Therefore, besides getting an error signal, a subject has the additional trouble of stopping and correcting the wrong entry before moving on to the next number. Presumably, then, subjects will try harder not to make errors, leading to the generally lower AE rate we obtained. This can be done by maintaining higher IKLs, and our results also show that the replication generally has higher IKLs than Teal & Rudnicky's experiment.
Similarly, we believe that our failure to observe clear evidence of the automatic strategy was largely due to significant subject fatigue, leading in turn to a kind of strategy switch within the block. A pure execution of the automatic strategy would mean neither pausing nor glancing at the screen throughout the block. In fact, more natural and typical behaviour is to pause at times for various reasons: to rest, adjust posture, look twice at the copy sheet, or check the screen for any evidence of errors. If this anomalous pause time is simply added into the IKL means, simple analyses will be misled. The high variances we observed are consistent with this view.
G.L.Dannenbring (1983) "The effect of computer response time on user performance and satisfaction: a preliminary investigation" Behavior Research Methods & Instrumentation vol.15 no.2 pp.213-216.
J.Gibbon (1977) "Scalar expectancy theory and Weber's law in animal timing" Psychological Review vol.84 pp.279-335.
G.N.Lambert (1984) "A comparative study of system response time on program developer productivity" IBM Systems Journal vol.23 no.1 pp.36-43.
G.L.Martin & K.G.Corl (1986) "System response time effects on user productivity" Behaviour and Information Technology vol.5 no.1 pp.3-13.
A.Rushinek & S.F.Rushinek (1986) "What makes users happy?" CACM vol.29 no.7 pp.594-598.
S.L.Teal & A.I.Rudnicky (1992) "A performance model of system delay and user strategy selection" Proc. CHI '92 pp.295-305.