Last changed: 27 Nov 1997. Length: about 2,000 words (12,000 bytes).
This is a WWW document by Steve Draper, installed at http://www.psy.gla.ac.uk/~steve/task.html.
You may copy it. How to refer to it:
Draper, S.W. (1993) "The notion of task in HCI" pp.207-208 Interchi'93
Adjunct proceedings (eds.) S.Ashlund, K.Mullet, A.Henderson,
E.Hollnagel, T.White (ACM)
The notion of Task in HCI
Stephen W. Draper
GIST (Glasgow Interactive Systems cenTre)
University of Glasgow, Glasgow, U.K.
steve@psy.glasgow.ac.uk
The ISO definition of the usability of an interface is "the effectiveness,
efficiency, and satisfaction with which specified users can achieve specified
goals in a particular environment". This at first seems pessimistic to many
people, as it implies that there may be no generalisation across users or
machines or tasks: that measuring how one combination performs may not tell us
anything about how others will perform. But is it pessimistic enough? It
expresses what many HCI workers assume: that just as it is clear what a "user"
is (distinct users can be identified by their bodies; if it is the same person
then it is the same user), so a "task" is the same thing to all people in all
circumstances. This paper points out that this is not true, examines the
extent to which it is a problem, and shows how it threatens the standard
practices of both psychologists and designers in HCI.
Consider, then, setting a standard HCI task. For instance, we might ask a user
to prepare a business letter on a word processor with some given text content.
However, the layout chosen (and so the bulk of the features and commands used)
will probably depend on whether that user is a secretary or an undergraduate
hired for an experiment. Even if, instead of just saying "prepare a business
letter", we give them a printout of the target letter to be achieved, different
kinds of people will simply notice or not notice different aspects of the
layout. (We have experienced this as a real problem in conducting HCI
experiments.) For instance, few will think that the exact size of the right
margin to the nearest millimetre actually matters, and many will just not see
it as any part of the task, any more than matching the watermark in the paper
used for printing would be. But this is not just about whether undergraduates
are representative of "real users". What aspects of layout a secretary cares
about will depend on the job: probably on the individual he or she is working
for, and certainly on the practices developed in a particular office. Thus
"preparing a business letter" as it affects the use of a word processor (and
therefore which commands are used and important) simply is not determined by a
fixed task domain which can be observed, recorded, and generalised about.
If the notion of task is so unstable and indeterminate in small-scale word
processing (surely the simplest and most overstudied case in HCI), then what hope
hope is there of using it as a theoretical concept, as a component of standards
definitions, or as the basis of design methods using "task analysis"? If we
consider asking someone to prepare a diagram to illustrate a paper, then we can
be sure that no two people will produce the same graphic, even if they have
similar backgrounds and are made to use the same computer package. Moving
upwards to larger scale work domains, the influence of communities of practice
is likely to be even more important. For instance, consider indentation and
commenting practices among programmers.
On the other hand, common sense and experience tell us that a notion of task is
of constant practical help in HCI design. Everyone needs to save their files
to disk or to delete a word, and you can look at the problems some designs
observably cause users, and at how changing the design removes or changes those
problems. These simple considerations suggest that perhaps "task" is a simple
concept at the level of a command, and that problems occur mainly at higher levels.
This would be consistent with the idea that an artifact (re)defines the task
facing a user, so that task analysis at this level may work. We could then
expect to go along with criticisms of task analysis (Bannon & Bødker
1991) as they apply to whole tasks like typing a letter, while retaining tasks
as mental goals at a lower level. Thus top-down task analysis may be largely
useless except as a way of reproducing old ways of working (by reproducing in a
new implementation the same tasks, probably many of them bad). This criticism
would then undermine top-down analyses, but not all uses of the task concept.
This is probably broadly right; however some considerations suggest things are
not as simple as this.
In psychology, the variable nature of a "task" has been studied for some time
under the name of "demand characteristics". As a recent textbook puts it,
people will do far more for an experimenter than they would in other
situations: simply consider asking a friend to do as many pushups as they can
(a) as part of a psychological or medical experiment, versus (b) just because
you would like to watch them. Clearly, the real task subjects are engaged in is "doing an
experiment", and their willingness usually strongly depends on this: the worry
is that other aspects of their behaviour will too. This serves as a sharp
reminder that simply handing each subject a slip of paper with a short
description of "the task" guarantees neither that every subject will interpret
it in the same way, nor that they will treat it as they would if it arose in
another context. In usability experiments this may not
matter much, because the interfaces we study today often have such severe bugs
that simply trying 10 times as hard as normal does not allow subjects to
succeed. It may however matter rather more for studies trying to observe
naturally occurring "tasks".
Even when tasks (goals) are fixed for a user, it seems that the methods they
naturally develop are rather variable. This was Allen & Scerbo's (1983)
finding: that GOMS predictions did not match experimental findings unless not
only the task but the exact method was (most unnaturally) dictated to subjects.
Thus a barrier to predicting user execution times is that users do not adopt
the methods predicted by designers for a task. Suchman's (1987) arguments too
are largely that, even if you study cases where tasks are relatively clear and
fixed, humans do not generate and follow fixed plans of the kind expected by
naive theory. It would seem then that we cannot expect fixed and predictable
behaviour from human users even at quite "low" levels. Therefore whenever the
device allows any variation in method, task analyses are not likely to work at
low levels. Furthermore, if even methods are normally rather variable, this
must increase our expectation that higher-level tasks will be variable and
unpredictable.
Task analysis as commonly applied to design in fact often misses real tasks
faced by users. For each command the designer is typically thinking of
supporting only one "task", yet up to four are commonly at issue:
- Performing the function: e.g. a user wants to move a sentence, and this might
be supported by a sequence of the Cut and Paste commands.
- Verifying success: almost always, a user does not just want to achieve a
goal, but to know that they have. (Hence, in older interfaces, the prevalence
of issuing information commands such as ls in Unix after nearly every command.)
After moving the sentence, the user will, in a WYSIWYG editor, have to read it
carefully in its new context to make sure nothing is missing. (This could be
better supported, e.g. by leaving newly changed text highlighted in some way;
see the sketch after this list.)
- Discovering how to perform the function: the first time users need to perform
that function, they must somehow discover the method, a knowledge-getting task.
For a function directly supported by a single command, the command name
appearing in a menu may be sufficient. For the example of moving text by cut
and paste, many current interfaces offer no support except documentation.
- Discovering what a command does: given the visible presence of the Cut and
Paste commands, users may wish to discover their function (learning by
exploration). This requires visible and comprehensible effects (usually not
the case for implementations that use hidden clipboards).
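
To make the verification point concrete, here is a minimal sketch in Python
(purely illustrative: the TrackedBuffer class and all its names are hypothetical,
not taken from any real editor) of a buffer that remembers the span touched by
the most recent edit, so that a display layer could highlight newly changed
text instead of leaving the user to re-read it:

    # Hypothetical sketch: an editor buffer that records the span touched by
    # the last edit, so the display layer can highlight newly changed text
    # and support the user's verification task.

    class TrackedBuffer:
        def __init__(self, text=""):
            self.text = text
            self.last_change = None  # (start, end) of the most recent edit

        def cut(self, start, end):
            # Remove text[start:end] and return it (the "clipboard" content).
            removed = self.text[start:end]
            self.text = self.text[:start] + self.text[end:]
            self.last_change = (start, start)  # record the deletion point
            return removed

        def paste(self, pos, fragment):
            # Insert fragment at pos and record the inserted span.
            self.text = self.text[:pos] + fragment + self.text[pos:]
            self.last_change = (pos, pos + len(fragment))

        def highlighted(self):
            # Return the text with the last-changed span marked for display.
            if self.last_change is None:
                return self.text
            s, e = self.last_change
            return self.text[:s] + "[" + self.text[s:e] + "]" + self.text[e:]

    buf = TrackedBuffer("Second sentence. First sentence. ")
    moved = buf.cut(0, 17)           # cut "Second sentence. "
    buf.paste(len(buf.text), moved)  # paste it at the end
    print(buf.highlighted())         # First sentence. [Second sentence. ]

The design choice is simply that the buffer, not the user, carries the burden
of remembering what just changed.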
Another class of "task" that is well known in practice but does not fit well
with the usual notion of task is that of defence costs. In many applications
on personal computers users find it wise to save their work frequently to disk
as crashes lose all work since the last save. Saving is not a natural user
task at all, except perhaps when making a spare copy to give to someone else
(with paper, you don't have to first write, and then "fix" the trace like a
photograph being developed). When it is analysed as a task, it is usually
analysed as something users "want" to do when quitting the application, and is
often integrated with exit commands. However, rather fewer designers show that
they have grasped the real issue, which is to save frequently, not in order to
achieve anything definite, but as insurance. This kind of goal should be
described as a defence cost, since you either pay a small, frequent
"insurance" cost (e.g. saving to disk) which normally turns out to be useless
(it is superseded by the next save), or very rarely pay a huge penalty. By
their nature, defence costs do not achieve a direct benefit for the user, and
they are not part of a hierarchical goal structure. Because of this they do
not fit into the simpler forms of task analysis, and conversely users need to
be reminded to pay them, i.e. they need different kinds of interface support.
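
As a hedged illustration of the kind of interface support defence costs call
for, the following Python sketch (the AutoSaver class and its callbacks are
assumptions made for illustration, not taken from the paper) pays the small
insurance cost automatically at fixed intervals, each snapshot superseding the
last:

    # Illustrative sketch: a periodic autosave pays the small "insurance"
    # cost automatically, so the user need not remember this non-task.
    # Each snapshot supersedes the previous one; after a crash, only work
    # since the last snapshot is lost.

    import threading

    class AutoSaver:
        def __init__(self, get_state, write_snapshot, interval_s=60):
            self.get_state = get_state            # returns the current document state
            self.write_snapshot = write_snapshot  # persists a snapshot
            self.interval_s = interval_s
            self._timer = None

        def start(self):
            self._tick()

        def _tick(self):
            self.write_snapshot(self.get_state())  # pay the insurance premium
            self._timer = threading.Timer(self.interval_s, self._tick)
            self._timer.daemon = True
            self._timer.start()

        def stop(self):
            if self._timer is not None:
                self._timer.cancel()

    # Usage (hypothetical): snapshot a document dictionary every 30 seconds.
    document = {"text": ""}
    saver = AutoSaver(lambda: dict(document),
                      lambda snap: print("autosaved", len(snap["text"]), "chars"),
                      interval_s=30)
    saver.start()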
A rather different problem with task analysis is raised by an example of
Sørgaard's (1988). He discusses designs for a seat reservation system
for railways. He criticises one which supports only requests for seats
specified by a few salient attributes (e.g. window or aisle, facing forward or
back), and proposes instead one that shows the seating plan for the whole
train in a diagram that both customer and clerk can see and point to. This
design allows customers to point to the seats they prefer, and allows them to
take many possible attributes and relationships into account e.g. distance from
the dining car, distance from the door, etc. The design finesses any laborious
attempt to discover all the "tasks" customers are trying to perform, and yet
will support a far greater range of specification types including those that
might be very rare in the user population as a whole. It seems clearly
superior exactly because it will support tasks the designer could not have
anticipated.
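
A small sketch may make the contrast concrete. In the following Python
fragment (both designs and all names are hypothetical illustrations of
Sørgaard's point, not his actual system), the first function can answer only
queries over the attributes its designer anticipated, while the second simply
exposes the seating plan and leaves the choice, for whatever reason, to the
customer:

    # Illustrative contrast between the two reservation designs (hypothetical).

    def request_seat(seats, window=None, facing_forward=None):
        # Design 1: answers only queries over attributes the designer anticipated.
        for seat in seats:
            if (seat["free"]
                    and (window is None or seat["window"] == window)
                    and (facing_forward is None
                         or seat["facing_forward"] == facing_forward)):
                return seat["id"]
        return None

    def show_plan(seats):
        # Design 2: expose the whole plan; the customer points at the seat
        # they want, for reasons the designer never needed to anticipate
        # (near the dining car, by the door, next to a friend, ...).
        return [(s["id"], "free" if s["free"] else "taken") for s in seats]

    def reserve(seats, seat_id):
        for s in seats:
            if s["id"] == seat_id and s["free"]:
                s["free"] = False
                return True
        return False

    seats = [
        {"id": "12A", "free": True,  "window": True,  "facing_forward": True},
        {"id": "12B", "free": True,  "window": False, "facing_forward": True},
        {"id": "31C", "free": False, "window": True,  "facing_forward": False},
    ]
    print(request_seat(seats, window=True))  # design 1: anticipated attributes only
    print(show_plan(seats))                  # design 2: any criterion the customer has
    reserve(seats, "12B")

The point is not the code but the shape of the interface: the second design
makes no attempt to enumerate customer tasks at all.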
Since neither the methods nor the tasks chosen and evolved by users can be
predicted accurately, it is necessary to allow these practices to evolve and
then to observe them. In other words, these considerations lead back to the
use of a prototyping cycle for task "analysis" as for other aspects of design,
as Henderson (1991) has stated explicitly. This leads to a five-step cycle;
the extra step, which follows implementation, is the evolution of use and
practice.
Allen, R.B. & Scerbo, M.W. (1983) "Details of command language keystrokes"
ACM TOIS vol.1 pp.159-178.
Bannon, L.J. & Bødker, S. (1991) "Beyond the interface:
encountering artifacts in use" ch.12 pp.227-253 in Carroll J.M. (ed.)
Designing Interaction: psychology at the human-computer interface
(CUP).
Henderson, A. (1991) "A development perspective on interface, design, and
theory" ch.13 pp.254-268 in Carroll J.M. (ed.) Designing Interaction:
psychology at the human-computer interface (CUP).
Sørgaard, P. (1988) A discussion of computer supported cooperative work,
Ph.D. thesis, Aarhus, Denmark.
Suchman, L.A. (1987) Plans and situated actions: the problem of
human-machine communication (CUP).