29 Sep 2002 ............... Length about 900 words (6000 bytes).
This is a WWW document maintained by
Steve Draper, installed at http://www.psy.gla.ac.uk/~steve/grumps/clientneeds.html.
Web site logical path:
Grumps Client needs
Grumps clients are pursuing REDDIs (rapidly evolving data driven investigations).
These depend upon two kinds of software (besides whatever software is being investigated):
- The Grumps software infrastructure for data collection across distributed
systems and networks.
- Some specific data generator, typically a way of getting logging data from
the operation of a target software system. Examples include
- Our "action collector", which generates Grumps events for collection from
users' actions on the machines it monitors.
- Our Java bytecode tool, which helps add event generation for collection to
any Java program without access to or understanding of the original source code.
- Logfiles created by particular pieces of software e.g. the PRS software
for lecture theatre handsets.
- Any hand-coded log data facility included in any piece of software.
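The last option, a hand-coded log data facility, can be very small. As a hedged sketch (the file format and field names here are invented for illustration, not the actual Grumps event format):

```python
import json
import time

def make_logger(path):
    """Return a function that appends one timestamped event per line.

    The field names ('time', 'source', 'action', 'detail') are illustrative;
    a real facility would use whatever schema the collector expects.
    """
    def log(source, action, detail=""):
        event = {"time": time.time(), "source": source,
                 "action": action, "detail": detail}
        with open(path, "a") as f:
            f.write(json.dumps(event) + "\n")
    return log

log = make_logger("events.log")
log("editor", "open-file", "notes.txt")
log("editor", "save-file", "notes.txt")
```

Writing one JSON object per line keeps the log trivially parseable by whatever retrieval and cleaning stages come later.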
For a Grumps client to carry out a REDDI, investigating data for some purpose
of their own, they need to cover many other functions in addition. There are
basically three ways of covering each of these functions: by human expertise
the client has (in addition to their domain knowledge), by human expertise
supplied by Grumps as additional support, or by pre-fabricated solutions
already created and accumulated that can be re-used by clients.
(For instance, clients using the level 1 computing lab data can probably share
some data retrieval and data cleaning work; the display software written to
support Quintin's project might turn out to be a default skeleton other
clients could use; external data mining software packages could turn out to be
a help for some of these functions.)
In many cases these types of support or expertise may not require much effort
as measured in expert-person-hours, yet without every bit being supplied in
one way or another, a REDDI will probably wither. The speed of turnaround on
these calls on expertise also matters, since REDDIs are meant to be rapid.
These types are:
- Maintenance / installation of the Grumps software. Within the project's
lifetime it is unlikely ever to be so robust and well engineered that it can be
used without any help.
- Database management. The Grumps software can typically produce a big
flood of data, to be collected in some sizeable relational database. Someone
has to decide on a machine to host this, then set it up and do "sysadmin"
type management of it.
- A client to drive the investigation, with an active goal or theory or
other purpose for using the data. At the least it is from them that decisions
flow about whether that purpose is already achieved, or if not then in what
direction to proceed next.
- Someone to direct collection: decide on what data to collect. Typically
also the client.
- A data retriever: someone who can write (and debug) successful SQL
expressions to extract data of interest to the investigation from the
database.
- A data cleaner: someone who writes the code to clean up the data for the
investigation. Essentially the same skills as the retriever, but the function
of cleaning is likely to take much more time. On the other hand, data
cleaning actions may be re-usable across miniprojects to a useful extent.
- A software writer for producing displays of the (retrieved and cleaned)
data. Without a means of presenting results effectively to the investigators,
nothing is achieved.
- A statistician: turning a hypothesis from a hunch into an English
specification, and then that specification into a precise question that can
be asked of the data, requires at least some statistical expertise.
- A user liaison or RA: in many cases, the human users being studied will
need to be "handled": from informing and persuading them, to collecting
consents, to conducting controlled experiments where some users carry out
prescribed tasks.
- Running calibration experiments.
There is often a need to run mini-experiments with the data collection on,
where you (or users taking your instructions for a set task) do specific
actions, and then you look at the generated data to find, or validate,
associations between human actions and patterns in the data.
This may be used to adjust the collector, or a filter, or a data cleaning
step.
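The data retriever and data cleaner functions above can be sketched together. This is a minimal illustration, assuming an in-memory SQLite database and an invented three-column "events" table; the actual Grumps schema and cleaning rules would differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts REAL, user TEXT, action TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (1.0, "s1", "login"),
    (1.0, "s1", "login"),      # exact duplicate: cleaning drops it
    (2.0, None, "login"),      # no user recorded: cleaning drops it
    (2.5, "s1", "keypress"),   # not of interest: retrieval excludes it
    (3.0, "s2", "login"),
])

# Retrieval: an SQL expression extracting the events of interest.
rows = conn.execute(
    "SELECT ts, user, action FROM events WHERE action = 'login' ORDER BY ts"
).fetchall()

# Cleaning: drop exact duplicates and rows with no user recorded.
seen = set()
clean = []
for row in rows:
    if row[1] is not None and row not in seen:
        seen.add(row)
        clean.append(row)

print(clean)  # two distinct, attributable login events remain
```

The point of separating the two steps is the one made above: the SQL retrieval is quick to write, while the cleaning rules tend to grow and are worth keeping as re-usable code.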
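A calibration experiment of the kind just described amounts to checking that actions performed at known times show up as the expected patterns in the collected data. A minimal sketch, assuming invented event names and a half-second matching tolerance:

```python
# Known actions performed during the calibration run: (time, action).
performed = [(10.0, "open-menu"), (12.5, "close-menu")]

# Events the collector actually recorded: (time, event-name).
recorded = [(10.1, "menu-shown"), (12.6, "menu-hidden"), (15.0, "noise")]

# Hypothesised association between human actions and data patterns.
expected = {"open-menu": "menu-shown", "close-menu": "menu-hidden"}

def validate(performed, recorded, expected, tolerance=0.5):
    """For each performed action, check that the expected event was logged
    within `tolerance` seconds; return the list of unmatched actions."""
    unmatched = []
    for t, action in performed:
        hits = [e for et, e in recorded
                if e == expected[action] and abs(et - t) <= tolerance]
        if not hits:
            unmatched.append(action)
    return unmatched

print(validate(performed, recorded, expected))  # [] : association validated
```

An empty result validates the hypothesised association; any unmatched actions point at where the collector, a filter, or a cleaning step needs adjusting.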