Last changed 15 March 1999. Length about 1,400 words (10,000 bytes).
This is a WWW document by
Steve Draper,
installed at http://www.psy.gla.ac.uk/~steve/talks/mill.html.
You may copy it.
What is the significance of the Millennium bug for computer science?
(N.B. Another, related, seminar will be held in psychology.)
Organisers: Steve Draper; Ray Welland
Time: 4-5pm, Friday, 12 March 1999 in the
CAKES talk series.
Room F161, Computing Science dept. (17 Lilybank Gardens).
Abstract
- Do we need to do anything much about the systems/equipment in our charge?
- Need we worry about systems/equipment that is not in our charge?
Are society, or some important infrastructure services, going to collapse?
- What does it all mean for computer science?
- Does it show that the current computer science curriculum is out
of date, since the skills recommended for addressing the millennium bug are
completely missing from it?
Is it a mere Y2K bug, or a true millennial catastrophe?
A technical problem offering an employment boom for some,
or the end of society and life as we know it?
(How appropriate that in today's society, the fears expressed by millennial
movements are given a technical rather than religious form.)
Or is it not a technical problem, but a management one that computer scientists
are untrained to deal with, thus suggesting that the present DCS
curriculum is hopelessly out of date?
Plan
The session will be 60 (not 30, not 50) minutes long.
The basic plan is to have a string of speakers, mainly limited to 5 mins. each.
Speaker list and provisional times at the end.
1. Do we need to do anything much about the systems/equipment in our charge?
Speaker? David Fildes (university officer coordinating Year 2000 problems).
The problem
Sketch out the existence, size, impact.
Estimated cost and scale [David Fildes?]
Employment generated.
The scale of the problem in money already being spent and likely to be lost
means this is one of the biggest features of current computing practice.
Anyone who wants their students to be employed cannot regard it as
unimportant.
Approaches to fixes
Another is the range of quite different possible approaches to tackling it.
- One is to design the software better (and replace all existing software?).
- Fix the software: use the analysis tools now appearing to identify all the
places in existing code that need to be checked.
- Rely on testing (abandoning reliance on understanding the code): i.e. reset
clocks now, and try to respond to any resulting crashes.
- (The predominant advice in the university) View this not as a design or
technical problem, but as one of management: contingency planning ...
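As a sketch of the "fix the software" approach: once analysis tools have located two-digit year fields, one common repair was "windowing", i.e. expanding two-digit years relative to a pivot. The pivot value (70) and function name below are illustrative assumptions, not any particular tool's output:

```python
# Windowing: expand a two-digit year to four digits using a pivot.
# Years at or above the pivot are taken as 19xx, years below as 20xx.
# The pivot (70 here) is an arbitrary illustrative choice: each system
# must pick one suited to the date ranges in its own data.

PIVOT = 70

def expand_year(yy: int) -> int:
    """Map a two-digit year (0-99) to a four-digit year."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year, got %r" % yy)
    return 1900 + yy if yy >= PIVOT else 2000 + yy

# expand_year(99) -> 1999
# expand_year(0)  -> 2000
```

Note that windowing only defers the problem: with a pivot of 70, the scheme breaks again for dates from 2070 onwards.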
Official strategy
The university's response [Fildes]
DCS response /plan [Welland]
University of Glasgow newsletter, (206)
"The start of the year 2000 is to be marked by a four day close
down of the computing system and networks with the exception of
the main server into the University and one system in Computing
Science which may be studying how computing systems cope with
the start of the year 2000. Power and heating systems would not
be shut down and the director of Estates & Buildings reported that
he was considering bringing extra staff in over the period to deal
with any difficulties should they arise."
2. Need we worry about systems/equipment that is not in our charge?
Are society, or some important infrastructure services, going to collapse?
The argument is that there will certainly be some problems, and some of these
may cause major services (electricity, phones, water, ... food supply
to supermarkets, the banking system, ...) to fail. Arguments for this are:
- Too little work too late.
- The nature of major accidents in the past is that when a whole lot of
improbable causes all occur together, you get a major disaster. So there
will be a large number of problems, almost all of which will fail to spread
and trigger other problems, but in a few unforeseeable cases, a disastrous
cascade will occur.
- Some disasters will occur not because of technical failures, but because a
significant minority of people believe they may. The banking system has
always depended on confidence: if people all want their money back at once,
it collapses independent of any technology. Food supplies will fail if many
people start hoarding to be "safe"; particularly given the just-in-time (JIT)
approach now developed, but not yet tested by a severe perturbation.
Matthew Chalmers, who cannot participate on this date, says that rumours
around big banks indicate:
- The more closely someone is working on the technical programming problems,
the more they are stockpiling food and taking precautions against major
societal failures
- If a bank's software fails for 3 days, it will go bust. But that will
only be true if theirs fails and others' doesn't. (So: at the moment they are
working to fix it, and possibly hoping that competitors will fail; but
perhaps nearer the time cartels will agree to suspend life, I mean business,
together...)
Speaker: Steffie Plotnikoff.
3. What does it all mean for computer science?
One interesting feature is the enormous uncertainty and absence of consensus
about it, not just among journalists, but among big computer companies and
academic experts. How can we be so ignorant about the behaviour of the
machines we design and depend on (and which, unlike all the rest of the
universe, are in principle totally deterministic)?
- It marks the coming of age of computer science. In all other design
disciplines, design is normally about fitting a new artifact into a
surrounding that cannot be changed, however difficult that makes it for the
new technology. Few town planners, for instance, are allowed to demolish or
ignore surrounding buildings; and railway engineering similarly is about
fitting in with and building on the technology of the past.
Computer science up to now, because so little
software already existed (and what did, had such a short average lifetime),
has been allowed to proceed with the fantasy that each new design was going to
exist independently of other software; or, equivalently, was going to be a
part of a system whose design was coordinated and known. The millennium bug
shows that in reality, "legacy code" is some of the most important software
in existence, and grownup design (as opposed to academic design exercises)
requires designing around existing code which not only is outside the
designer's control, but typically is no longer understood. That is, its
behaviour is not fully specified, and cannot be reasoned about using methods
that might work within a fully designed system.
- Thus defensive programming needs to become a fundamental technique.
- Similarly, all designs need recovery methods for when situations occur
that are "impossible" and unforeseen. This approach is familiar in
"rebooting" (rebuilding a complete runtime code image), and provision is often
made for rebuilding software from components; but it is often missing for
rebuilding data.
- Fundamentally, all this means that methods for reasoning about and
designing to accommodate uncertainty must move to the fore. This in fact is a
fundamentally anti-digital move.
The point of digital systems (whether cogwheels or logic gates) is to achieve
complete predictability. However this mode of calculation not only does not
apply to interactions of digital machines with humans, it is proving
incompetent to manage interacting complexes of digital machines in the face
of uncertainties from other sources.
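The defensive, recovery-oriented style argued for above might be sketched like this (a minimal illustration under assumed names, not any real system's code): validate values arriving from legacy code outside your control, and fall back to an explicit recovery path when an "impossible" value appears.

```python
# Defensive handling of a date field produced by legacy code we do not
# control and cannot fully reason about. Instead of trusting the value,
# validate it and fall back to an explicit recovery path.

def parse_legacy_year(field, fallback: int = 2000) -> int:
    """Parse a year field defensively; recover rather than crash."""
    try:
        year = int(field)
    except (TypeError, ValueError):
        return fallback            # unparseable: recover, don't crash
    if year < 0:                   # "impossible" value from legacy code
        return fallback
    if year < 100:                 # a two-digit year slipped through
        return 1900 + year if year >= 70 else 2000 + year
    return year
```

The design choice is that every path returns a usable value: the caller is never exposed to the legacy system's failure modes, at the cost of having to choose a sensible fallback.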
Is technical understanding in fact not very useful in real computing
applications? If the effects, never mind the cure, of the millennium bug
cannot be predicted then surely there is no science whatsoever (and not much
engineering either) in "computer science"? Prediction is the test of science
(control is the test of engineering). Certainly there appears here to be a
stark limit to our understanding in practice, and this must have vital
implications for the kinds of understanding we should be seeking.
It could in fact be argued that this, rather than being the defeat of the
digital revolution, is simply an unrecognised consequence of going digital.
Whereas in analogue codes, nearby values have similar meanings, in most
digital codes nearby values have no semantic relationship. This means that
digital systems go chaotic in the chaos-theory sense as soon as any errors
appear. 00 being next to 99 in modular but not normal arithmetic is simply an
elementary example of this. So are digital designs fundamentally unstable
engineering (in the sense that bicycle steering and aircraft wings
can be designed to be stable or unstable)?
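The 00/99 discontinuity is easy to exhibit: in modulo-100 arithmetic the successor of 99 is 0, so any interval computed naively across the boundary is wildly wrong. A toy illustration (the function is hypothetical):

```python
# The 00/99 discontinuity: two-digit years wrap around modulo 100,
# so ordering and interval arithmetic break across the boundary.

def age_in_years(birth_yy: int, now_yy: int) -> int:
    """Naive age computed from two-digit years."""
    return now_yy - birth_yy

# Across the century boundary the naive answer is absurdly wrong:
# someone born in '98, measured in '00 (i.e. 2000):
print(age_in_years(98, 0))    # -98, not 2
# and the successor of 99 in modulo-100 arithmetic:
print((99 + 1) % 100)         # 0
```

A tiny error in representation (dropping two digits) thus produces an arbitrarily large error in the result, which is the chaotic sensitivity the text describes.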
4. Is the computer science curriculum totally out of date?
Does it show that the current computer science curriculum is out of
date, since the skills recommended for the millennium bug are completely
missing from it? The truly striking thing about the millennium bug is that
the best expert advice for dealing with it does not focus on programming
solutions, even though that is widely perceived as the original "cause".
Instead, it focuses on risk management techniques. These are not on the
computer science curriculum. Computer "scientists" are therefore
professionally unqualified and incompetent to deal with the millennium bug: if
you want your organisation to survive it, don't hire someone with a computer
science degree: you need management skills they don't have.
In fact, although the people currently running IT support services may come from
programming and computer science, the skills they use in their jobs are quite
different from this: they don't program, and they do manage. The curriculum
is already severely out of line.
Speakers: IT industry person, Pete Bailey, Norman Davis?
The question: what skills do they use day to day in fact?
Thus the millennium bug is showing up the possibility that IT professionals now
and in the future need quite different skills, and programming may soon be as
marginal to those in the business as hardware design skills are now. Hardware
used to be on the curriculum 25 years ago; now it is in electrical engineering.
Soon either computer science will transfer algorithms and programming out to a
dept. of programming, or it will wither away, while new departments such as
"the Glasgow school of business information management systems" will become
dominant in both undergraduate education and the interesting research in
systems design.
Steve Draper. Introduction. [1 min.]
Do we need to do anything much about the systems/equipment in our charge?
David Fildes (University Year 2000 officer). The technical problem. [5 mins.]
Ray Welland. The DCS response. [1 min.]
Need we worry about systems/equipment that is not in our charge?
Marie-Odile Bes. How unexpected combinations of rare events cause
disasters. [5 mins.]
David Fildes. Your responsibilities for others' breakdowns. [5 mins.]
Steffie Plotnikoff. Why this is not just a technical problem. [5 mins.]
What does it all mean for computer science?
Steve Draper. (For content, see above) [5 mins.]
The current computer science curriculum
Steve Draper. [5 mins.]
Norman Davis: the skills IT support REALLY uses [5 mins]