In very, very crude terms, there are two kinds of defect in computer programs: those you notice and those you don’t. We tend to get most exercised about the ones we notice, on the assumption that anything we don’t notice can’t really be very important. Wrong.
This all came to mind when a gentleman from Malaysia received a telephone bill for $218 trillion in April. This was widely reported last month so I will say no more about it, but the point I would like to make is that he was lucky. It is obviously wrong – even a teenager with a 3G mobile couldn’t rack up that kind of bill unless they logged onto a web site in the Andromeda Nebula for a year or two. Eventually even the Malaysian telephone authority admitted it was an error.
Suppose, however, the phone company had falsely issued a bill for $2,180, or even $21,800? How long do you think it would have taken for it even to agree it was an error? Such defects are much more insidious because they are not obviously wrong.
Numerical mistakes at the level where erroneous results seem “reasonable” are surprisingly common. Let me take you back to an experiment I was involved with in the 1990s. It measured how accurately seismic surveying software predicted where to drill for oil. At the time, there were nine different packages, all written to the same (mathematically defined) requirements in the same programming language, and all in deadly competition with one another, so there was no collusion. The packages had racked up thousands of execution years figuring out where to drill oil wells. Drilling an oil well in the North Sea costs around $25m, so it’s fairly important to get it right.
We decided to give them the same data and the same disposable parameters to see if they came up with the same answer, an expensive experiment graciously funded by Enterprise Oil. The slightly embarrassing result was that we obtained nine different answers, but, and this is the important bit, each answer looked reasonable on its own.
The variations were entirely caused by previously unnoticed software defects, and this in an industry with extensive quality-control processes for its software development. Furthermore, the defects that caused the problem had been in their respective packages for between 1,000 and 2,000 execution years, so you can’t trust a program just because it’s been around for a while.
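For readers who like to see the idea in code, here is a minimal sketch of the same kind of back-to-back comparison on a toy problem. It is not the seismic experiment: the function names, the data and the tolerance are all assumptions made up for illustration. Three routines claim to compute a sample variance, one of them with a plausible-looking defect, and the check simply reports which pairs disagree when given identical input.

```python
import math
from itertools import combinations
from statistics import variance  # standard-library sample variance


def variance_two_pass(xs):
    """Sample variance: compute the mean, then the squared deviations."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def variance_population(xs):
    """Hypothetical defect: divides by n instead of n - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def disagreements(implementations, data, rel_tol=1e-9):
    """Run every implementation on the same data and return the results
    plus the pairs whose answers differ by more than rel_tol (relative)."""
    results = {name: fn(data) for name, fn in implementations.items()}
    clashes = [
        (a, b)
        for a, b in combinations(results, 2)
        if not math.isclose(results[a], results[b], rel_tol=rel_tol)
    ]
    return results, clashes


# Illustrative stand-ins for "independent packages"; all names are made up.
implementations = {
    "stdlib": variance,
    "two_pass": variance_two_pass,
    "population": variance_population,
}
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

results, clashes = disagreements(implementations, data)
for name, value in results.items():
    print(f"{name}: {value:.6f}")
print("disagreeing pairs:", clashes)
```

Either answer would look reasonable on its own; only the side-by-side comparison shows that something is amiss. Note that a disagreement tells you a defect exists somewhere, not which routine has it, and agreement is no proof of correctness, since independent implementations can share the same mistake.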
This kind of thing is not confined to the oil industry, so it does no harm to check the odd financial calculation every now and then just to make sure that nobody has screwed up. You may be surprised by what you find.









