'Uncrashable' computer system could end the pain of lost data
MIT research could have real-world implications
Researchers at the Massachusetts Institute of Technology (MIT) have unveiled work on creating an ‘uncrashable' computer system that could mean an end to data being lost when a computer crashes.
The researchers in the university's Computer Science and Artificial Intelligence Laboratory (CSAIL) designed a system mathematically guaranteed not to lose track of data in the event of a crash.
MIT explained that this was done using a process call 'formal verification' that limits the "acceptable bounds" of operation for a computer program and then guarantees that they cannot be exceeded. This should ensure that the system cannot crash.
Nickolai Zeldovich, a principal investigator at CSAIL, who co-authored the new paper, explained that, while the system is slow at present, it is a major breakthrough.
"Making sure that the file system can recover from a crash at any point is tricky because there are so many different places that you could crash. You literally have to consider every instruction or every disk operation and think: ‘Well, what if I crash now? What now? What now?'" he said.
"So empirically, people have found lots of bugs in file systems that have to do with crash recovery, and they keep finding them even in very well tested file systems because it's just so hard to do."
MIT explained that the researchers managed to overcome previous limitations in this area by focusing on the final code of the system rather than a "high-level schema" so that they could dictate operations directly to the code.
This required a tool known as a 'proof assistant', called Coq, to "provide a formal language for describing aspects of a computer system and the relationships between them".
Adam Chlipal, associate professor of computer science at MIT, explained that this was the key difference that helped the team with its research.
"We implement the file system in the same language where we're writing the proofs. And the proofs are checked against the actual file system, not some whiteboard idealisation that has no formal connection to the code," he said.
The team also had to overcome the fact that it was creating a system that has never existed before, so there were no standards or prior knowledge to refer to.
However, now that this has been achieved, it could speed development in this field, as Frans Kaashoek, the Charles A. Piper Professor in MIT's Department of Electrical Engineering and Computer Science, explained.
"It's not like you could look up a paper that says: ‘This is the way to do it.' But now you can read our paper and presumably do it a lot faster," he said.
The findings of the project have already piqued the interest of engineers at major firms, such as Ulfar Erlingsson, lead manager for security research at Google, who has been following the project.
"It's not like people haven't proved things in the past, but usually the methods and technologies, the formalisms that were developed for creating the proofs, were so esoteric and so specific to the problem that there was basically hardly any chance that there would be repeat work that built on it," he said.
"But I can say for certain that Adam's stuff with Coq, and separation logic, is stuff that's going to get built on and applied in many different domains. That's what's so exciting."
The research is due to be presented at the Association for Computing Machinery Symposium on Operating Systems Principles in California in October.