How DDN big data storage helps the British Antarctic Survey

BAS support engineer Jeremy Robst tells Computing how DDN flash storage helps scientists conduct research into Antarctica

It's frozen, hostile and remote, but the Antarctic wilderness represents a potential goldmine of scientific discovery.

It's therefore the mission of the British Antarctic Survey (BAS) - part of the Natural Environment Research Council - to conduct research in the polar region with the goal of improving knowledge about the planet and the processes and risks that occur due to natural and man-made phenomena.

Hundreds of scientists are stationed across five bases and two ships in British Antarctic Territory, and Jeremy Robst, IT support engineer and head of Unix systems for BAS, is responsible for maintaining the technology the researchers need - a job he's been doing for the past 18 years.

With three months of every year spent on the RRS James Clark Ross in the Southern Ocean and the rest of his time spent at BAS's Cambridge headquarters, his role, he tells Computing, is "to make sure all the data and systems are running to enable scientists to collect and analyse their data".

There's a lot of data to collect, with each individual BAS probe collecting up to 10GB an hour. Operations include multibeam echo sounder systems mapping the topography of the sea floor, instruments sampling the atmosphere and automatic weather stations collecting information throughout British Antarctic Territory.

The nature of the research, combined with the often very harsh conditions it's being carried out in, means the collected data is extremely precious because, as Robst explains, "you can only collect it once".

"If you want to go out there and find out what's happening at a certain point in time, that's it, you can't go back and collect it again in a lot cases and if you can, it's expensive to run this sort of organisation," he says.

The data is sent back to Cambridge and entered into databases so researchers can crunch and analyse it using high-performance computing (HPC) clusters.

"All of these systems generate vast amounts of data that has to be stored," says Robst, adding the amount of data stored is "increasing by 75 per cent every year". When its previous storage system began buckling under the strain, BAS went looking for a replacement that could cope with the huge volumes of data its scientists were generating, but without a hefty price tag.

"We're government funded, so we haven't got vast amounts of money to throw around, so we needed something that gives us a lot of storage, will do it in a reasonably compact physical space, fits in with our current systems and gives good performance," says Robst.

For BAS, it was hybrid flash storage supplied by DataDirect Networks (DDN) - a finalist at this year's Computing Vendor Excellence Awards - that offered the "right combination of features and capacity for our budget".

Robst says this scalable solution is now providing scientists with an improved capacity to use HPC systems for crunching big data.

"We're using the new DDN storage to give people better access to disk storage to improve the model processing on the computer cluster," he tells Computing, before describing how this can help further research.

"With an increase in disk performance, scientists can look further ahead in the future or they can run multiple copies of the models with tweaked perimeters to investigate what happens in different scenarios and so forth," Robst says.

British Antarctic Survey installed the DDN SFA7700X hybrid flash storage appliance in April, and there have been "no issues at all" since then. The project has "proven to be very successful so far" for the scientists, who are running models that can take three months at a time to complete.

If extra capacity is required, it can be added without shutting those simulations down, and according to Robst, upgrading the flash storage system is simple.

"As we come to need more storage we can just buy an extra pack of 10 disks and just swap those as it runs. We can even add additional storage to the main system as it's running," he explains, adding that this will be invaluable in future.

"Over the next two or three years as the ability becomes available, we can just add capacity without stopping work," he concludes.