CERN experiments generating one petabyte of data every second

Organisation storing 25PB of data every year as Large Hadron Collider delves for secrets of the universe

GENEVA: Experiments at CERN are generating an entire petabyte of data every second as particles fired around the Large Hadron Collider (LHC) at velocities approaching the speed of light are smashed together.

However, Francois Briard, control infrastructure section leader in the beam department, explained that CERN doesn't capture and save all of this data, instead using filters to save only the results of the collisions that are of interest to scientists at the facility.

"We don't store all the data as that would be impractical. Instead, from the collisions we run, we only keep the few pieces that are of interest, the rare events that occur, which our filters spot and send on over the network," he said.

This still means CERN is storing 25PB of data every year - the same as 1,000 years' worth of DVD quality video - which can then be analysed and interrogated by scientists looking for clues to the structure and make-up of the universe.
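As a rough sanity check on that comparison, the short sketch below works out the average video bitrate implied by fitting 1,000 years of continuous footage into 25PB. It assumes a decimal petabyte (10^15 bytes) and a 365.25-day year; the function name is purely illustrative and neither assumption comes from CERN.

```python
# Assumption: decimal petabyte (10**15 bytes), not the binary pebibyte.
PB = 10**15
# Assumption: average year length of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_video_bitrate_mbps(stored_bytes: float, video_years: float) -> float:
    """Average bitrate in megabits per second if stored_bytes held
    video_years of continuous video."""
    seconds = video_years * SECONDS_PER_YEAR
    return stored_bytes * 8 / seconds / 1e6


rate = implied_video_bitrate_mbps(25 * PB, 1000)
print(f"Implied bitrate: {rate:.1f} Mbit/s")
```

Under these assumptions the result comes out at a little over 6 Mbit/s, which sits below the DVD-Video ceiling of roughly 9.8 Mbit/s, so the comparison appears to be in the right range.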

Sending and storing the data requires a huge effort on the part of numerous firms beyond CERN, as Jean-Michel Jouanigot, IT computer systems group leader, explained.

"To analyse this amount of data you need the equivalent of 100,000 of the world's fastest PC processors. CERN provides around 20 per cent of this capability in our datacentres, but it's not enough to handle this data," he said.

"So, we have worked with the European Commission to develop the Grid, which provides access to storage and computing resources in the same way the web provides access to information, so we can store and access the data we create on this system."

There are 11 datacentre providers offering access to CERN on the Grid including companies in the US, Canada, Italy, France and the UK, and they in turn utilise storage from a further 130 locations, to ensure the wealth of data generated can be retained.

The data comes from the four machines on the LHC in which the collisions are monitored - Alice, Atlas, CMS and LHCb - which send back 320MB, 100MB, 220MB and 500MB of data per second, respectively, to the CERN computer centre.
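To put those per-experiment figures next to the headline number, the sketch below adds up the four quoted rates and compares the total with one petabyte per second. It assumes decimal units (1MB = 10^6 bytes, 1PB = 10^15 bytes) and takes the rates exactly as quoted above; the variable names are illustrative only.

```python
# Assumption: decimal units throughout.
MB = 10**6
PB = 10**15

# Rates sent to the CERN computer centre, in MB per second, as quoted in the article.
rates_mb_per_s = {"Alice": 320, "Atlas": 100, "CMS": 220, "LHCb": 500}

# Combined rate in bytes per second across the four experiments.
total_bytes_per_s = sum(rates_mb_per_s.values()) * MB

# Share of the 1PB generated each second that is actually sent on.
kept_fraction = total_bytes_per_s / (1 * PB)

print(f"Combined rate: {total_bytes_per_s / MB:.0f} MB/s")
print(f"Fraction of generated data kept: {kept_fraction:.2e}")
```

On these numbers the four experiments together send a little over 1GB per second, or roughly one part in a million of what the collisions generate, which gives a sense of how aggressively the filters discard data.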

Briard also revealed that the organisation had recently managed to capture and monitor anti-matter for 15 minutes, a vast improvement on the mere billionths of a second it had previously managed, adding that this involves a unique method of analysis.

"We can only trap anti-matter by ensuring it doesn't touch any matter, so we use magnets to suspend it in a vacuum, and we can only see what we had after it's gone by measuring the radiation it leaves behind when it reacts with matter," he said.