No more ETL: Take the analytics to the operational data instead, urges IBM
Analytics is increasingly being absorbed into, and becoming a full part of, organisations' business processes, claims IBM's Alex Chen
The days of extracting, transforming and loading data from operational systems into business intelligence systems for the purpose of analytics may well be over, according to Alex Chen, director of file, object storage and big data flash at systems giant IBM.
Instead, he says, the analytics technology ought to be brought to the data.
Speaking at today's Computing IT Leaders' Forum, which focused on big data analytics and storage, Chen suggested that with analytics engines increasingly being used to make business decisions on the fly, it made sense to conduct that process as quickly and, therefore, as close to the data as possible.
"You don't want to move the data off [of your operational systems] and put it in an analytics engine. Bring the analytics engine to the data, perform various analytics to the 'data ocean' and keep it and back-up that data ocean. That sort of architecture, that sort of thinking, is starting to be adopted by many enterprises in the big data environment," said Chen.
On top of technologies such as Hadoop, two of the key drivers of this trend, he suggested, were the performance improvements arising from affordable Flash storage and from in-memory database technology. Neither development is necessarily rendering conventional disk storage redundant - far from it - but together they have firmly moved it down the pecking order in organisations' storage hierarchy.
At the same time, though, organisations need to keep not just the data, but also the associated metadata generated by those analytics activities, he warned. "Performing analytics generates a lot more metadata due to regulations. You can't just throw data out any more, in a lot of cases. That data has to be encrypted as well," said Chen.
One IBM customer, added Chen, claimed to have three billion files, making backup alone a challenge. "Imagine. You have three billion files and the software has to figure out what changed!" said Chen. IBM Spectrum Scale software-defined storage, he added, is able to do that, and perform the backup, in a matter of hours.
Other IBM big data customers include aero-engine maker Pratt & Whitney, whose big data focus is very much on engine telemetry. IBM also delivers IT infrastructure to airline Lufthansa following a 10-year outsourcing deal signed in 2014.