Many organisations have used data warehouses and business intelligence software for years, but the underlying information sets have grown so much in volume and complexity that, in some cases, existing hardware and software can no longer handle them efficiently.
Step forward “big data”, a term that has evolved to describe exceptionally large, diverse repositories of digital information and the processes organisations use to organise, search and extract the data they contain.
Never slow to spot a revenue opportunity, many of today’s IT giants are big on big data, especially where demand for massively parallel processing (MPP) systems capable of handling those huge databases dovetails neatly with these vendors’ existing server and storage hardware platforms.
Evidence of how highly those vendors rate that opportunity can be seen in EMC’s cash purchase of data warehouse specialist Greenplum last year, rival IBM’s acquisition of business intelligence and analytics company Netezza for $1.7bn in September 2010, and HP’s purchase of Vertica for an undisclosed sum last March.
Software giants have also got in on the act, with data warehousing specialist Teradata buying analytics vendor Kickfire in 2010, paying $525m for marketing automation software company Aprimo in the same year, and taking full ownership of Aster Data Systems in March 2011 for $264m.
Many other companies, including Microsoft, Oracle, SAP and Endeca, are looking to sell enhanced database, analytics and business intelligence tools based on the big data concept. The definition of the term tends to be stretched to play to each product’s strengths, however, so big data remains a moving target in many respects.
The data deluge
Handling large data sets is certainly a real problem for many organisations, and it’s one that’s getting bigger by the minute. IDC’s last Digital Universe study estimated that the total volume of data being stored in the world will reach 35ZB (one zettabyte is equal to a trillion gigabytes) by 2020, although much of that will be stored in personal, rather than corporate, systems and not used for business analytics or reporting purposes.
Of more relevance to this discussion, perhaps, is recent research from the McKinsey Global Institute (MGI), which estimated that organisations in nearly all sectors of the US economy had an average of at least 200TB stored somewhere within their IT infrastructure, with many storing more than 1PB.
Complexity and speed
Some industry experts, including Microsoft CEO Steve Ballmer, believe that big data should focus less on size and more on the type of data being processed and analysed, including information stored outside the corporate firewall.
Data being searched for analytical and reporting purposes could include internet text, search indexes, call records, medical records, digital images, high definition (HD) video archives, surveillance footage and e-commerce transactions, as well as datasets created by academic, scientific and research departments or by development projects that process large volumes of information.
All of that information could be unstructured, or distributed in flat schemas with few or no cross-reference relationships, and could also involve time-stamped events extracted from log files, sensors and social networks.
“The true challenge is not one of big data but the more complex issues across all dimensions of information management… variety, complexity and velocity of data are equally significant,” wrote Gartner analyst Stephen Prentice in a research note published in May.
Is anyone buying it?
While it is difficult to gauge real-world demand for big data hardware and software platforms, potential uses for the technology are easy to identify.
The McKinsey Global Institute has highlighted several ways in which mining all that information can help organisations: reducing search and processing times to speed up time to market or service delivery initiatives; getting more accurate intelligence or statistics out of the stored information to minimise risks and improve business decisions; and using the data obtained to develop new products and services.
“We estimate that a retailer embracing big data has the potential to increase its operating margin by up to 60 per cent,” wrote the MGI. “Big data will also help create new growth opportunities and entirely new categories of companies such as those that aggregate and analyse industry data about products and services, buyers and suppliers, consumer preferences and intent.”
The advantages of analytics in pinpointing real-time trends and forecasting customer behaviour in the retail, e-commerce and financial services markets are well understood. But Gartner’s Prentice also identified how public sector organisations and utilities can improve service delivery by analysing data from a much broader range of sources, including those not necessarily controlled by the organisation itself.
“By sweeping away limitations derived from data restraints and exploiting a growing universe of publicly available data, a whole new era of digitally accelerated business models is emerging,” he wrote.