Open-source software framework Hadoop is touted by many as the next big thing: a technology that will revolutionise the drive to find value in big data.
Market intelligence company Transparency Market Research estimates the global Hadoop market will grow at almost 55 per cent a year, to be worth $20.9bn by 2018. Research firm IDC predicts that it’ll be worth $50bn by 2020.
The latest evidence of the framework’s impact was Intel’s $740m investment in Hadoop distributor Cloudera, which gave the chip giant an 18 per cent stake in the company.
But how will this deal change the Hadoop market? Should Cloudera’s rivals be alarmed by the company’s new financial muscle, or be celebrating the fact that the Intel/Cloudera deal is the clearest sign yet that Hadoop is rapidly becoming a mainstream technology?
“It’s great validation of the space,” said Herb Cunitz, president of Cloudera rival HortonWorks, adding that such a move by Intel sets up an interesting dynamic within the industry.
“Cloudera has laid out a strategy that says: ‘We’ll compete against the other distribution vendors like HortonWorks, MapR, and we’ll compete against the whole big data warehouse space – the likes of Teradata, IBM, Microsoft, SAP and Oracle’ – so they are going through two battles at once,” he said.
This means that Intel will now be competing against many of the partners in its chip manufacturing business.
So did Cloudera attract Intel because it is the superior Hadoop distributor? Perhaps, but it’s certainly more popular than Intel’s own efforts in the space. The chip manufacturer initially tried to push its own Hadoop distribution, but it failed to gain traction in the marketplace.
Which has all worked out well for Cloudera. Or so you might think. Forrester analyst Mike Gualtieri, however, disagrees.
He doubts that Intel’s investment will have a material effect on Cloudera’s competitiveness, or that it will give the firm an upper hand in the market.
However, Gualtieri suggested that Intel could make some acquisitions in the hardware and appliances spaces to complement its Cloudera purchase, and that this could give it an edge in the long term.
One of the key differences between Cloudera and its rival HortonWorks is that Cloudera wants to eradicate the need for a data warehouse by using Hadoop alone. HortonWorks has placed its bet on data warehouses staying right where they are.
“It’s not going to die and go away; customers aren’t asking us how to replace their data warehouse. There are some tools like reporting in real-time that you don’t want or need Hadoop for,” Cunitz said.
Taking a swipe at Cloudera, he added: “It’s about putting the right tool in for the right job, rather than saying ‘we’ve got the magical tool that can replace everything’. Getting back to reality, we haven’t seen anything that comes in and wipes out the entire space.”
Gualtieri, however, described Cloudera’s data warehouse-less vision as “valid”, but not viable today. For now, he said, the concept is no more than “enormous marketing hype”, largely because of the cost of getting Hadoop to perform like a data warehouse.
“It is probably possible to get the same performance as a data warehouse with Hadoop at some point, but it is going to cost you a lot more because you’re going to require more nodes,” he said.
But not all end-users are convinced of, or perhaps even bothered about, the differences between the various Hadoop data platforms.
Dutch bank ING trialled Cloudera’s platform before deciding to use HortonWorks, but the head of the solution delivery centre at ING, Anurag Shrivastava, explained that either solution could have worked.
“Hadoop is just Hadoop, whether you take it from Cloudera or HortonWorks – it doesn’t always matter what their strategies are; we have to use our brains too,” he told Computing.
But while the technology itself may not be vastly different, Shrivastava explained that HortonWorks’ big community and approach to open source were positives, while ING Bank steered clear of Cloudera because it felt that the firm may “come up with something that is not open source”, which could be a “trap for ING in the long term”.
It is these key points that make for interesting years ahead in the battle for Hadoop data platform supremacy.