Sometimes big data seems almost synonymous with Hadoop and NoSQL because these are the technologies that are driving the headlines. However, the reality on the ground is a little different.
During a research programme for a recent Big Data Summit, Computing asked 230 survey respondents which vendors they would consider for a hypothetical big data project.
Those choosing the headline-grabbing specialist big data players as potential partners are in a minority compared with those choosing to stick with more familiar names such as Microsoft, Oracle and IBM. Confronted with a list of generalist and business intelligence (BI) vendors (figure 1) as potential partners, only 14 per cent chose the “none of the above” option, compared with 56 per cent who did the same with specialist vendors (figure 2).
On one level this could all be a matter of semantics. After all, what exactly constitutes a big data project? For many organisations it will simply mean more of the same but on a larger scale, rather than the revolutionary break with the past represented by Hadoop, NoSQL and associated technologies.
A lot of rebranding – or “big data washing” – has taken place over the past two or three years, with common-or-garden storage, database and BI products suddenly finding themselves on the big data shelf. That said, the giants have been on an extended shopping spree for promising start-ups to make up for lost time, and have also released innovations of their own, such as SAP’s HANA in-memory database, Microsoft’s Azure HDInsight cloud-based Hadoop platform and Oracle’s Exadata appliance.
Safety in numbers
It’s sometimes easy to forget that this is still an immature market and that most of the organisations currently making a big noise are small start-ups. The leading pure-play Hadoop distributors such as Cloudera, Hortonworks and MapR each have fewer than 500 full-time employees. They will argue, of course, that the support and development networks represented by their thriving developer communities effectively make them several times larger, but the fact remains that they are tiny compared with the behemoths heading up the list in figure 1. Microsoft, for example, has a global headcount of 127,104 while IBM employed 431,212 people at the start of 2014.
The old adage goes “no one ever got fired for buying IBM”, and the same could apply to Microsoft, Oracle or EMC. Their global reach means that these vendors have a presence in most organisations of any size, and as a known quantity these incumbent IT giants are often the first port of call, a safe bet when a new project is in the offing. Whether they are always the best choice is another matter.
The technical capabilities, or lack of them, within many organisations are another reason to play safe. Specialist big data vendors can generally only provide one or two pieces of the big data puzzle, with the remainder needing to be provided by internal staff or third-party suppliers. With their industry certifications and armies of qualified resellers, the big players offer a comfort blanket to those lacking specialist skills.
The skills gap is still a very real issue. Computing asked respondents which big data solutions they were using and then asked them whether they had in-house skills for these solutions. The biggest gaps existed where respondents were using Hadoop and cloud-based big data services.
In-memory databases applied to big data and next-generation data warehouses/data analytics were also areas where businesses were struggling to recruit and keep skills.
An end-to-end solution provider will be able to fill in more of these gaps from its own product line, such as data warehousing, infrastructure-as-a-service and business intelligence, in a way that builds on what is already in place in many organisations rather than having to start afresh.
When it came to actually making the decision to trial and/or purchase a solution, Microsoft was the clear winner. Nineteen per cent of those with a live project (a quarter of the respondents had something up and running) chose Redmond as a partner. Second was Oracle with 11 per cent, closely followed by SAP with nine per cent, while EMC and IBM were each selected by eight per cent of respondents.
Mighty sharks and minnows
The numbers choosing a specialist vendor for a hypothetical big data project were smaller, as figure 2 shows. Added to the skills and incumbency factors, the current volatility in the market means there is bound to be a certain amount of waiting and seeing going on as potential customers seek assurance that their choice will still be around two or three years down the line.
However, for those who see big data as a break with the past rather than same-but-bigger, who wish to deploy it for a specialised task, who are able to draw on appropriate skills, and who are happy with integrating various vendors, there is no doubt that the smaller players are worth more than just a cursory look. It is here that most of the progress is being made. These are the innovators who are making the waves, garnering investment from venture capitalists and IT giants like Intel and IBM alike.
Some respondents believed that these vendors, small and agile and with plenty of resources (both financial and in the form of their fiercely loyal developer communities), offer a degree of flexibility that larger players could not match.
“Although a larger company could offer you a larger suite of services, a smaller company would be more willing to accommodate you with your changing needs as sometimes you won’t know all your requirements and you might discover them as you go. You might not get that from an Accenture or an IBM…” said a CIO in the financial sector.
Top of the list of specialist vendors that respondents would consider were Splunk, the logfile analytics firm; MongoDB, one of the first NoSQL databases to gain a strong following; QlikTech, the BI and data visualisation company; and Cloudera, the Hadoop distributor recently backed by Intel.
Fourteen per cent of the sample had gone as far as setting up trial or production implementations with these vendors, among which Splunk, QlikTech, Hortonworks and MongoDB were the front-runners.
There was little doubt among the respondents that despite the ongoing consolidation of the market, with big fish swallowing minnows, a real change has taken place.
“I think if you had a large operational IT system and if you didn’t use Oracle or one of the big players, you’d sort of get that proposal written off. Now, it’s almost the other way round, as soon as someone sees Oracle, they see the license price and the infrastructure price going up and it’s just like ‘no way’,” explained a CIO in the gaming industry.