The tricky matter of starting and maintaining a big data programme was one that the panellists debated during Computing's latest IT Leaders Forum, "Understanding how to put big data into practice", held in London last week.
Four key themes emerged from the discussion:
Pace of change
Most people have little idea what big data technologies are capable of now, let alone what they will be able to do 12 months down the road. This means that they don't know the right questions to ask.
"If Henry Ford had given people what they wanted he'd have given them a faster horse," quipped Alwin Magimay, head of digital and analytics at KPMG.
"The possibilities of big data are so big and the pace of change is so fast that if we ask people what question they want answered we don't even touch the surface of what it's capable of."
A change of mindset is therefore required, not only among line-of-business departments, but also within the IT team itself.
"We had BI analysts who took a good six months to provide an answer to a question. They'd spend two weeks making sure the data was absolutely correct before running it through their models," said Mike Bugembe, chief analytics officer at JustGiving.com.
Now, he said, the two weeks previously given over to cleaning the data is the entire timescale from starting a project to seeing the first results.
There was broad agreement on the need to guard against "IT perfectionism" and to embrace a "fail fast and learn fast" mindset, as Magimay put it.
"If we don't do it within 12 weeks we kill it, no matter who screams and who shouts," he said.
This implies an Agile approach to project development.
"Things are moving so fast, so Agile is definitely the best way forward," said Bhasker Allene, big data solution architect EMA at Intel UK. "The one big advantage of big data technologies is that they are all based on scale-out architecture, so you can start with one small use case and scale out as you get more."
This approach also allows organisations to incorporate new technologies as they arise, Allene added.
Related to the rate of change is the shortage of big data skills.
"The challenge of Hadoop is that it is made up of 20 or more different components; some 10 years old, some 10 days old," said Alexander Bartfeld, VP of professional services at Cloudera.
"There is a desperate skills shortage, especially in this country," he continued.
"Some of the components of Hadoop are almost plug-and-play now, but with more recent innovations such as Spark, for example, in the UK there are probably only a couple of dozen people who have an in-depth understanding of it."
A multidisciplinary team
Most of the panel agreed that the data scientist role, with advanced skills in statistics, programming and communications, was a hard one to fill for the majority of organisations. Often, they argued, these skills could be brought to the table by different individuals as part of a multidisciplinary team.
JustGiving's Bugembe said that at his organisation the creation of the role of chief analytics officer reporting to the CEO rather than the CIO was crucial. Bugembe also gave a list of job titles that could make up the ideal big data team, including public-facing qualitative analysts and business analysts as well as more process-oriented roles such as BI specialists, developers, statisticians and machine learning experts.
Andrew Clegg, director of learning analytics and data science at education resources supplier Pearson, agreed. Previously, he said, there had been a culture gap between the IT and product engineering people at Pearson and the product marketing teams developing new products, but this had changed.
"When I moved into product marketing I started being able to weave data-led decision making into the product pipeline, to ask 'how can we use the data to give us new opportunities that weren't there previously?'. No offence to IT people, but they're not good at that kind of thing," Clegg said.
Strong leaders and evangelists
Strong buy-in from the highest level was seen as vital for the ongoing success of a big data programme, especially to maintain the team spirit and common purpose of the multidisciplinary team. Having someone who understands the possibilities of big data and who can enthuse people about them was also essential.
"Nothing we've developed would have succeeded if we hadn't evangelised about it," KPMG's Magimay explained.