When HP’s big data solution marketing director, Dan Wood, likened data scientists to Yetis at Computing’s Big Data Summit, it wasn’t because of their diet or breeding habits, but their rarity. Such is the competition to find people with this skill set that companies around the world are offering staggering six-figure sums to attract the right candidates.
Recruitment firms, not surprisingly, are rushing to meet this demand. Just last week, a Harvard-backed start-up dubbed Experfy launched as a marketplace for contract data science projects.
Meanwhile, a number of vendors are creating data science courses or “tracks” that could become a source of talent for organisations looking for people who can analyse big data.
But would firms be happy to hire a data science contractor who may not have worked in their industry before – or are the skills of a data scientist easily applied to all sectors?
Forrester Research analyst Mike Gualtieri believes a good data scientist can apply their expertise to any field. He gave the example of the “recommendations” section of Netflix, and said that it was mathematicians who were behind its development – not people with an intimate knowledge of Netflix’s audience.
But the president of Hadoop distributor and developer HortonWorks, Herb Cunitz, said that a data scientist should have a good understanding of the sector they are working in.
He advised enterprises to get those who understand data science tools to team up with colleagues who know the questions that the organisation wants answered – unless they have people on the payroll who can do both.
Indeed, James Robbins, CIO of Northumbrian Water, told delegates at the Big Data Summit that his firm has put together a team comprising people with different skills.
“We’ve tried to blend a team together; it is about looking at the whole team to see if you have the relevant skills,” he said. “You could have one person who specialises in the technical area, another who specialises in the business area, and another who has expertise elsewhere, and they can work together.”
Forrester’s Gualtieri believes that this is a valid approach to ensure that the organisation has all of the skills necessary in general enterprise terms, but suggested that in specialist areas such as drug discovery and genetics, the data science individual or team would need “some very specific domain knowledge”.
Rexer Analytics, a Boston-based analytics firm, found that 70 to 80 per cent of data scientists’ time is spent preparing the data before analysis. The analysis itself is more about the technology involved. And Gualtieri claimed that a newly hired data scientist wouldn’t know where all of the data sources are coming from, who to talk to within the business or where to start. He suggested that the organisation should pair the data scientist with someone who understands the information and management architecture and can help acquire the data, leaving the data scientist to do the so-called specialist work.
But if expertise in any given vertical doesn’t matter that much, then isn’t it only a matter of time before the data scientist role becomes automated?
Gualtieri certainly believes so.
“There are tools like SAS Institute’s visual analytics and statistics products that enable casual business users to do amazing things. Of course, not all of data science can be automated, as many data scientists actually write algorithms, but for a broader technical-based person – it is a possibility,” he said.
But the technical expertise of a data scientist is not the only thing that is sought after. EBay’s head of EU analytics, Davide Cervellin, wants candidates with the right soft skills as well.
“It can’t be something you’re detached from; I hate people who are all about numbers only, and so I look for people with strong soft skills – they need to be able to maintain a good level of conversation with executives, telling them what the constraints are, but also helping them to understand what the end product is going to look like before they start working,” he said.
But while the likes of eBay and Northumbrian Water lamented the lack of talent available, Alex Jaimes, director of research at Yahoo, suggested that organisations are too quick to seek outside help that they may not actually need.
“It depends on what kind of tasks you have: if you break up a particular space and segment it then maybe it is not as [hard to recruit] as you think. Many people say I need to hire a data scientist but they don’t do this first,” he said.