Don't let your data scientists get bogged down in 'tedious work' - automate!

Craig Parfitt, VP of global insight services, Calligo, argues that data scientists should be deriving insights from data, not building models

Organisations often allow their data scientists to get involved in unnecessarily laborious tasks which can and should be automated.

That's the opinion of Craig Parfitt, VP of global insight services at Calligo, who was speaking at Computing's AI & Machine Learning Live event recently.

"Data scientists are usually domain experts, they really understand the data," Parfitt began. "But they get involved in some tedious things which can now be automated," he added.

That's a particular problem given the salaries experienced data scientists can command, with figures above £150,000 per year often cited.

"If you've designed how you'll use an algorithm, that can go into a process where those models get built automatically. And the structure of the model changes based upon the data you use."

This capability means that data scientists shouldn't be spending their time building algorithms or models, but instead driving insights and new learnings, according to Parfitt.

"That allows you to free up the data scientist to do what they're good at - understanding new areas for the business. We can automate the tedious stuff so they don't get sucked into just playing with the data."

He explained that this strategy involves using 'design thinking', a creative problem-solving approach.

"We use design thinking to do this," said Parfitt. "We get different parts of the business together, and they're tasked to come up with an approach that really suits the organisation.

"That's very different from the more commons scenario where someone from the business has an idea, works on it for a while then has to go and get IT involved towards the end," he stated.

At the same event, Dr Patricia Charlton, Senior Lecturer at the Open University, stated that the skills gap is the biggest barrier to AI adoption, but argued that it doesn't have to be.

Parfitt also discussed the explosion in data volumes, giving the example of a person he had worked with recently who was concerned about ever-larger data feeds arriving via 5G.

"How can you ingest all the data you need, and how can you understand what you have already? These problems are set to get harder. Someone from Vodafone said to me 'I'm not worried about big data any more. When 5G switched on I'm worried about vast data'."

The same person told Parfitt that he no longer felt able to store all of this data, as it is so vast it has become too expensive to retain.

"So you don't store it, you strip mine it and take out what's relevant as we now know what we need - we've had 15 years of working with these data sources," Parfitt explained.