How we future-proof our data science at Channel 4

Senior data scientist Alice Jacques explains how networking is the key to staying ahead of the game

Alice Jacques, senior data scientist at Channel 4, is a physicist by training. She is now responsible for finding new ways to use audience data, such as targeting adverts more effectively in the channel's online video-on-demand service.

Data scientists are natural experimenters, but they need to ensure their experiments stay relevant and up to date, she told the audience at a dataIQ event in London yesterday. One way that Channel 4 does this is to maintain close links with academia as a window to the future.

"Academics are about five years ahead of what we're doing," Jacques said. "They've tried stuff and rejected it before we've even heard of it."

Most businesses would do well to foster ties with universities, she said. Data science is always moving forward and it's important to future-proof investments.

"I always make sure I go to academic conferences, to listen to mathematicians with unfathomable facial hair going on about matrices," Jacques said. "Not only does it scare the bejesus out of you but it reminds you that academia didn't stop when you left university."

Having a few tame postdocs on the end of the phone to run ideas past is very valuable too, Jacques explained.

"I can say 'I'm thinking of doing this' and they'll think about it and say 'well, that doesn't sound impossible'," she said.

That said, academics tend not to be so good at dealing with messy real-world business data. For that you need to network with techies.

Channel 4 has moved over the past five years from a system based on SPSS, through Hadoop and Hive, and now has "Python on Spark and some Scala and a load of new core stuff coming down the line".

"Future-proofing tech is really tough, it's a big investment that you really don't want to get wrong," said Jacques. "Any data science or coding success is intrinsically linked to your tech."

Most data scientists don't need to be on the technological bleeding edge, but they don't want to be inheriting any technical debt either. It is important to keep moving forward with the technology.

"I find out about tech mostly from meetups," Jacques said, mentioning PyData and Data Science London specifically.

"People talk about business problems they've solved, and packages they've used over pizza and beer. It's all very cheerful."

Meetups are useful for finding out about "small incremental advances in technology that you already have", she added, saying that the real value often comes from learning from others' mistakes: "They can save you a fortune in choice of technology."

Internally too, it is important that data scientists mingle with the right people.

"The big data engineers who are always on the internet and talking about this stuff, they really know what's going on and it makes them the perfect in-house experience so long as they don't sit in the basement with the IT crowd."

At Channel 4 an effort has been made to avoid creating such silos, Jacques said. She also mentioned hackathons and time set aside for R&D as other useful ways to future-proof the data science function.

The spirit of open source is abroad in the field of data science. Most of the tools are free collaborative efforts and communication takes place over Slack and GitHub. This has advantages in that you can access "the best minds in the world", Jacques said, but there can be a culture clash when it comes to the business.

"The open source ethos can put your team at odds with more commercially minded managers. You have a data science team who want to keep contributing to the good vibes but also the business that wants to protect the intellectual property."

A balance needs to be struck between patenting novel developments and releasing other code on GitHub to keep the collaboration going.

So do you need to be a physicist or a mathematician to be a data scientist? A CIO recently told Computing that sociologists and economists might fit the bill.

"Most data scientists are maths geeks who want to become computing geeks or coding geeks that want to become maths geeks. But I'm seeing other types of people coming in now, for example psychologists and biologists, particularly with the rise of bioinformatics," Jacques told Computing.

But certainly you need to be numerate, she said, adding that many of her colleagues from academia are now looking at data science as a career path and retraining. It's a course of action she would recommend.

"I bloody love it," she said. "There's no such thing as a job as a physicist or mathematician outside of academia. Data science is one of the few refuges where you can use all your understanding as a physicist and mathematician but in a business environment.

"If you have a scientific mind you need a job where you get stuck every day, and I'm lucky that I do that because I'm interested in people's viewing habits as well as the science."