Why humans are still an important part of AI at Dow Jones

Human oversight ensures accuracy and trust, says Dow Jones’ head of data strategy Ingrid Verschuren

As a publisher, Dow Jones' USP is its ability to break down reams of complex information into something simple and understandable. As well as skilled journalists, the company's work relies on data strategy professionals who can find and serve important market information to researchers. The job has changed a lot in the last decade, says head of data strategy Ingrid Verschuren, with AI and automation now playing a leading part.

Verschuren joined Dow Jones in 2000, working with Reuters and then Factiva. At first her role, indexing news articles, was very manual, but the job was automated three years after she started.

"It was a very good example of showing how you could use automation to become more efficient at a job; but then secondly, also really to show that automation doesn't necessarily mean there's going to be job losses, right? I'm still here, and a lot of the people that started with me at the time are still here as well."

Ingrid Verschuren, Head of Data Strategy, Dow Jones

Today, Verschuren leads the data strategy team in normalising, transforming and enriching data, as well as creating proprietary data sets for the risk and compliance business. AI remains an important part of her role - indeed, she says, it would be "impossible" for the team to do their job without it.

"Currently how we use AI is still very much using [natural language processing], because we have vast amounts of text - so unstructured content - that we need to process on a daily basis. We use Factiva… It's a news archive that contains 32,000 sources; it processes close to a million news articles a day. It would be impossible for a human to read all those news articles, so the way we use NLP is to extract relevant information from those articles, which serves as a signal to then update or create new profiles on our risk and compliance database accordingly."

An example might be extracting information from an article about a new British ambassador being appointed to Spain. A researcher then confirms the information is correct and performs additional research before adding it to the Dow Jones database, where clients can access the information for their own use.
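The workflow described above, where extraction raises a signal and a researcher confirms it before anything reaches the database, can be illustrated with a minimal sketch. This is not Dow Jones' actual system: the `Signal` and `ProfileDatabase` classes, the `review` function and the sample name are all hypothetical, chosen only to show the idea of a human gate between extraction and storage.

```python
from dataclasses import dataclass, field


@dataclass
class Signal:
    """A candidate fact extracted from an article (hypothetical structure)."""
    person: str
    role: str
    source_article: str
    confirmed: bool = False


@dataclass
class ProfileDatabase:
    """Stand-in for a profile database that only accepts confirmed signals."""
    profiles: dict = field(default_factory=dict)

    def apply(self, signal: Signal) -> bool:
        # Refuse anything a researcher has not signed off on.
        if not signal.confirmed:
            return False
        self.profiles[signal.person] = signal.role
        return True


def review(signal: Signal, researcher_approves: bool) -> Signal:
    """Record the researcher's decision on an extracted signal."""
    signal.confirmed = researcher_approves
    return signal


db = ProfileDatabase()
extracted = Signal("Jane Example", "British ambassador to Spain", "article-001")

# An unreviewed signal is rejected; the same signal is stored once confirmed.
unreviewed_written = db.apply(extracted)
reviewed_written = db.apply(review(extracted, researcher_approves=True))
```

The point of the design is that the extraction step can never write to the database on its own; the only route in is through the review step.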

Verschuren believes human oversight like this is still absolutely necessary when using AI today, especially when the information it extracts is used for business-critical decisions:

"Ultimately, technology is only as good as the data it is using… Your technology can be amazing, but if the data isn't accurate then it doesn't really matter."

Her opinion reflects Dow Jones' own philosophy. The company puts a high value on the work of its data strategy and research teams, who decide whether extracted information is reliable.

Dow Jones performs research in more than 60 languages worldwide and considers cultural context, an area AI struggles with, to be key.

This presents a unique challenge. Where possible, Dow Jones prefers to use the original language in its database entries; but non-Latin scripts like Chinese make recognising and extracting entities, like company names, difficult.

"If I can extract the Chinese name of a Chinese company directly, I know I will have the correct name, whereas if I have to translate that article then the company name gets translated as well. The challenge with that is you have to develop the model for each language.

"You never know what the next language is going to throw at you; languages that seem to be really easy are not, or the other way around. I would say that counts as the biggest challenge."

Technology may present an answer in the future - such as extracting entities from the original source and translating the non-entity information using an English model - but that envisioned hybrid solution is not there yet. Instead, Dow Jones relies on collaboration between subject matter experts and the teams who build AI models.
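One way to picture that hybrid approach is to protect entity names with placeholders, translate the rest, and then restore the original names. The sketch below is illustrative only: the function names, the made-up company name and the tiny glossary standing in for a translation model are all assumptions, not anything Dow Jones has described building.

```python
def mask_entities(text: str, entities: list[str]) -> tuple[str, dict[str, str]]:
    """Swap each known entity for a placeholder so translation cannot alter it."""
    mapping = {}
    for i, entity in enumerate(entities):
        token = f"__ENT{i}__"
        if entity in text:
            text = text.replace(entity, token)
            mapping[token] = entity
    return text, mapping


def toy_translate(text: str, glossary: dict[str, str]) -> str:
    """Stand-in for a real translation model: a simple phrase lookup."""
    for source_phrase, english_phrase in glossary.items():
        text = text.replace(source_phrase, english_phrase)
    return text


def restore_entities(text: str, mapping: dict[str, str]) -> str:
    """Put the original-language entity names back in place."""
    for token, entity in mapping.items():
        text = text.replace(token, entity)
    return text


# Hypothetical article text with an invented company name.
article = "示例科技发布新产品"
entities = ["示例科技"]
glossary = {"发布": " releases ", "新产品": "a new product"}

masked, mapping = mask_entities(article, entities)
translated = toy_translate(masked, glossary)
result = restore_entities(translated, mapping).strip()
```

Here the company name survives in its original script while the surrounding sentence is rendered in English. The hard part in practice is the one Verschuren identifies: finding the entities reliably in each language in the first place.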

Human oversight is still an important part of working with AI, says Verschuren:

"Very often a human will find something that a machine doesn't necessarily find. There are so many examples that we've come across over the years where that has proven to be right.

"If you think about the use case, which is those financial institutions using our data to make sure that they continue to comply with compliance requirements…accuracy is key, it has to be right; so having the human oversight at the end really helps."

Sense-checking AI outputs is not only necessary for accuracy; it also helps to improve trust. When Dow Jones moved its risk and compliance database from a purely manual process to one that uses AI, it made sure to keep humans involved.

"What we saw was if we had taken the human completely out of the loop, that was the point where we would have sacrificed quality. That was not something we wanted to do, so we made the decision to keep the human expertise in the loop. That is really how you ensure that [AI trust] - and then, again, the source selection, the data selection at the start as well…

"I still don't think enough time is being put into the data people use. That is how you can, ultimately, make AI more successful."

Technologies like artificial intelligence are changing every industry, at a larger scale and higher speed than ever before. At the same time, those technologies continue to become more accessible and more user-friendly. Verschuren's closing advice is to ensure you remain aware of what's going on outside your own four walls:

"I think the future is a future where you constantly evaluate what new technology becomes available to you. If you compare what is available now, even compared to five years ago, it is so much better and more accessible. That makes a big difference. Based on that, you really re-evaluate what you're doing and make use of new technologies. You can never stand still; you constantly have to evaluate what a machine has learned to do and think about how you can incorporate that - without giving up the quality of your end data."