Bad data made Amazon's AI biased against women

Amazon had to scrap an automated candidate selection tool because it had learned to be sexist

It is no secret that Amazon relies on process automation to dominate the e-commerce market, in areas from warehouses to pricing. Recruitment is one area where that automation hasn't succeeded, though - and it could have ended up hurting the firm.

Amazon began working on automating recruitment in 2014, with the eventual aim of being able to feed in CVs and have the system pick out the top candidates.

However, Reuters says, after about a year the company realised that the tool was not rating candidates in a gender-neutral way, due to the data it had been trained on.

The computer model was trained by observing patterns in the CVs that candidates had submitted to Amazon over a ten-year period - most of which came from men. The system therefore concluded that male candidates were preferable, and began to downgrade CVs containing the word 'women'.
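To illustrate the mechanism, here is a minimal, hypothetical sketch - not Amazon's actual system, and using made-up CVs and outcomes - of how a simple text classifier trained on historically male-dominated hiring decisions can end up assigning a negative weight to the word 'women':

# A toy example (invented data; scikit-learn assumed available) showing how
# a bag-of-words model absorbs historical bias from its training labels.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Made-up historical CVs and hiring outcomes (1 = hired, 0 = rejected).
# In this invented history, CVs mentioning women's activities were rejected.
cvs = [
    "captain of chess club, software engineer",           # hired
    "software engineer, hackathon winner",                 # hired
    "rugby team captain, backend engineer",                # hired
    "captain of women's chess club, software engineer",    # rejected
    "women's coding society lead, backend engineer",       # rejected
    "software engineer, women's hackathon organiser",      # rejected
]
hired = [1, 1, 1, 0, 0, 0]

# Bag-of-words features plus a plain linear classifier.
vectoriser = CountVectorizer()
X = vectoriser.fit_transform(cvs)
model = LogisticRegression().fit(X, hired)

# The learned weight for the token 'women' comes out negative: the model has
# encoded the historical imbalance as a penalty, not any real signal of merit.
weights = dict(zip(vectoriser.get_feature_names_out(), model.coef_[0]))
print(f"weight for 'women': {weights['women']:.3f}")

In a sketch like this, editing out one tell-tale word does not remove the underlying skew in the data - which is why removing explicit terms offered no guarantee.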

Amazon edited the programme to neutralise these terms, but there was no guarantee that the system wouldn't find other ways to discriminate against female candidates.

The project was only ever used in a trial phase, never on its own, and Amazon closed it in early 2017. The firm says this was because the system was never perfected and did not return strong candidates for the roles - not because of the bias issue.

An Amazon spokesperson reiterated to V3 that the system "was never used by Amazon recruiters to evaluate candidates."

The importance of training data

This is not the first time that AI systems have come under fire over bad training data; last year a report found that the COMPAS risk assessment system, used in US courts, was more likely to wrongly flag black defendants as future reoffenders. This kind of bias can become a reinforcing cycle: skewed data marks one group or location as high-risk for crime, leading to a heavier police presence and more arrests, which in turn feed back into the data.

Earlier this year, we covered the story of Admiral Insurance giving higher insurance quotes to people called Mohammed, which solicitor James Kitching highlighted as a problem with bad data.

Olivier Thereaux, head of technology at the Open Data Institute, writes: 'The key to the AI's inner-working resides in the training data: the bias in what is included, as well as what is not, can sometimes be translated into prejudicial systems, as engineers unknowingly encode historic and current data into algorithms that maintain the status quo, reflecting our current economies and societies.'

Amazon has learned from its earlier experiment, and is now using a "much watered-down version" of the engine in its recruitment process. According to one source, a new team is working on the problem again, this time with a focus on diversity.