Google Captchas caught out by security researchers

Security researchers have devised an automated attack that can crack Google's reCaptcha security system - also used by Facebook - more than 70 per cent of the time.

The researchers tested their attack methodology on 2,235 Captchas, cracking 70.1 per cent of them in an average time of just 19.2 seconds. On Facebook's Captchas, the researchers were even more successful, cracking 83.5 per cent of just over 200 Captchas.

Furthermore, the researchers estimate that it might be economically worthwhile for attackers to deploy a system to crack Captchas. An automated system, they suggest, would cost $110 per day per IP address to run and would crack about 63,000 Captchas in 24 hours, and it should not be detected and blocked, either.
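
To put those figures in proportion, the short Python sketch below takes the two numbers quoted above at face value ($110 per day and 63,000 Captchas per day, per IP address) and works out what they imply per Captcha. The function name and the output keys are illustrative and do not come from the paper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_captcha_figures(daily_dollars: float, captchas_per_day: int) -> dict:
    """Break a daily dollar figure and daily volume down to per-Captcha values."""
    return {
        # Dollar figure attributable to a single Captcha.
        "dollars_per_captcha": daily_dollars / captchas_per_day,
        # Same figure per 1,000 Captchas, which is easier to read.
        "dollars_per_thousand": 1000 * daily_dollars / captchas_per_day,
        # Average gap between solved Captchas needed to reach the daily volume.
        "seconds_per_captcha": SECONDS_PER_DAY / captchas_per_day,
    }


# Figures as quoted in the article, per IP address.
figures = per_captcha_figures(110.0, 63_000)
for name, value in figures.items():
    print(f"{name}: {value:.4f}")
```

On these numbers, the daily figure works out to roughly $1.75 per 1,000 Captchas, with one Captcha cleared about every 1.4 seconds from a single address.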

The research was conducted by Suphannee Sivakorn, Iasonas Polakis and Angelos Keromytis in the Department of Computer Science at Columbia University, New York.

In their research paper, I Am Robot: (Deep) Learning to Break Semantic Image Captchas, they explain: "The [Google] reCaptcha widget... performs a series of browser checks for detecting the use of web automation frameworks or discrepancies in the browser's behaviour. The checks range from verifying the format of browser attributes to more complex techniques like canvas fingerprinting.

"Nonetheless, we built a system that leverages a popular web automation framework and still passes the checks. Furthermore, following our blackbox testing, we identified design flaws that allow an adversary to trivially "influence" the risk analysis process," they wrote.

One of the design flaws is a lack of checks on the cookies that ought to be associated with a particular system. "Since tracking cookies have not been previously used by attackers, no safeguards exist for preventing their creation at a large scale; we create more than 63K cookies a day from a single host without triggering any defenses," they wrote. "Using these cookies, our system can maintain a solving rate of 52K-60K checkbox captchas per day, from a single IP address."

The researchers also built their own Captcha-breaking attack, which extracts semantic information from images to solve the challenge and uses Google's own search tools as part of the system.

"With the use of image annotation services and libraries, we are able to identify the content of images and select those depicting similar objects. We also took advantage of Google's reverse image search functionality for enriching our information about the images.

"We further leveraged machine learning for our image selection, and develop a classifier that processes the output of the image annotation systems and searches for subsets of common tags that occur across images with similar content," they explained.
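
The idea of matching images through shared annotation tags can be illustrated with a toy Python sketch. It is not the researchers' classifier: it assumes the tags have already been produced by some annotation service, uses plain set overlap instead of machine learning, and the function name, threshold and sample tags are all hypothetical.

```python
def select_similar(hint_tags, candidate_tags, min_shared=1):
    """Return indices of candidates sharing at least min_shared tags with the hint.

    hint_tags: tags describing the sample image or hint.
    candidate_tags: one list of tags per candidate image.
    """
    hint = {tag.lower() for tag in hint_tags}
    selected = []
    for index, tags in enumerate(candidate_tags):
        # Compare case-insensitively, since annotators differ in casing.
        shared = hint & {tag.lower() for tag in tags}
        if len(shared) >= min_shared:
            selected.append(index)
    return selected


# Hypothetical tags, as an annotation service might return them.
hint = ["wine", "bottle", "drink"]
candidates = [
    ["wine", "glass"],
    ["dog", "pet"],
    ["Bottle", "red"],
    ["car"],
]
matches = select_similar(hint, candidates)
print(matches)
```

A real system would have to cope with synonyms and noisy tags, which is where the classifier described by the researchers comes in.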

The researchers also made a series of recommendations for tightening up Captchas, which are intended to stop attackers from creating email and other accounts en masse and from filling message boards with spam.

These include regulating the number of challenges by tying them to a service account; assigning values to cookies based on "confidence" or "reputation"; regulating the number of cookies that can be created within a particular time period; and running browser checks to detect, for example, a mismatch between the detected browser and what is reported in the User-Agent string.
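
One of those recommendations, capping how many cookies can be created in a given period, could be sketched in Python as a sliding-window limiter keyed on client IP address. This is only an illustration under assumed parameters: the class name, the limits and the choice of IP address as the key are hypothetical, and nothing here reflects how Google actually implements reCaptcha.

```python
from collections import defaultdict, deque


class CookieIssueLimiter:
    """Allow at most max_cookies new cookies per client within window_seconds."""

    def __init__(self, max_cookies: int, window_seconds: float):
        self.max_cookies = max_cookies
        self.window_seconds = window_seconds
        # Timestamps of recently issued cookies, oldest first, per client.
        self._issued = defaultdict(deque)

    def allow(self, client_ip: str, now: float) -> bool:
        stamps = self._issued[client_ip]
        # Forget cookies issued outside the current window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_cookies:
            return False  # Over the cap: refuse to issue another cookie.
        stamps.append(now)
        return True


# Hypothetical policy: two new cookies per minute per address.
limiter = CookieIssueLimiter(max_cookies=2, window_seconds=60.0)
results = [limiter.allow("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 60.0)]
print(results)
```

The third request is refused because two cookies were already issued inside the window; the fourth succeeds once the oldest cookie has aged out. A production system would also need to expire idle clients and would probably key on more than the IP address.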
