May 31st, 2008
Microsoft’s CAPTCHA successfully broken
Jeff Yan and Ahmad Salah El Ahmad of the School of Computing Science at Newcastle University, England, recently published a research paper entitled “A Low-cost Attack on a Microsoft CAPTCHA”, demonstrating how they attacked the Microsoft CAPTCHA used on several of the company’s online services, such as Hotmail and Windows Live, with a segmentation success rate of over 90%. Here’s a summary of the research:
In this paper, we analyse the security of a text-based CAPTCHA designed by Microsoft and deployed for years at many of their online services including Hotmail, MSN and Windows Live. This scheme was designed to be segmentation-resistant, and it has been well studied and tuned by its designers over the years. However, our simple attack has achieved a segmentation success rate of higher than 90% against this scheme. It took ~80 ms for our attack to completely segment a challenge on a desktop computer with a 1.86 GHz Intel Core 2 CPU and 2 GB RAM. As a result, we estimate that this Microsoft scheme can be broken with an overall (segmentation and then recognition) success rate of more than 60%. On the contrary, its design goal was that “automatic scripts should not be more successful than 1 in 10,000” attempts (i.e. a success rate of 0.01%). For the first time, we show that a CAPTCHA that is carefully designed to be segmentation-resistant is vulnerable to novel but simple attacks. Our results show that it is not a trivial task to design a CAPTCHA scheme that is both usable and robust.
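To put the abstract’s numbers side by side, here is a minimal Python sketch of the arithmetic. The design goal (1 in 10,000) and the 60% lower bound come from the abstract itself; the `overall_success` helper, the 92% segmentation rate, the 95% per-character recognition rate and the 8-character challenge length are hypothetical values chosen purely for illustration, not the paper’s confirmed method.

```python
# Figures quoted in the abstract above.
design_goal = 1 / 10_000   # "1 in 10,000" attempts, i.e. 0.01%
estimated_success = 0.60   # lower bound on the overall success rate

# How far the estimated attack exceeds the design goal.
ratio = estimated_success / design_goal
print(f"Design goal: {design_goal:.2%}")
print(f"Estimated attack success is roughly {ratio:,.0f}x the design goal")


def overall_success(segmentation_rate, per_char_rate, num_chars):
    """Hypothetical model: a challenge is solved only if it is segmented
    correctly AND every one of its characters is then recognised."""
    return segmentation_rate * per_char_rate ** num_chars


# Illustrative assumptions only: 92% segmentation, 95% per-character
# recognition, 8 characters per challenge.
example = overall_success(0.92, 0.95, 8)
print(f"Illustrative overall success rate: {example:.1%}")
```

The point of the sketch is that segmentation and recognition compound: even a high per-character recognition rate decays quickly over several characters, yet the result still sits orders of magnitude above a 0.01% target.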
Realizing the potential for massive abuse by spammers, the researchers notified Microsoft in September 2007 and waited for a response before publishing the paper last month. Although they’ve scientifically demonstrated their success, the CAPTCHAs used on some of the most popular Internet services are known to have been broken in the past, with CAPTCHA recognition available on request, tailored to the customer’s specific CAPTCHA. The following is a brief retrospective of some of the do-it-yourself CAPTCHA breaking services, incidents and tools that I’ve been tracking for a while:
- in July, 2006, Fortinet came across eBay bots that were automatically talking to and recommending each other, raising suspicion that eBay’s CAPTCHA had been broken, given the automated registration and posting process
- in March, 2007, Vladuz’s Ebay CAPTCHA Populator became freely available as a browser add-on that successfully broke eBay’s CAPTCHA. Vladuz has since been arrested
- in September, 2007, I came across a service that was automatically breaking the CAPTCHA of several email providers, and when it wasn’t able to recognize the CAPTCHA it would leave the field blank to be filled by a human, but autogenerate the account names to speed up the process
- in October, 2007, another such DIY service was located, whose only differentiator from the previous one was its on-demand nature: the CAPTCHA of the targeted service is first submitted to the operators, who analyze it and then figure out how to break it
- in November, 2007, another CAPTCHA breaking service became publicly available, this time already able to successfully recognize the CAPTCHA images at some of the most popular Chinese Internet services
- in February, 2008, Websense published a detailed analysis showing Google CAPTCHA breaking in progress
- again in February, Websense released further research, this time demonstrating CAPTCHA breaking against Windows Live Mail
- in March, 2008, Wintercore Labs demonstrated how Google’s audio CAPTCHA can also be recognized
- according to the MessageLabs Intelligence reports for March, 2008, MessageLabs detected an increase in spam coming from legitimate email services such as Gmail and Yahoo, with Yahoo Mail the most abused Web mail service, responsible for 88.7 percent of all Web mail-based spam; the reason was the successful recognition of the CAPTCHAs at these services
All of these developments clearly indicate both the demand for and the supply of CAPTCHA breaking services, as well as the potential for abusing the clean domain reputation of the most popular email providers, whose continuous emphasis on usability, namely coming up with more user-friendly CAPTCHAs, often makes the process easier to automate. No CAPTCHA is perfect, and every CAPTCHA is subject to a wide range of attacks. What can, on the other hand, thwart someone’s ambitions for automatic recognition is figuring out how to break out of the current CAPTCHA model. If CAPTCHA recognition is to be undermined on a large scale, novel and adaptive approaches should be considered, such as the following replacements for text-based CAPTCHAs:
- IMAGINATION: Image-based Authentication
- Animated GIF CAPTCHA
- KittenAuth
- ESP-PIX - The CAPTCHA Project
- Asirra
Watch out for more upcoming research from the same researchers, this time demonstrating low-cost automated attacks on Yahoo CAPTCHAs. And don’t forget that, just like humans committing click fraud alongside botnets, human CAPTCHA breakers can recognize every CAPTCHA. What matters is that attackers remain unable to automate the process, which pretty much represents the current situation.