If you've signed up for a website in the last year or two, you're likely familiar with CAPTCHAs, those distorted images asking you to figure out some gibberish string of numbers and letters. A CAPTCHA is intended to stop abuse of a system by automated software by offering a task that only people can solve. We've often been asked by clients to "put a CAPTCHA" wherever we can anticipate abuse, but I've always pushed back because the effectiveness of CAPTCHAs has degraded over time. Here, we'll take a look at the problems with CAPTCHAs and suggest some alternatives to their use.
CAPTCHAs hurt usability and accessibility
A visual CAPTCHA is not usable by visitors who rely on screen readers or who have a vision impairment such as color blindness. An accompanying audio CAPTCHA is recommended, but now you've doubled the opportunities for nefarious users to attack your web site. Even if you have good vision, you've probably encountered visual CAPTCHAs that are difficult to use, since making them hard to read is the only way to make them effective. By making them hard to read, you've made your web page much harder to use. I've run into CAPTCHAs that take me a number of tries to get right because it's hard to tell the digit 1 from a lowercase L, or a zero from the letter O.
CAPTCHAs have already been broken
CAPTCHAs have already been cracked through various methods. Automated programs exist to break common CAPTCHAs, and you can actually buy such software. Last November, Jeff Atwood asked 'Has CAPTCHA Been "Broken"?' and argued that CAPTCHAs were still effective, since the ones used by Google, Hotmail, and Yahoo were considered unbreakable. For now, let's ignore the fact that you need the resources of Google, Hotmail, or Yahoo to make "unbreakable" CAPTCHAs. Recent reports, such as Software Attacks Software in Security Wars, suggest that even their systems have been broken:
Image CAPTCHAs for Google, Windows Live, and Yahoo! have been broken in recent months, and is believed to account for the increasing levels of spam that are coming from webmail services that those companies provide.
Security Labs even managed to dissect exactly how spammers have automated setting up Microsoft Hotmail accounts, in Microsoft Live Hotmail Under Attack by Streamlined Anti-CAPTCHA and Mass-mailing Operations:
It is observed that unlike Live Mail Anti-CAPTCHA and Gmail Anti-CAPTCHA operations in the past, the current attack is aggressive and instantaneous in terms of CAPTCHA breaking host turn-around time.
Automated solutions are not even required, though. In a relay attack, the CAPTCHA image is passed along to unsuspecting people who solve it for the attacker. Just last year, a striptease program was used to bypass Yahoo's CAPTCHAs this way.
Trend Micro has identified the program as TROJ_CAPTCHAR.A, a striptease game wherein the player enters the letters hiding within a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) image. For each correct entry, more clothes come off in photos of a scantily clad woman identified as "Melissa."
You can't rely on CAPTCHAs
If this technique is relatively useless, and at best just slows down malicious users instead of stopping them altogether, what alternatives do we have? I'll consider two scenarios where CAPTCHAs are commonly used: deterring spam on blogs and message boards, and preventing automated registration for user accounts.
Alternatives for limiting spam messages
If you have a blog or run a message board, spammers are a nuisance that drown out legitimate conversations with noise. One of the best solutions I've used for limiting spam is Akismet, a distributed and collaborative effort to identify spam messages. It's a web service that you must sign up for. Although it is free for personal use, you'll need a subscription for non-personal uses. Basically, when a visitor leaves a message, your CMS or blog first sends the message, along with some information about who posted it, to Akismet, which returns a simple yes or no result about the message's spamminess. At that point, you can either reject the message altogether or hold it for further review and approval. Akismet integrates easily with WordPress, and there are libraries and plug-ins for many other platforms. If one doesn't exist for yours, the Akismet API is open and documented, so you can write your own.
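To make the round trip concrete, here is a minimal Python sketch of a comment check. It is an illustration under assumptions, not a reference client: the endpoint URL and field names reflect my reading of Akismet's comment-check call and should be verified against the current API documentation, and the `is_spam` helper, the `comment` dictionary layout, and the pluggable `transport` argument are all made up for this example.

```python
from typing import Callable, Dict
from urllib import parse, request

# Assumed endpoint for Akismet's comment-check call; confirm it against
# the current API documentation before relying on it.
AKISMET_URL = "https://rest.akismet.com/1.1/comment-check"

# A transport takes a URL and form fields and returns the response body.
Transport = Callable[[str, Dict[str, str]], str]


def http_post(url: str, fields: Dict[str, str]) -> str:
    """Send a form-encoded POST and return the response body as text."""
    data = parse.urlencode(fields).encode("utf-8")
    req = request.Request(url, data=data,
                          headers={"User-Agent": "example-blog/0.1"})
    with request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")


def is_spam(api_key: str, site_url: str, comment: Dict[str, str],
            transport: Transport = http_post) -> bool:
    """Ask the service for a yes/no verdict on one comment.

    `comment` is a hypothetical dict with the keys used below.
    """
    fields = {
        "api_key": api_key,
        "blog": site_url,
        "user_ip": comment["ip"],
        "user_agent": comment["user_agent"],
        "comment_type": "comment",
        "comment_author": comment["author"],
        "comment_author_email": comment["email"],
        "comment_content": comment["body"],
    }
    answer = transport(AKISMET_URL, fields).strip()
    if answer == "true":
        return True
    if answer == "false":
        return False
    # Anything else (bad key, malformed request) is treated as an error
    # so the caller can hold the comment for manual review.
    raise ValueError(f"unexpected response: {answer!r}")
```

A caller would typically publish the comment when `is_spam` returns False, and reject it or queue it for moderation when it returns True or raises. Passing in the transport keeps the function easy to exercise without touching the network.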
If you are using PHP, and don't want to integrate with the Akismet web service, or simply want another line of defense, there is Bad Behavior. It uses a number of tests to try to screen out spam bots from your site before they can do any damage.
Bad Behavior runs before your software on each request to your Web site, so if a spam bot does visit, it will receive nothing, and your software never runs. This reduces the amount of server CPU time, database activity and bandwidth spent on processing robots which are just harvesting your site and delivering junk.
A third method for fighting comment spam is to require unknown users to confirm their message via email. That is, ask for an email address along with a comment (this is fairly standard already), and send unregistered users an email with a link to confirm their message. For regular visitors, you can ask them to create an account or, better still, use OpenID to confirm their identity, and allow them to skip the email confirmation step altogether. As an added precaution, you may want to review postings from new users until they reach a milestone like "5 non-spam messages".
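One way to build the confirmation link is to sign the comment ID so the link can't be forged or reused for a different address. The Python sketch below is only an illustration: the secret, the 24-hour expiry, the token layout, and the function names are assumptions of mine, and only the "5 non-spam messages" milestone comes from the suggestion above.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical values: load the secret from configuration in real use.
SECRET_KEY = b"replace-with-a-long-random-secret"
MAX_AGE_SECONDS = 24 * 60 * 60   # assumed lifetime of a confirmation link
TRUSTED_AFTER = 5                # "5 non-spam messages" milestone


def _sign(comment_id: int, email: str, issued: int) -> str:
    """HMAC-SHA256 over the comment ID, email address, and issue time."""
    payload = f"{comment_id}:{email}:{issued}".encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def make_token(comment_id: int, email: str,
               now: Optional[float] = None) -> str:
    """Build the token to embed in the emailed confirmation link."""
    issued = int(time.time() if now is None else now)
    return f"{comment_id}:{issued}:{_sign(comment_id, email, issued)}"


def verify_token(token: str, email: str,
                 now: Optional[float] = None) -> Optional[int]:
    """Return the confirmed comment ID, or None if the token is bad."""
    try:
        id_part, issued_part, signature = token.split(":")
        comment_id, issued = int(id_part), int(issued_part)
    except ValueError:
        return None  # wrong number of parts or non-numeric fields
    expected = _sign(comment_id, email, issued)
    # Constant-time comparison avoids leaking how much of it matched.
    if not hmac.compare_digest(signature.encode("utf-8"),
                               expected.encode("utf-8")):
        return None
    current = time.time() if now is None else now
    if current - issued > MAX_AGE_SECONDS:
        return None  # link expired
    return comment_id


def needs_review(approved_comment_count: int) -> bool:
    """New posters stay in the moderation queue until the milestone."""
    return approved_comment_count < TRUSTED_AFTER
```

The site would store the comment as pending, email a link containing the token, and publish the comment only when `verify_token` returns its ID. Because the email address is part of the signed payload, a token issued for one address fails verification for another.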
Alternatives for protecting user registrations from bots
Using CAPTCHAs as part of the registration process is meant to separate people from bots. Digg even asks the question "Are you human?" Technological alternatives here are a little less obvious. You could require users to activate their account via email, which at least makes it more time-consuming for potentially malicious users to register. Depending on the sensitivity of the application, you can require even more demanding activation procedures. I know of one credit card company whose system asks for a phone number so that it can call you with an activation code. Another alternative is to require invitations to join a system, coupled with a way to audit invitations in case someone invites a bad apple. An overall approach that should work is to give users gradually escalating privileges as they demonstrate good behavior.
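Escalating privileges can be expressed as a small policy function. The Python sketch below is one hypothetical arrangement: the `Member` fields, the action names, and every threshold are invented for illustration and would need tuning for a real site.

```python
from dataclasses import dataclass
from typing import Set

# Hypothetical thresholds; tune them for the site in question.
UNMODERATED_AFTER = 5    # approved posts before posts skip review
INVITE_AFTER = 50        # approved posts before a member may invite
SUSPEND_AT_REPORTS = 3   # upheld abuse reports that revoke privileges


@dataclass
class Member:
    name: str
    email_confirmed: bool = False
    approved_posts: int = 0
    abuse_reports: int = 0


def allowed_actions(member: Member) -> Set[str]:
    """Return the actions a member has earned so far."""
    actions = {"read"}
    # Reported members and unconfirmed accounts can only read.
    if member.abuse_reports >= SUSPEND_AT_REPORTS:
        return actions
    if not member.email_confirmed:
        return actions
    if member.approved_posts >= UNMODERATED_AFTER:
        actions.add("post_unmoderated")
    else:
        actions.add("post_moderated")
    if member.approved_posts >= INVITE_AFTER:
        actions.add("invite")
    return actions
```

A brand-new account can only read, a confirmed account can post into a moderation queue, and posting rights widen as approved posts accumulate. Enough abuse reports from other users drop the account back to read-only, which ties the reporting mechanism to the privilege ladder.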
I'm not sure that a single technical cure exists for preventing unwanted user registrations. For now, I think sites will need to rely on an approval process of some kind for new registrations and a method for other site users to report people who abuse the system.