CAPTCHAs – More Than Irritating Curvy, Wavy Text

Submitted by admin on March 31, 2011 – 2:41 am

You encounter them on lots of web sites – even on Google.com: wavy letters with lines around and through them, some punctuation, dots, and mixed-up lowercase and uppercase letters. You look at the box, maybe turn your monitor sideways, or stand up and spin around a few times, and then try to decipher the image you see. There’s a little box for you to type in – and if you get it right, you pass into the realm of humanity. If you get it wrong, some sites call you names like ‘bot’, ‘spambot’, or just a plain ‘spammer’ and deny access until you become more human. I’m describing the ubiquitous CAPTCHA – a test that assures the web site’s owner that you are as human as it gets.

While it seems like a reasonable ‘test’ for humanness – computers supposedly have a hard time deciphering CAPTCHAs – they can be a headache. I routinely get them wrong and have called my computer a few choice names as a result. I have always wondered what is behind a CAPTCHA: are the words that the ‘test’ shows you random words? Are computers really not able to solve CAPTCHAs? And what’s the problem anyway – why prevent a computer, or bot, from using a site? My research turned up some surprising results – the most shocking: you’re doing someone else’s work when you solve a CAPTCHA!

The CAPTCHA test is a means of ensuring that a visitor to a site is a real person. For example, a site that spends money to provide its users with real-time stock quotes wants to ensure that its expensive quotes are not being copied by a computer for use on some other site. Another example is ticket sales: a site that sells tickets for events like concerts wants to ensure that no one is using an automated process to buy all available tickets and then sell them to others at a higher price. In both cases, CAPTCHA tests help ensure that web site visitors are real people and not automated processes.

The word CAPTCHA is an acronym for the phrase Completely Automated Public Turing test to tell Computers and Humans Apart. The test is named after Alan Turing, who is among the most influential people in the field of computer science. Among his many accomplishments, Turing proposed the Turing Test in 1950 to test a machine’s ability to demonstrate intelligence. In a Turing Test, a human is the judge and a machine is the subject. The test a CAPTCHA administers is a reverse Turing Test: a computer is the judge, and it tests whether the subject is a human.

It might come as a surprise that the words you see are from old books, magazines, newspapers, and other printed materials that are being digitized to transform them into accurate and searchable text files. Computers handle text files flawlessly, making them perfect for helping to preserve the wealth of information humanity discovered, recorded, and developed long before computers and the internet.

The process starts once printed material gets scanned. Scanned pages become image files that contain all of the marks, crooked text, smudges, and other imperfections that were on the page when it was scanned. The next step is to convert the text in the image into machine-readable text using a technique called Optical Character Recognition, or OCR.

OCR is a mature technology, yet it is only between 70% and 90% accurate. To compensate, the system uses two different OCR applications to scan the same page. Both OCR applications make mistakes when they scan a page, yet they each make different mistakes. The system checks the deciphered words against dictionaries and flags words that the two OCR applications deciphered differently. Another application attempts to decipher the flagged words by examining the words before and after them to make an educated guess. Each flagged word becomes part of a CAPTCHA.
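The flagging step can be sketched in a few lines of Python. This is only an illustration, not the actual system: the function name, the sample words, and the exact flagging rule (flag a word when the two readings differ or when either reading is missing from the dictionary) are assumptions, and the two OCR outputs are assumed to be aligned word for word.

```python
def flag_suspect_words(ocr_a, ocr_b, dictionary):
    """Return (position, reading_a, reading_b) for words that need a human.

    Assumption: ocr_a and ocr_b are word lists of equal length, aligned
    position by position. A word is flagged when the two readings differ
    or when either reading is not a dictionary word.
    """
    flagged = []
    for position, (word_a, word_b) in enumerate(zip(ocr_a, ocr_b)):
        disagree = word_a != word_b
        unknown = (word_a.lower() not in dictionary
                   or word_b.lower() not in dictionary)
        if disagree or unknown:
            flagged.append((position, word_a, word_b))
    return flagged


# Hypothetical output of two OCR engines for the same scanned line.
engine_a = ["The", "qnick", "brown", "fox"]
engine_b = ["The", "quick", "brown", "f0x"]
dictionary = {"the", "quick", "brown", "fox"}

suspects = flag_suspect_words(engine_a, engine_b, dictionary)
print(suspects)  # [(1, 'qnick', 'quick'), (3, 'fox', 'f0x')]
```

Each engine misreads a different word, so comparing the two outputs catches both mistakes even though neither engine is reliable on its own.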

The CAPTCHA that you see when you attempt to sign up for a new email address, or buy a ticket for a basketball game, is not the originally scanned image. A CAPTCHA usually contains two words. One of the words in the test is already known and the other is not. The known word serves as the control: if you decipher the control word correctly, chances are very high that you are indeed an intelligent human. The solution you provide for the other word is compared with the results of the OCR scans and the educated guess that the computer system made earlier. Once the system is satisfied that you correctly solved the test, you get to continue to buy your ticket or get that new email address, and the word you solved becomes part of the vast archive of knowledge that is being digitally preserved.
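Here is a rough Python sketch of the two-word check. The function names, the three-vote threshold, and the choice to count each machine guess as one vote are all assumptions made for illustration; the article's sources do not spell out how the real system weighs the answers.

```python
from collections import Counter


def check_captcha(answer_control, answer_unknown, control_word, votes):
    """Pass the user only if the control word is right.

    When the user passes, their reading of the unknown word is
    recorded as one vote in `votes` (a Counter).
    """
    if answer_control.strip().lower() != control_word.lower():
        return False
    votes[answer_unknown.strip().lower()] += 1
    return True


def resolve_unknown(votes, machine_guesses, min_votes=3):
    """Return the agreed reading, or None if agreement is too weak.

    Assumption: every machine guess (OCR output or context guess)
    counts as one vote, the same as a human answer.
    """
    tally = Counter(votes)
    for guess in machine_guesses:
        tally[guess.lower()] += 1
    if not tally:
        return None
    word, count = tally.most_common(1)[0]
    return word if count >= min_votes else None


votes = Counter()
first = check_captcha("morning", "upon", "morning", votes)   # control correct
second = check_captcha("mornin", "upon", "morning", votes)   # control wrong
third = check_captcha("Morning", "upon", "morning", votes)   # control correct

agreed = resolve_unknown(votes, ["upon", "npon"])
print(first, second, third, agreed)  # True False True upon
```

The answer from the user who failed the control word is thrown away, so only people who have shown they can read contribute to the transcription.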

In some cases, people consistently provide the wrong solution for the unknown word because the original text is badly damaged in some way. Words like this are deemed undecipherable by the system and passed on for further analysis by experts in the field. Fortunately, only a very small percentage of words are deemed undecipherable.

The accuracy of the whole system is over 99 percent and it is estimated that people around the world could decode at least 200 million CAPTCHAs each day. Since there’s a lot of printed material out there, these tests will be around for a long time.

While the whole process works well, there are legitimate reasons for circumventing the test for intelligence. A researcher, Jonathan Wilkins, developed a process that’s capable of deciphering the text at a success rate of about 17.5%. While 17.5% may sound low, it is higher than the previous zero percent. The researcher developed the technique in late 2010 and many services have sprung up based on his findings. People who need to solve CAPTCHAs automatically can pay for a service that solves them at a rate approaching 99% accuracy. The services use a combination of computers and people to provide solutions.

Microsoft takes a different approach. Microsoft created a system called ASIRRA – an acronym for Animal Species Image Recognition for Restricting Access. The system presents users with 12 images of cats and dogs and asks them to select only the images of one species. The images come from an archive of over three million images of cats and dogs from Petfinder.com – a service that helps homeless pets find new owners. In theory, images of cats and dogs are more difficult for a computer to classify than distorted text, making it much harder to circumvent the system. Of course, where there is a cat, there is a mouse and the hunt to get better is always on!
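To get a feel for why 12 images make blind guessing impractical, here is a small back-of-the-envelope calculation in Python. It assumes that a bot must classify all 12 images correctly to pass and that each image is an independent cat-or-dog decision; the function name and the 80% figure are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def pass_probability(num_images, per_image_accuracy=Fraction(1, 2)):
    """Chance of getting every image right in one challenge.

    Assumption: each image is judged independently and all of them
    must be classified correctly to pass.
    """
    return per_image_accuracy ** num_images


blind_guess = pass_probability(12)
print(blind_guess)  # 1/4096

# A hypothetical image classifier that is right 80% of the time per image
# still fails most challenges, because the per-image errors compound.
decent_bot = pass_probability(12, Fraction(4, 5))
print(float(decent_bot))
```

A bot flipping a coin for each picture passes roughly once in four thousand tries, and even a fairly good classifier fails far more often than it succeeds.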

About The Author:

Erik Westermann is an article marketing expert who boosts web sites’ search engine rankings with original, well-researched articles, press releases and blog posts.