Pittsburgh-based reCAPTCHA has been acquired by Google for an undisclosed amount. (I’m trying to see if I can dig up the numbers.)
reCAPTCHA, which started as a project of the School of Computer Science at Carnegie Mellon, is a free CAPTCHA service that helps to digitize books, newspapers and old-time radio shows. CAPTCHAs are images with letters that humans can (usually) recognize but computers cannot, so they are used to test whether a real human is filling out a form. Rather than using randomly generated images, however, reCAPTCHA shows words from these books and newspapers that could not be recognized by a computer, effectively building a crowdsourced OCR system. So, when you’re buying tickets from, say, Ticketmaster (one of about 100,000 sites using reCAPTCHA), you’re helping a computer better understand that text.
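To make the crowdsourcing idea concrete, here is a minimal Python sketch of how answers from many people could be combined into one transcription for a word the OCR software failed on. This is an illustration only, not reCAPTCHA's actual code: the function name `resolve_word` and the `min_votes` and `min_share` thresholds are hypothetical, and simple majority voting is an assumption about how such a system might settle on an answer.

```python
from collections import Counter
from typing import Iterable, Optional


def normalize(answer: str) -> str:
    """Lowercase the answer and collapse whitespace so trivial variants match."""
    return " ".join(answer.lower().split())


def resolve_word(
    answers: Iterable[str],
    min_votes: int = 3,      # hypothetical: minimum number of agreeing humans
    min_share: float = 0.6,  # hypothetical: minimum share of all answers
) -> Optional[str]:
    """Return the agreed transcription, or None if there is no consensus yet."""
    # Count each distinct (normalized) answer, ignoring blank submissions.
    votes = Counter(normalize(a) for a in answers if a.strip())
    if not votes:
        return None
    best, count = votes.most_common(1)[0]
    total = sum(votes.values())
    # Accept only when enough people agree, both in number and in proportion.
    if count >= min_votes and count / total >= min_share:
        return best
    return None


if __name__ == "__main__":
    # Four people typed what they saw for the same scanned word.
    print(resolve_word(["upon", "Upon ", "upon", "upan"]))  # upon
```

With only one or two answers the function returns `None`, which in this sketch would mean the word keeps being shown to more people until they agree.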
Google plans to use this technology and approach in other products, such as Google Books and Google News Archive. In Google's words:
"Having the text version of documents is important because plain text can be searched, easily rendered on mobile devices and displayed to visually impaired users. So we'll be applying the technology within Google not only to increase fraud and spam protection for Google products but also to improve our books and newspaper scanning process."
Congratulations to the reCAPTCHA team!