Researchers Teach Computers To Recognize Objects In Images

Penn State computer scientists manually tagged some 60,000 photos to teach their algorithm to associate keywords with the pixel patterns that depict objects in images.

By Thomas Claburn, InformationWeek
Oct. 9, 2008
URL: http://www.informationweek.com/story/showArticle.jhtml?articleID=210800744

It's said a picture is worth a thousand words, but future image searches may deliver more relevant results because computers are being taught to recognize objects in photographs.

In contrast to image-retrieval methods that rely on analyzing the text associated with images, Automatic Linguistic Indexing of Pictures in Real Time (ALIPR) turns human-supplied keyword tags into a statistical model that can then recognize objects in images automatically.

The technique was developed by Jia Li, an associate professor of statistics at Penn State, and James Wang, a Penn State associate professor of information sciences and technology. The pair hopes their approach can be used to automatically tag images and improve image search engines.

For their algorithm to work, Li and Wang first had to teach it to associate keywords with the pixel patterns that depict the objects in images. They took 60,000 photos and manually tagged them with keywords that described the objects represented.

Li said ALIPR associates at least one keyword out of a possible seven about 90% of the time, but noted that accuracy can depend on the searcher's expectations. ALIPR, for example, may sort people from animals, as expected, but may not differentiate between adults and children. She doesn't believe the technique will ever be 100% accurate. But in conjunction with other image-recognition techniques, it should improve image search relevancy.
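
One way to read the 90% figure is as an "at least one hit" rate: an image counts as a success if any of the suggested keywords matches a tag a person gave it. The short Python sketch below illustrates that reading only. The function name, the sample data, and the seven-keyword cap are assumptions made for illustration, not a description of how the researchers measured accuracy.

```python
def at_least_one_hit_rate(predicted, human_tags, max_keywords=7):
    """Fraction of images where at least one suggested keyword matches a human tag.

    predicted:  list of keyword lists, one per image (hypothetical system output)
    human_tags: list of keyword collections, one per image (manual tags)
    max_keywords: only the first N suggestions per image are considered
                  (7 is assumed here from the article's "possible seven")
    """
    if len(predicted) != len(human_tags):
        raise ValueError("predicted and human_tags must have the same length")
    if not predicted:
        return 0.0
    hits = 0
    for suggestions, tags in zip(predicted, human_tags):
        # An image is a hit if the two keyword sets share any element.
        if set(suggestions[:max_keywords]) & set(tags):
            hits += 1
    return hits / len(predicted)


# Made-up example data: three images, two of which get a matching keyword.
predicted = [["dog", "grass"], ["car"], ["sky", "tree"]]
human_tags = [{"dog"}, {"boat"}, {"tree", "lake"}]
rate = at_least_one_hit_rate(predicted, human_tags)
```

A measure like this rewards any single match, which is consistent with Li's caveat: a system can score well while still missing distinctions, such as adult versus child, that a particular searcher cares about.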

The researchers want the Internet community to help make ALIPR smarter. The public is invited to visit the ALIPR Web site to upload photos and evaluate how the site's images are categorized and tagged.

The ALIPR technology has been put to use in a test application called Story Picturing Engine, which generates a photo storyboard to illustrate any user-submitted story.

Related image analysis technology is also being tested as a way to deliver image-based CAPTCHAs.

Text-based CAPTCHAs ask the user to type a blurred or distorted set of letters as a way to determine whether the user is a human or a machine. But the machines have gotten smart enough, through advances in text-recognition techniques, to beat many CAPTCHA systems. This allows spammers to bulk register free e-mail accounts for spamming, among other things.

The image-based CAPTCHA system, called Imagination, asks the user to identify the geometric center of an object within an image. Developed by Ritendra Datta under the guidance of Wang and Li, Imagination presents a harder AI problem than text-based CAPTCHAs, at least at present.
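
The article does not say how Imagination scores a response, but the basic idea of checking a click against an object's geometric center can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the representation of an object as a list of pixel coordinates, and the 10-pixel tolerance are illustrative assumptions, not details of the actual system.

```python
import math


def centroid(object_pixels):
    """Return the mean (x, y) of the pixel coordinates that make up an object."""
    if not object_pixels:
        raise ValueError("object_pixels must not be empty")
    n = len(object_pixels)
    cx = sum(x for x, _ in object_pixels) / n
    cy = sum(y for _, y in object_pixels) / n
    return (cx, cy)


def click_is_accepted(click, object_pixels, tolerance_px=10.0):
    """Accept the click if it lands within tolerance_px of the object's centroid.

    tolerance_px is an assumed value chosen for illustration only.
    """
    cx, cy = centroid(object_pixels)
    distance = math.hypot(click[0] - cx, click[1] - cy)
    return distance <= tolerance_px


# Made-up example: the four corners of a small square "object".
square = [(0, 0), (0, 2), (2, 0), (2, 2)]
center = centroid(square)
near = click_is_accepted((1, 1), square)
far = click_is_accepted((20, 20), square)
```

The check itself is trivial for the server, which already knows where the object is. The difficulty for an automated attacker lies in finding the object in the image in the first place, which is the harder AI problem the article refers to.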