
Penn State Researchers Create Auto Picture Tagging

by Karen M. Cheung

November 6, 2006 – Researchers at Penn State University have developed a tagging algorithm that adds descriptive text to photos automatically. Called the Automatic Linguistic Indexing of Pictures – Real Time system, or ALIPR for short, the auto-tagging tool was launched online last week.

ALIPR (pronounced a-lip-er) is based on a statistical algorithm that annotates images automatically and quickly, adding text to online images based on their pictorial content. Taking only half a second to process an image, ALIPR tags it with 15 words drawn from a 322-word vocabulary, while still allowing users to add their own descriptions.
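In broad strokes, that kind of tagging amounts to scoring every word in a fixed vocabulary against an image and keeping the most probable few. The sketch below is a hypothetical illustration of that final selection step only; the vocabulary, probabilities, and function names are placeholders, not ALIPR's actual model.

```python
# Hypothetical sketch: pick the 15 highest-probability words for an image
# out of a fixed 322-word vocabulary. The scoring model itself is omitted.
import heapq

VOCAB_SIZE = 322   # size of ALIPR's reported vocabulary
NUM_TAGS = 15      # number of words ALIPR attaches to each image

def top_tags(word_probs, k=NUM_TAGS):
    """Return the k vocabulary words with the highest model probability."""
    ranked = heapq.nlargest(k, word_probs.items(), key=lambda kv: kv[1])
    return [word for word, _ in ranked]

# Toy example: made-up probabilities for a handful of words.
probs = {"tiger": 0.31, "grass": 0.22, "water": 0.05, "cat": 0.18, "sky": 0.09}
print(top_tags(probs, k=3))  # prints the three highest-scoring words
```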

ALIPR, with further versions planned, holds promise for online users. Because applying metadata tags is a time-consuming process for photographers, ALIPR could potentially eliminate the need to manually add keywords to entire collections. It could also allow users to share their tagged photos in a common database and could even be applied to biomedical image collections.

ALIPR has a decade of research behind it. During their years at Stanford University from 1995 to 1999, ALIPR developers James Wang and Jia Li were classmates of the database developers behind what would eventually become Google and Yahoo!. “We were working on images, rather than text like Google,” said Wang, now an associate professor in Penn State's College of Information Sciences and Technology, who co-authored the paper with Li, now an associate professor of statistics at Penn State.

With millions of users sharing their images online, most photos are not searchable unless the photographer attaches a description. “People don’t want to spend time to annotate their pictures,” said Wang. ALIPR can increase the Internet visibility of images, the paper published last week stated, essentially doing all the heavy lifting.

“The ultimate goal is to annotate all the images on the Internet,” said Wang.

Since its launch just a few days ago, the site has received thousands of hits. “Overnight, we generated thousands of images,” said Wang. The database contains pictures ranging from simple objects, such as a cup, to images of the Dalai Lama.

ALIPR also has an alpha version of a keyword search, similar to the Google Images search engine. Enter the word “tiger” and, sure enough, images of the animal appear. The search also works with adjectives, based on users' descriptions.

Unlike metadata entered by hand, ALIPR's keywords are attached automatically, based on its past success with similar images. The more users who apply text to their images through ALIPR, the more accurate it becomes.

“ALIPR is like a child trying to learn about the world,” states the ALIPR website. Just as parents teach their children what a tiger is, developers can create a mathematical model of what a tiger is to a computer, according to Wang.

“We can train computers about hundreds of semantic concepts using example pictures from each concept,” the paper stated.
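The idea of training on example pictures per concept can be sketched very roughly as follows. This is a minimal, hypothetical nearest-centroid illustration under toy 2-D "features"; ALIPR's actual statistical models of color and texture are far more sophisticated, and every name and number below is invented for the example.

```python
# Hypothetical sketch: build one "profile" (here, a simple mean feature
# vector) per semantic concept from example images, then rank concepts
# for a new image by how close its features fall to each profile.
from math import dist

def train_profiles(examples):
    """examples: {concept: [feature_vector, ...]} -> {concept: mean vector}."""
    profiles = {}
    for concept, vectors in examples.items():
        n = len(vectors)
        profiles[concept] = tuple(sum(v[i] for v in vectors) / n
                                  for i in range(len(vectors[0])))
    return profiles

def annotate(image_features, profiles):
    """Return concepts ranked by closeness of the image to each profile."""
    return sorted(profiles, key=lambda c: dist(image_features, profiles[c]))

# Toy 2-D "features" (e.g., mean hue and texture energy) for two concepts.
examples = {"tiger": [(0.9, 0.8), (0.8, 0.9)],
            "ocean": [(0.1, 0.2), (0.2, 0.1)]}
profiles = train_profiles(examples)
print(annotate((0.85, 0.8), profiles))  # "tiger" ranks first
```

The same training-by-example loop is also why, as Wang notes, user contributions matter: each new labeled picture refines the concept profiles.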

More than half of the time, ALIPR correctly attaches keywords to an image, and 98 percent of the time it attaches at least one word correctly. The accuracy rates were based on tests of more than 50,000 images from a Yahoo-owned photo-sharing site. Testers manually evaluated 5,400 of those images and found only a 2 percent error rate in which ALIPR failed to tag at least one keyword correctly.

ALIPR, though, still faces some initial limitations. Because it evaluates color, texture, and spatial layout, ALIPR can only add keywords to color photographs of real-world scenes. It is less accurate when users feed it cartoons, artwork, or black-and-white images.

As far as possible applications go, ALIPR's makers have been in talks with an online photo site, stock photo agencies, and a museum interested in using the technology to help detect forged paintings.

Wang said that future ALIPR versions are coming, but the timing depends on better technology, funding, and user participation. “This technology is based on community effort,” said Wang, who urges users not to be discouraged by ALIPR’s results on difficult images.

Users can test ALIPR for themselves on the project's website.
