
Penn State Researchers Create Auto Picture Tagging

by Karen M. Cheung

November 6, 2006 – Researchers at Penn State University have developed a tagging algorithm that adds descriptive text to photos automatically. Called the Automatic Linguistic Indexing of Pictures – Real Time system, or ALIPR for short, the auto-tagging tool was launched online last week.

ALIPR (pronounced a-lip-er) is based on a statistical algorithm that annotates images automatically and quickly, adding text to online images based on their pictorial content. Taking only half a second to process an image, ALIPR tags it with 15 words drawn from a 322-word vocabulary, while still allowing users to add their own descriptions.
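In broad strokes, that kind of tagging amounts to scoring every word in a fixed vocabulary against an image and keeping the most probable few. The sketch below is a hypothetical illustration of that final selection step only; the vocabulary, probabilities, and function names are placeholders, not ALIPR's actual model.

```python
# Hypothetical sketch: pick the 15 highest-probability words for an image
# out of a fixed 322-word vocabulary. The scoring model itself is omitted.
import heapq

VOCAB_SIZE = 322   # size of ALIPR's reported vocabulary
NUM_TAGS = 15      # number of words ALIPR attaches to each image

def top_tags(word_probs, k=NUM_TAGS):
    """Return the k vocabulary words with the highest model probability."""
    ranked = heapq.nlargest(k, word_probs.items(), key=lambda kv: kv[1])
    return [word for word, _ in ranked]

# Toy example: made-up probabilities for a handful of words.
probs = {"tiger": 0.31, "grass": 0.22, "water": 0.05, "cat": 0.18, "sky": 0.09}
print(top_tags(probs, k=3))  # prints the three highest-scoring words
```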

ALIPR, with further versions planned, holds promise for online users. Because applying metadata tags is a time-consuming process for photographers, ALIPR could potentially eliminate the need to manually add keywords to entire collections. It could also allow users to share their tagged photos in a common database and could even be applied to biomedical image collections.

ALIPR has a decade of research behind it. During their years at Stanford University from 1995 to 1999, ALIPR developers James Wang and Jia Li were classmates of the database developers behind what would eventually become Google and Yahoo!. “We were working on images, rather than text like Google,” said Wang, now an associate professor in Penn State's College of Information Sciences and Technology, who co-authored the paper with Li, now an associate professor of statistics at Penn State.

With millions of users sharing their images online, most photos are not searchable unless the photographer attaches a description. “People don’t want to spend time to annotate their pictures,” said Wang. ALIPR can increase the Internet visibility of images, the paper published last week stated, essentially doing all the heavy lifting.

“The ultimate goal is to annotate all the images on the Internet,” said Wang.

Since its launch just a few days ago, the site has received thousands of hits. “Overnight, we generated thousands of images,” said Wang. The database contains pictures ranging from simple objects, such as a cup, to images of the Dalai Lama.

ALIPR also has an alpha version of a keyword search, similar to the Google Images search engine. Enter the word “tiger” and, sure enough, images of the animal appear. The search also works with adjectives, based on users' descriptions.

Unlike metadata entered by hand, ALIPR's keywords are attached automatically, based on its past success with similar images. The more users who apply text to their images through ALIPR, the more accurate it becomes.

“ALIPR is like a child trying to learn about the world,” states the ALIPR website. Just as parents teach their children what a tiger is, developers can create a mathematical model of what a tiger is to a computer, according to Wang.

“We can train computers about hundreds of semantic concepts using example pictures from each concept,” the paper stated.
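The idea of training on example pictures per concept can be sketched very roughly as follows. This is a minimal, hypothetical nearest-centroid illustration under toy 2-D "features"; ALIPR's actual statistical models of color and texture are far more sophisticated, and every name and number below is invented for the example.

```python
# Hypothetical sketch: build one "profile" (here, a simple mean feature
# vector) per semantic concept from example images, then rank concepts
# for a new image by how close its features fall to each profile.
from math import dist

def train_profiles(examples):
    """examples: {concept: [feature_vector, ...]} -> {concept: mean vector}."""
    profiles = {}
    for concept, vectors in examples.items():
        n = len(vectors)
        profiles[concept] = tuple(sum(v[i] for v in vectors) / n
                                  for i in range(len(vectors[0])))
    return profiles

def annotate(image_features, profiles):
    """Return concepts ranked by closeness of the image to each profile."""
    return sorted(profiles, key=lambda c: dist(image_features, profiles[c]))

# Toy 2-D "features" (e.g., mean hue and texture energy) for two concepts.
examples = {"tiger": [(0.9, 0.8), (0.8, 0.9)],
            "ocean": [(0.1, 0.2), (0.2, 0.1)]}
profiles = train_profiles(examples)
print(annotate((0.85, 0.8), profiles))  # "tiger" ranks first
```

The same training-by-example loop is also why, as Wang notes, user contributions matter: each new labeled picture refines the concept profiles.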

More than half of the time, ALIPR correctly attaches keywords to an image, and 98 percent of the time it attaches at least one word correctly. The accuracy rates were based on tests of more than 50,000 images from a Yahoo-owned photo-sharing site. Testers manually evaluated 5,400 of those images and found only a 2 percent error rate in which ALIPR failed to tag at least one keyword correctly.

ALIPR, though, still faces some initial limitations. Because it evaluates color, texture, and spatial layout, ALIPR can only add keywords to color photographs of real-world scenes. It is less accurate when users feed it cartoons, artwork, or black-and-white images.

As far as possible applications go, ALIPR's makers have been in talks with an online photo site, stock photo agencies, and a museum interested in using the technology to help detect forged paintings.

Wang said that future ALIPR versions are coming, but the timing depends on better technology, funding, and user participation. “This technology is based on community effort,” said Wang, who urges users not to be discouraged by ALIPR’s results on difficult images.

Users can test ALIPR for themselves on the project's website.
