
June 2001
From Penn State

New digital image retrieval approach reduces search time, improves accuracy

University Park, Pa. -- A new approach to automatically sorting, classifying and
retrieving digital images -- based on the way people look at and
understand pictures -- promises not only faster, more accurate image
database searches but also better Web searches, says a Penn State
researcher. Dr. James Z. Wang, Penn State assistant professor of
information sciences and technology who holds the PNC Technologies
Career Development Professorship, developed the approach when he was a
graduate student at Stanford University. He says the new approach has
potential for application in biomedicine, crime prevention, the
military, commerce, education, entertainment and Web image
classification.

The new approach does not consider any
information other than the image itself. Just as a person, shown a
picture of a horse, can extract the features characteristic of horses
and then identify other pictures that contain horses, so does the new
computer-based approach. The new system retrieves relevant images from
an image database or the Web on the basis of automatically derived
image features or content.

Image retrieval techniques currently
in commercial use mostly rely on keywords or descriptions. While this
text-based approach can be accurate and efficient for limited databases
of high value, such as museum pictures, it can become
prohibitively expensive to enter descriptions manually for large-scale
image databases such as astronomical observations. The new approach not
only reduces the need for textual information but also can handle,
quickly and efficiently, the approximately one billion images that can
be found on the Internet.

Wang and colleagues have built an
experimental image retrieval system, called SIMPLIcity, to validate and
demonstrate their methods and have tested it on a database of about
200,000 general-purpose images and an archive of more than 70,000
pathology images. The SIMPLIcity approach performs better and faster
than existing methods and can also be applied to the classification of
on-line images and Web sites. (To view a demonstration of SIMPLIcity, go
to http://wang.ist.psu.edu/IMAGE)

Using the same approach, they have also developed an image filtering
system, called WIPE, that parents can use to protect their children
from pornography on the Web. WIPE identifies and blocks objectionable
images. WIPE takes only one second per picture, versus minutes for
other filters, and has an accuracy of over 90 percent. (A
demonstration of WIPE is also at http://wang.ist.psu.edu/IMAGE)

Wang has detailed his content-based image retrieval (CBIR) approach in
a new book, Integrated Region-Based Image Retrieval, published this
month (June) by Kluwer Academic Publishers. The book details the design
and implementation of the new content-based retrieval system and its
application to a general picture library and a biomedical image
database.

Wang notes that the capability of existing CBIR systems
is essentially limited by the fact that they rely on only primitive
features of the image. In his new approach, Wang matches the image
features selected to classify the image to the type of picture. For
example, a color layout indexing method may be best for outdoor
pictures while a region-based indexing approach may be better for
indoor pictures. Images in the biomedical database can be categorized as
X-ray, MRI, pathology, graphs, micro-arrays and other categories specific
to the types of images in the collection.

For general-purpose
image libraries and the Web, Wang has classified images into textured
vs. non-textured and graph vs. photograph. His approach represents the
first time that categories, such as textured vs. non-textured, have
been used as a distinguishing feature in image retrieval. Besides
using new image features as classification tools, SIMPLIcity
uses a similarity measure based on information about the entire image
rather than representative segments.

In traditional approaches,
computer programs may segment one image of a dog, for example, into two
regions: the dog and the background. The same program may segment
another image of a dog into six regions: the dog's body, the dog's
front legs, the dog's rear legs, the dog's eyes, the background and the
sky. The inconsistent segmentation makes it harder to make a match. In
SIMPLIcity, an overall "soft similarity" approach reduces the influence
of inaccurate segmentation. The most similar region pairs are matched
first and then the matching process is "softened" by allowing one
region of an image to be matched to several regions of another image.
In this way all of the regions of the images are taken into
consideration.

"SIMPLIcity is robust to intensity variation,
sharpness variations, color distortions, other distortions, cropping,
scaling, shifting and rotation," says Wang. "The system is also easier
to use than other region-based retrieval systems."

The
work was supported primarily by a research grant from the National
Science Foundation's Digital Libraries Initiative and a research fund
from the Stanford University Libraries. Additional support came from
IBM Almaden Research Center, NEC Research lab, SRI International,
Stanford Computer Science Department, Stanford Mathematics Department,
Stanford Biomedical Informatics, The Pennsylvania State University and
PNC Foundation. The work is ongoing at Penn State.

EDITORS: Dr. Wang is at 814-865-7889 or jwang@ist.psu.edu