In a demo of QBIC that searches a database of the U.S.
stamp collection, both the strengths and weaknesses of
the system can be seen. For instance, searches under keywords
such as "animal" or "love," words for which specific groups
of stamps exist, return on-target results.
Search "animal":
Search "love":
However, searches under keywords that do not divide the
stamps so neatly produce rather undesirable results.
Search "rose":
These results appear because the color of each returned
stamp is indexed as "rose pink." Here QBIC shows the
fundamental flaw in trying to interpret visual data through
words alone: the description of an object's appearance and
the description of its subject may share the same words
while being entirely unrelated, and that confusion makes
searching for images that much more challenging.
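
This failure mode is easy to reproduce. Below is a minimal
sketch, with an invented stamp catalog and hypothetical field
names, of how a purely text-based index collides on "rose":
the query term matches a color description as readily as a
subject.

    # Hypothetical stamp metadata; field names are invented for illustration.
    stamps = [
        {"subject": "wildlife: grizzly bear", "color": "brown"},
        {"subject": "LOVE stamp",             "color": "red"},
        {"subject": "flag over porch",        "color": "rose pink"},  # not a rose!
        {"subject": "garden rose",            "color": "rose pink"},
    ]

    def keyword_search(query, records):
        """Return every record whose concatenated metadata contains the query."""
        query = query.lower()
        return [r for r in records
                if query in " ".join(r.values()).lower()]

    # "rose" matches the *color* field of the flag stamp just as readily as
    # the *subject* field of the rose stamp, so both come back as hits.
    for hit in keyword_search("rose", stamps):
        print(hit["subject"])

The index has no way to know that "rose" in a color field and
"rose" in a subject field mean different things.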
Virage's Image Search Engine Library: Image (System)-Building
Virage is another pioneer of the image search engine,
though its methods are quite different from QBIC's. Virage's
categorization and analysis depend on software-based analysis
of the images themselves rather than on human-built indexes.
Virage categorizes images by major descriptors such as
texture, color type and distribution, shape, and structure.
Unfortunately, this purely visual analysis limits the search
because it disregards actual content.
Analysis of this kind, though rigorous, fails to interpret
the subject of an image. For example, a user looking for a
"similar image" to a map of city streets would be just as
likely to retrieve the layout of a computer chip as another
street map. This and other undesired results stem from the
fact that many images share core visual properties while
depicting completely different objects. Together, Virage and
QBIC show how important content information and visual
information both are to accurately searching the visual web.
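
To see why visual features alone are not enough, consider the
sketch below. Virage's actual algorithms are not public, so a
simple color-histogram comparison stands in here, and the two
"images" are hypothetical stand-ins drawn from the same
black-on-white palette as a street map and a chip layout.

    import numpy as np

    def color_histogram(image, bins=8):
        """Flattened, normalized per-channel histogram of an RGB image."""
        hist = [np.histogram(image[..., c], bins=bins, range=(0, 255))[0]
                for c in range(3)]
        hist = np.concatenate(hist).astype(float)
        return hist / hist.sum()

    def similarity(a, b):
        """Histogram intersection: 1.0 means identical color distributions."""
        return np.minimum(color_histogram(a), color_histogram(b)).sum()

    rng = np.random.default_rng(0)
    # Stand-ins for two unrelated subjects that share a palette:
    # roughly 20% black lines on an 80% white background.
    street_map  = rng.choice([0, 255], size=(64, 64, 3), p=[0.2, 0.8])
    chip_layout = rng.choice([0, 255], size=(64, 64, 3), p=[0.2, 0.8])

    print(similarity(street_map, chip_layout))  # near 1.0: a false match

Under this measure the street map and the chip layout score
as near-duplicates, which is exactly the confusion described
above.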
Excalibur's Visual Retrievalware: Neural-Logical Image Management
The next level of searching methods involves processes
that more closely mirror the brain's interpretation of
images. Excalibur's content-based retrieval products use
neither traditional text retrieval methods nor rigid visual
criteria in their searches. Their neural
network technology uses a method they refer to as Adaptive
Pattern Recognition Processing (APRP). These systems "develop
an increasingly rich essential notion of objects by analyzing
many instances of these objects at different angles or
renditions." This quote seems to suggest that there is
a deeper analysis of the object than its mere physical
qualities. APRP attempts to analyze patterns in both text
and images, thereby bypassing many of the minor details,
such as misspellings, that throw off traditional text-based
searches. APRP is modeled on the way biological systems
use neural networks to process information. This design
requires a proper sample set to calibrate the neural network
at the outset, so that the network learns what kinds of
patterns to look for. Its major drawback is therefore that
its success depends on that initial sample set: a poorly
chosen one would greatly reduce its success in future searches.
Though this problem cannot be immediately solved, the
software's attempt to integrate knowledge from both surrounding
text and images speaks to the larger nature of the visual web.
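
Excalibur has not published APRP's internals, but the
dependence on the initial sample set is a general property of
trained classifiers. The sketch below uses a single-layer
perceptron on invented two-dimensional "image features" to
illustrate the calibration step described above; it is not
APRP itself.

    import numpy as np

    def train(samples, labels, epochs=50, lr=0.1):
        """Fit perceptron weights to a labeled sample set of feature vectors."""
        rng = np.random.default_rng(0)
        w = rng.normal(size=samples.shape[1] + 1)      # +1 for the bias term
        X = np.hstack([samples, np.ones((len(samples), 1))])
        for _ in range(epochs):
            for x, y in zip(X, labels):
                pred = 1 if x @ w > 0 else 0
                w += lr * (y - pred) * x               # nudge weights toward the label
        return w

    def classify(w, sample):
        x = np.append(sample, 1.0)
        return 1 if x @ w > 0 else 0

    # Hypothetical sample set: label 1 when both features are large.
    good_set = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]])
    labels   = np.array([0, 0, 1, 1])
    w = train(good_set, labels)
    print(classify(w, np.array([0.85, 0.9])))          # expected: 1

Every future classification depends entirely on the weights
learned here; feed the trainer a skewed or mislabeled sample
set and all later searches inherit the damage.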
NEC's Advanced Multimedia-Oriented Retrieval Engine (Amore)
In a similar vein, NEC's attempt at an image retrieval
system also borrows from human methods of comprehension.
NEC's image retrieval system, though a lesser-known image-based
search engine, has received high marks from the Getty
Information Institute in the institute's research to "improve
users' attempt to find cultural material on the Web."
Marty Harris of the institute's Technology, Research and
Development Group remarked, "What set Amore apart was
the algorithms: when you used it you got results back
that felt good, not unsettling as with some other tools."
Amore uses content-oriented image retrieval (COIR), which
essentially, according to Yoshi Hara, Amore's product
manager, "recognizes objects within an image and compares
those objects for similarity in the same way that humans
perceive and translate images into meaningful data."
It supports mostly color and shape feature extraction;
focusing on these few descriptors could perhaps allow it
greater depth of analysis and comparison. Though limited,
Amore seems to follow today's trend of interpreting and
categorizing images with more advanced methods of analysis
and pattern-building in order to better harness the visual web.
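
NEC has not published COIR's implementation, but its
object-level idea can be sketched: segment an image into
regions, describe each region by a few features, and compare
images region by region rather than pixel by pixel. Everything
below, from the descriptors (mean color plus a crude shape
measure) to the matching rule, is an assumption for
illustration only.

    import numpy as np

    def region_features(region_pixels, width, height):
        """Descriptor for one segmented region: mean RGB + aspect ratio."""
        mean_color = region_pixels.reshape(-1, 3).mean(axis=0) / 255.0
        aspect = width / height                        # crude shape stand-in
        return np.append(mean_color, aspect)

    def image_similarity(regions_a, regions_b):
        """Match each region in A to its closest region in B and average."""
        scores = []
        for fa in regions_a:
            dists = [np.linalg.norm(fa - fb) for fb in regions_b]
            scores.append(1.0 / (1.0 + min(dists)))    # closer regions score higher
        return float(np.mean(scores))

    # Hypothetical pre-segmented regions: (pixels, width, height) per object.
    red_square = (np.full((10, 10, 3), [200, 30, 30]), 10, 10)
    blue_strip = (np.full((4, 20, 3),  [30, 30, 200]), 20, 4)

    img1 = [region_features(p, w, h) for p, w, h in (red_square, blue_strip)]
    img2 = [region_features(p, w, h) for p, w, h in (red_square,)]

    print(image_similarity(img1, img2))  # partial match: one shared object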
Conclusion
Overall, the visual web has created many challenges
for search engines. The technology is by no means refined:
searching for "apple" on Google's image search is just as
likely to turn up pictures of New York City (the Big Apple)
or Apple computers as images of the fruit itself. This could
be expected, however, for even text retrieval has its fair
share of random results. As the web becomes more visual, it
is increasingly urgent to devise ways of searching these
images and video clips accurately, so that users can find
what they are looking for.