Multiple Meanings
Word meaning is complicated. The more you think about it, the more complicated
it becomes. The authors of Verbmobil examine this complication
with regard to the word “open.”
How many different things can we mean with just this one word? Offhand,
the authors of Verbmobil find seven meanings: open door, open golf
tournament, open question, open eyes, open job opportunity, open
morning, and open football player.
Verbmobil continues:
“... note that while we say ‘his eyes are open’ when they
are uncovered and ready to receive stimulus, we do not say ‘his
nose is open’ when he is ready to smell, nor ‘his
feet are open’ when his socks are removed. We speak of ‘empty
glasses’ and ‘free-range chickens’ when ‘open’
might just as well have been used” (Kay 15).
And all of this complication results from just one word! Think about
what happens when an ambiguous word such as “open” is
combined with another word that carries a variety of meanings. Then
think about putting that phrase in a sentence, described by the
context of other ambiguous phrases. Then think about what happens
when things like idioms are thrown in. One infamous example of an
exponentially indecipherable sentence is “Time flies like
an arrow.” Its meaning is fairly obvious to a human, but for
a computer this sentence raises quite a bit of confusion.
Is “time,” in fact, a noun, or are “time flies”
a special kind of insect that prefers arrows?
Many English words have multiple meanings, something that humans generally
cope with via context and knowledge of the world. However, natural
language processing systems must use other methods to interpret
the correct sense, or meaning, of words in order to understand sentences.
Supervised Methods of Disambiguation
Whenever an outside source of knowledge guides learning, such as a dictionary
or human-annotated examples, that learning is known as supervised. A number
of supervised methods for disambiguating words are available, and
they tend to be more accurate than unsupervised methods; to give
a taste of these supervised methods, this introduction will focus
on the Naive Bayes approach and dictionary disambiguation.
The Naive Bayes approach uses surrounding words to disambiguate the
particular sense in which a word is being used. Instead of focusing
on order of words, as n-gram models do, Naive Bayes treats the surrounding
words as unordered and looks at which words commonly occur around
the target word. If such words fall into a few distinct clusters,
it is likely that each cluster corresponds to a particular sense
of the word. This technique is considered supervised because it
requires a labeled training corpus; the material on which it is
trained must classify each word as corresponding to a particular
sense. To understand how this method works, one can examine a word
like “plane.” Assuming there are only two senses of
plane, the training corpus might mark the sense of plane as a transportation
device as the first sense and the sense of plane as a geometrical
idea as the second sense. Then the model would process the training
corpus, keeping track of words that often occur near “plane.”
Next, the results are tabulated as below:
first sense (transportation): flying, airport, flight, time, departure, arrival, security...
second sense (geometry): two-dimensional, coordinate, math, angle, algebra, graphing...
When the Naive Bayes model is tested and encounters the word “plane,”
it then checks if the words around this occurrence of “plane”
most closely match the first sense or the second sense to determine
which sense is more probable.
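To make this concrete, here is a minimal sketch of Naive Bayes disambiguation in Python. The tiny labeled corpus, the sense names, and the test contexts are all invented for illustration, and add-one smoothing stands in for whatever estimation a real system would use.

```python
import math
from collections import Counter, defaultdict

# Invented miniature labeled corpus: each entry is the set of context
# words observed around "plane", tagged with the sense they illustrate.
training_data = [
    (["flight", "airport", "departure", "security"], "transportation"),
    (["flying", "arrival", "airport", "time"], "transportation"),
    (["coordinate", "angle", "graphing", "math"], "geometry"),
    (["two-dimensional", "algebra", "coordinate"], "geometry"),
]

# Tally how often each context word appears with each sense,
# and how often each sense occurs overall.
word_counts = defaultdict(Counter)
sense_counts = Counter()
vocabulary = set()
for context, sense in training_data:
    sense_counts[sense] += 1
    word_counts[sense].update(context)
    vocabulary.update(context)

def disambiguate(context):
    """Pick the sense s maximizing log P(s) + sum of log P(w|s)."""
    best_sense, best_score = None, float("-inf")
    total = sum(sense_counts.values())
    for sense in sense_counts:
        # Prior probability of the sense.
        score = math.log(sense_counts[sense] / total)
        n = sum(word_counts[sense].values())
        for word in context:
            # Add-one smoothing so unseen words don't zero out a sense.
            score += math.log((word_counts[sense][word] + 1)
                              / (n + len(vocabulary)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

print(disambiguate(["airport", "flight", "delayed"]))  # -> transportation
print(disambiguate(["angle", "coordinate", "proof"]))  # -> geometry
```

Because the scores are kept as log probabilities, the products of word probabilities become sums, which avoids numeric underflow when contexts are long.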
Another
type of supervised learning can be done using algorithms that employ
a dictionary for help with disambiguation. These algorithms can
use an unmarked training corpus, unlike the Naive Bayes approach,
and they use the dictionary as the source of all the senses of a
word. Like Naive Bayes, however, these approaches are concerned
with what words are close to the target word. The models then check
if any of the words that are nearby appear in one of the sense definitions
for the target word. Based on these matches between context and
sense definition, the model calculates the probability that the
word is an instance of each of the senses. The sense with the highest
probability becomes the model’s guess for the meaning of the
word.
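A comparably small sketch of the dictionary approach follows, in the spirit of Lesk-style disambiguation. The two “plane” definitions are paraphrased, invented stand-ins for real dictionary entries, and a raw overlap count stands in for the probability calculation described above.

```python
# Stand-in dictionary: each sense of "plane" maps to the content
# words of an invented, paraphrased definition.
sense_definitions = {
    "transportation": {"vehicle", "wings", "flying", "air", "travel"},
    "geometry": {"flat", "surface", "two-dimensional", "points", "space"},
}

def dictionary_disambiguate(context):
    """Choose the sense whose definition shares the most words with
    the words surrounding the target word."""
    best_sense, best_overlap = None, -1
    for sense, definition in sense_definitions.items():
        overlap = len(definition & set(context))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Context words drawn from hypothetical sentences.
print(dictionary_disambiguate(["travel", "flying", "ticket"]))   # -> transportation
print(dictionary_disambiguate(["flat", "points", "intersect"]))  # -> geometry
```

Note that nothing here required a labeled corpus: the definitions themselves supply the sense inventory and the evidence.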
The similarity of this dictionary approach to the Naive Bayes model
prompts the question of why one would use one method over the other.
Each model has trade-offs. Naive Bayes requires a labeled training
corpus, which takes a great deal of human effort to produce and is
subjective, since different annotators may divide a word's senses
differently; dictionary algorithms, by contrast, can draw on the many
digital dictionaries that already exist. However, the Naive Bayes
approach can correlate significantly more words with specific senses
than the dictionary approach can, which increases its accuracy.
One approach to boosting the accuracy of
dictionary algorithms (in order to avoid the need for a labeled
training corpus) is to add a thesaurus to the tools that the algorithm
can use. With this tool, the model can use words close to the target
word that are synonyms of words in the target word’s definition
to provide more data for disambiguation. The strengths of Naive
Bayes and dictionary disambiguation can thus be combined to some
extent.
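A hypothetical sketch of that combination follows; the miniature thesaurus is invented, the definitions repeat the earlier stand-ins, and each definition is expanded with synonyms before the overlap is counted.

```python
# Invented stand-ins for a dictionary and a thesaurus.
sense_definitions = {
    "transportation": {"vehicle", "wings", "flying", "air", "travel"},
    "geometry": {"flat", "surface", "two-dimensional", "points", "space"},
}
thesaurus = {
    "travel": {"trip", "journey"},
    "flying": {"airborne", "aviation"},
    "flat": {"level", "even"},
}

def expand(words):
    """Add thesaurus synonyms of every definition word, so a context
    word like 'journey' can match a definition that only says 'travel'."""
    expanded = set(words)
    for word in words:
        expanded |= thesaurus.get(word, set())
    return expanded

def thesaurus_disambiguate(context):
    best_sense, best_overlap = None, -1
    for sense, definition in sense_definitions.items():
        overlap = len(expand(definition) & set(context))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# "journey" matches no raw definition word, but it does match the
# expanded transportation definition, so that sense is still chosen.
print(thesaurus_disambiguate(["journey", "ticket"]))  # -> transportation
```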