Techniques

Introduction

Remember all those times when you had to write an essay in Spanish or French and you were tempted to just write it in English and paste it into BabelFish? And then, just to see, you tried a sentence and it came out completely garbled? In case you don't remember, here's an example.

English: We like natural language processing because it makes us intellectually excited.
French: Nous aimons le langage naturel traitant parce qu'il nous fait intellectuellement passionnants.
English again: We like the natural language treating parce qu'il makes us intellectually enthralling.
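If you want to reproduce this kind of round-trip experiment programmatically, the sketch below shows its basic shape in Python. The translate() helper is a hypothetical placeholder that just returns the canned outputs from the example above; a real test would replace it with a call to an actual machine-translation service.

def translate(text, source, target):
    """Hypothetical stand-in for a machine-translation call."""
    # Canned outputs copied from the example above, for illustration only.
    canned = {
        ("en", "fr"): "Nous aimons le langage naturel traitant parce qu'il nous fait intellectuellement passionnants.",
        ("fr", "en"): "We like the natural language treating parce qu'il makes us intellectually enthralling.",
    }
    return canned.get((source, target), text)

original = "We like natural language processing because it makes us intellectually excited."
french = translate(original, "en", "fr")
round_trip = translate(french, "fr", "en")
print("Original:  ", original)
print("French:    ", french)
print("Round trip:", round_trip)

Running it prints the three versions side by side, which makes the degradation easy to see; swapping in a real translation service is the only change needed to try other sentences.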

It may not be obvious at first why getting a computer to understand language in a manner even marginally similar to the way a human does should be, at least at present, such a difficult task. In theory, after all, language is just a large body of data that follows a set of rules. The rules are extensive, of course, and the data is abundant, but computers seem tailor-made for performing complex manipulations on masses of data. What, then, is the problem?

This is the first question that Martin Kay, Jean Mark Gawron, and Peter Norvig address in their book Verbmobil, which describes an attempt to build a portable language-translation device. They identify two problems:

1. The meaning that a word, a phrase, or a sentence conveys is determined not just by itself, but by other parts of the text, both preceding and following.
2. The meaning of a text as a whole is not determined by the words, phrases, and sentences that make it up, but by the situation in which it is used (Kay 13).