Plato (428 - 348 B.C.)
Plato is the central figure in that triumvirate of Greek philosophers we readily associate with the classical era of philosophy: Socrates, Plato, and Aristotle. Plato preached the doctrine of ideal forms, thereby creating a branch of thought called Platonism that has since shaped the worldview of many mathematicians. In this doctrine, the world can be divided into two – an independent world of ideas, and a dependent world of reality, which apes this ideal world. Each entity in the real world is mutable, open to corruption and change; for instance, the cup from which you drink your morning coffee will not last forever, but will eventually crumble and be destroyed. However, such real objects have ideal counterparts in the ideal world; here, the idea of ‘cup’ is immutable and constant (it is helpful here to think of ‘ideal’ as the adjectival form of ‘idea’). One recognizes real instances of cups (and other objects) by reference to their ideal form, which in effect acts as a template for ‘cupness’. Plato thus distinguishes between the idea and its realization, creating a philosophical dichotomy that persists today in the distinctions between type and token, data type and data instance, and class and instance. These ideas underlie our classical understanding of concept abstraction, and in no small way anticipate the ideas that form the basis of object-oriented programming!
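The parallel with object-oriented programming is easy to make concrete. In the following sketch (the class, attributes, and methods are purely illustrative), a Python class plays the part of the ideal form, while its instances are the mutable, corruptible realizations:

```python
# A class as an 'ideal form': a template for what it means to be a cup.
class Cup:
    def __init__(self, material, volume_ml):
        self.material = material      # each real cup differs in its particulars
        self.volume_ml = volume_ml
        self.intact = True            # real instances are mutable and corruptible

    def shatter(self):
        self.intact = False           # the instance changes; the class 'Cup' does not


morning_cup = Cup("ceramic", 300)     # a token of the type
travel_mug = Cup("steel", 450)        # another token of the same type

morning_cup.shatter()                 # only this one realization is destroyed
print(isinstance(morning_cup, Cup))   # True: still recognized by reference to its 'form'
```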

Aristotle (384 - 322 B.C.)
Aristotle, a student of Plato, is perhaps best remembered for his contribution to logic, and for his attempt to formalize rational thought via his system of syllogistic reasoning. A syllogism is a template for deductive inference that comprises two premises and a conclusion, each of which is categorical in the sense that each contains a subject class (e.g., cup) and a predicate class (e.g., fragile) that are related via a statement of identity (e.g., ‘some cups are fragile’). Probably the most hackneyed syllogism of all is the following:
Every Man is Mortal. (First premise)
Socrates is a Man. (Second premise)
Therefore, Socrates is Mortal. (Conclusion)
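The deductive pattern is simple enough to mechanize. The following is a minimal sketch (the vocabulary is invented for illustration, and this is no reconstruction of Aristotle’s actual system): the second premise supplies a set of individuals, the first premise supplies a universal rule over that set, and the conclusion falls out by applying the rule.

```python
# Second premise: Socrates is a man.
men = {"Socrates", "Plato", "Aristotle"}

# First premise: every man is mortal (a universally quantified rule).
def is_mortal(x):
    return x in men

# Conclusion: therefore, Socrates is mortal.
print(is_mortal("Socrates"))   # True
```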
Aristotle’s syllogistic reasoning laid the groundwork for subsequent models of logic (as developed by the Swiss mathematician Leonhard Euler and the English mathematicians Charles Dodgson (Lewis Carroll) and George Boole).

abu-Ja’far Mohammed ibn-Musa al-Khuwarizmi (Ninth Century A.D.)
al-Khuwarizmi was a Persian mathematician whose work (and name) has given Computer Science the term ‘algorithm’. In 830 A.D. he wrote a highly influential book on mathematical methods, which was later translated into Latin as ‘Algoritmi de numero Indorum’ for Western consumption in the twelfth century.

René Descartes (1596-1650)
Descartes was a French rationalist philosopher and mathematician who is regarded by many as the father of modern philosophy. The influential brand of rationalism pursued by Descartes can be summed up as follows: starting from an initial empirical basis of some key observations, one can use this basis as a springboard to jump into, and theorize about, the purely abstract. In this realm of the abstract, the validity of a rationalist idea is not so much to be judged by its agreement with empirical observation as by the idea’s own intrinsic elegance and mathematical clarity. This brand of Cartesian rationalism was most successfully adopted in the twentieth century by the influential linguist Noam Chomsky, whose own work departs so obviously from the empirical that he has taken to labeling himself a “Cartesian linguist”.

Gottfried Wilhelm von Leibniz (1646-1716)
Leibniz was a German rationalist philosopher and mathematician, who viewed the universe as a hierarchical system of indivisible and independent units, or monads, which acted in synchrony according to a pre-established harmony. He is worth mentioning here for his views on the formalization of thought, claiming as he did that any argument could be placed into a mathematical framework (or LOT, Language of Thought) from which its validity could then be determined. By use of this proposed LOT, which interestingly exploited prime numbers in a similar fashion to that used by Gödel almost three hundred years later, Leibniz hoped philosophers could replace the attitude of ‘let us argue’ with the more constructive approach of ‘let us calculate’. Though Leibniz’s ambitions in this area of ‘rationality via computation’ did not extend further than the mechanical calculator he invented in 1673, his work nevertheless provides a touchstone for the computational philosophy of A.I.
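As a rough illustration of the kind of prime-number encoding at issue, here is a small sketch (the vocabulary and the scheme are invented for illustration, and correspond neither to Leibniz’s calculus nor to Gödel’s actual construction): a sequence of symbols is packed into a single number as a product of prime powers, and recovered again by factorization.

```python
# Prime-number encoding in the spirit of Godel numbering: assign each primitive
# symbol a number, encode a sequence as 2^c1 * 3^c2 * 5^c3 * ..., and decode by
# factoring out each prime in turn.

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
symbols = {"let": 1, "us": 2, "calculate": 3}   # illustrative vocabulary

def encode(seq):
    n = 1
    for position, word in enumerate(seq):
        n *= PRIMES[position] ** symbols[word]
    return n

def decode(n):
    lookup = {v: k for k, v in symbols.items()}
    seq = []
    for p in PRIMES:
        count = 0
        while n % p == 0:
            n //= p
            count += 1
        if count == 0:
            break
        seq.append(lookup[count])
    return seq

code = encode(["let", "us", "calculate"])
print(code, decode(code))   # 2250 = 2^1 * 3^2 * 5^3, decoded back to the sequence
```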

Immanuel Kant (1724-1804)
Kant was a German idealist philosopher, who sought to determine the limits and nature of man’s knowledge in his Critique of Pure Reason (published in 1781). Such inquiries into the status of knowledge, and how it is we know what we know, generally fall under the rubric of ‘Epistemology’ in philosophy. Furthermore, Kant is arguably the first philosopher to make explicit the notion of an internal mental representation, which he termed a ‘schema’. In his Critique, he proposed a range of such mental schemas, which he claimed were innate to human thought, providing the conceptual spectacles through which we all view the world. This idea of mental schemata is now very prevalent in modern thinking on the mind, especially so in the field of Cognitive Science.

Ada Lovelace (1815-1852)
Lovelace is remembered for writing the first computer program, though the device for which she programmed, Charles Babbage’s Analytical Engine, was never built in her lifetime, owing to the overly ambitious nature of its engineering requirements. Less auspiciously for A.I., her most famous remark concerns the nature of computational creativity: the Analytical Engine, she claimed, has no pretensions to originate anything; it is the programmer, not the machine, that is the source of any apparent insight. The field of A.I. is predicated on this claim being wrong.

Gottlob Frege (1848-1925)
Frege was a German logician, mathematician and philosopher who laid the foundations of modern logic. His seminal ‘Begriffsschrift’ (or ‘Conceptual Writing’, published in 1879) introduced a notational variant of what we now term First-Order Logic, while his later work introduced the semantic distinction between sense and reference in the construction of meaning. The sense of a natural language sentence is the logical proposition underlying the sentence, whereas the reference of such a sentence is its ‘truth value’ (i.e., does it express a truth or a falsehood?). In contrast, the sense of an entity, such as ‘cup’, is the mental realization one has of the entity, while its reference is its actual realization in the physical world. While you and I may disagree over our sense of the same cup (I say ‘tea cup’ and you say ‘coffee mug’, for instance), we may well be referring to the same cup!

Bertrand Russell (1872-1970) and Alfred North Whitehead (1861-1947)
Russell was an English philosopher and mathematician, who is perhaps best remembered in an A.I. context as the co-author of Principia Mathematica (1910) with fellow English mathematician Alfred North Whitehead. The Principia was an attempt to follow in the research vein identified by Hilbert and place mathematics within a logical framework that would allow mathematical truths to be sought out by a principled proof procedure. In attempting to find a class-based logical definition of ‘number’, Russell and Whitehead also ran up against a paradox in the work of Frege regarding his conception of universal classes. Known as the Russell Paradox, it states that the class (i.e., set) of all classes that are not members of themselves is a member of itself only if it is not a member of itself, and is not a member of itself only if it is. This paradox is perhaps better understood in its guise as the “Barber Paradox”, which asks: if the barber shaves all and only those who do not shave themselves, then who shaves the barber?
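Stated compactly, with R the class of all classes that are not members of themselves, the paradox reads:

```latex
R = \{\, x \mid x \notin x \,\} \;\;\Longrightarrow\;\; R \in R \iff R \notin R
```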

Ludwig Wittgenstein (1889 - 1951)
Wittgenstein was a student of Russell who had cast off the trappings of a wealthy Austrian upbringing to become one of the most enigmatic and colorful philosophers of the twentieth century. His early work was devoted to that age-old philosophical quest that still underlies A.I. – the search for the perfect language (effectively an unambiguous Language of Thought, or LOT). The later Wittgenstein, however, came to argue that, rather than using words according to their position in an underlying logical framework, people give meaning to words according to their context (“words mean what we use them to mean”). He was thus to reject the classical model of categorization that had existed since Plato, and to suggest that categories do not have well-defined logical boundaries. Rather, categories are open-textured, blurring into each other via shared features and common associations. Likewise, membership in a category is judged not according to a shared logical definition, but against a prototype that exemplifies the most salient features of the category (e.g., there is no common definition of ‘game’ that covers all the games one can think of, yet we can recognize new games by comparing them to what we believe the prototypical game(s) to be, such as Chess). In Wittgenstein’s terms, the members of a category are not strongly bound by some inviolable logical criteria, but by an altogether looser set of ‘family resemblances’.
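This prototype view lends itself to a simple computational reading, sketched below with invented features and candidates: category membership is graded by resemblance to a prototype rather than settled by necessary and sufficient conditions.

```python
# Prototype-based categorization: similarity to an exemplar, not a strict definition.
prototype_game = {"players", "rules", "competition", "winner", "board"}

candidates = {
    "chess":     {"players", "rules", "competition", "winner", "board"},
    "solitaire": {"players", "rules", "winner"},
    "ring-a-ring-o'-roses": {"players"},
}

def resemblance(features, prototype):
    """Proportion of the prototype's features shared by the candidate."""
    return len(features & prototype) / len(prototype)

for name, features in candidates.items():
    print(f"{name}: {resemblance(features, prototype_game):.2f}")
# chess scores 1.00, solitaire 0.60, ring-a-ring-o'-roses 0.20 -- a graded
# notion of 'game', with no sharp logical boundary.
```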

Kurt Gödel (1906-1978)
Gödel was the Austrian mathematician (born in Brünn, now Brno in the Czech Republic) who in 1931 finally sent the whole Hilbert-Russell-Whitehead enterprise tumbling. It was in this year that he published his famous ‘Incompleteness Theorem’, an ingenious and wonderfully elegant piece of mathematical reasoning that demonstrated serious limitations in models of formal reasoning that are at least as powerful as arithmetic (i.e., powerful enough to describe the properties of the natural numbers). Gödel proved that such a formal system, if complete, would necessarily be unsound, while if sound it must necessarily be incomplete. An unsound system is one that contains a contradiction (for instance, ‘the moon is round and the moon is not round’), and unfortunately for mathematics, even a single contradiction in an otherwise architectural masterpiece spells structural doom for the entire system. Gödel himself became a troubled figure in later life, experiencing bouts of paranoia that frequently led him to believe his food had been poisoned.

Alan Turing (1912-1954)
Turing is perhaps most famous for his proposed ‘Turing Test’, an unashamedly behaviorist basis for determining the intelligence, or otherwise, of an A.I. program. In this test, a human judge converses with two or more participants in a teletype-driven conversation that precludes the judge from recognizing any of the participants. If an A.I. program can be covertly employed as one of the participants without the judge uncovering the subterfuge, the computer can be said to have passed the test and exhibited an intellect comparable to that of a human. The conversation at the heart of the test is of course meant to be free-ranging, not confined to a predetermined domain of discourse. While this is not the most satisfactory basis for defining machine intelligence (something less behaviorist being preferable), it makes no prior assumptions about the nature of intelligence and is thus arguably the best criterion that we currently possess.

Claude Shannon (1916-2001)
In 1950, Shannon made one of the opening gambits of modern A.I. by elaborating the concept of an intelligent chess-playing computer. Though Alan Turing had already introduced this idea, and had even designed a chess program that he simulated by hand (it still beat him!), it was Shannon who put the enterprise on a solid footing by also introducing the idea of game-trees (for look-ahead) and static evaluation functions (for determining the value of individual board positions).
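The two ideas fit together naturally: a game-tree is explored to some fixed depth (the look-ahead), and a static evaluation function scores the positions at the leaves. The sketch below uses a toy game standing in for chess (a position is just a number, and a move adds 1 or 2 to it), so the move generator and evaluation function are placeholders rather than anything Shannon himself proposed.

```python
# Minimax look-ahead over a game-tree, with a static evaluation at the leaves.

def moves(position):
    # Toy move generator: each player may add 1 or 2 to the position.
    return [position + 1, position + 2]

def evaluate(position):
    # Toy static evaluation: the maximizing player prefers large numbers.
    return position

def minimax(position, depth, maximizing):
    if depth == 0:
        return evaluate(position)          # score a leaf of the game-tree
    children = [minimax(child, depth - 1, not maximizing)
                for child in moves(position)]
    return max(children) if maximizing else min(children)

print(minimax(0, 4, True))   # look ahead four plies from the starting position
```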

Allen Newell (1927-1992) and Herbert Simon (1916-2001)
Newell and Simon are the two names most associated with a major theme of A.I. known as the ‘Physical Symbol System Hypothesis’, or PSSH. This hypothesis, which underpins the entirety of the symbolic A.I. agenda as epitomized by almost all work in Lisp and Prolog, claims that human cognition can be adequately modeled in terms of a system of symbols and strategies for manipulating those symbols. A symbol is merely a token, such as Egg or Blob, and should not be confused with a word, which is an orthographic or phonetic stand-in (often an ambiguous one) for such a symbol. A symbol is an atomic token, then, which possesses no internal structure or intrinsic meaning. Rather, when used as part of a larger interpretative system, a symbol can act as a referent for an idea in the mind or a physical entity in the real world. In this sense, a symbol is like a dollar bill or a pound note; neither of these things has any intrinsic monetary value in itself, but simply serves as a promissory note for another economic quantity. Like symbols, however, pieces of paper currency derive their meaning from the system of exchange in which they are used; just as a piece of currency can eventually be used to buy a physical object like a chair, a symbol can eventually be used as the motivating basis for a physical action, such as sitting in that chair.

John McCarthy
The label ‘Artificial Intelligence’ was coined by the mathematician and computer scientist John McCarthy in 1956, to describe the infant field of research that first established itself at the Dartmouth meeting of the same year. As discussed earlier, McCarthy is also responsible for giving A.I. its lingua franca, the Lisp programming language, in 1958. In addition, in the same year he published a highly influential A.I. paper entitled Programs with Common Sense, in which he described a hypothetical, logic-based Advice-Taker system that extended the ideas proposed by Newell and Simon. Proposals such as this not only demonstrated the potential of the A.I. field, but also laid the groundwork for the commercial successes that were to spring from A.I. in the guise of expert systems in the 1970s and 80s. McCarthy is also known for his contribution to more purist models of logic, having introduced the idea of Circumscription in 1977 to handle exceptions and defaults (i.e., mechanized guesswork) in systems of logical reasoning.

Marvin Minsky
An early collaborator of McCarthy’s, Marvin Minsky was to turn away from the ‘neat’ logicist approach to A.I. and embrace a more psychology-motivated and program-oriented route to machine intelligence. Minsky’s own Ph.D. thesis concerned the hardware implementation of a “Neural Network”, a highly connected system of simple computational units that together mimic the low-level architecture of the human brain. Minsky’s most salient contribution to A.I. is the notion of Frame Representation. A frame is a bundle of related information concerning a single topic, making all information on that topic retrievable from a centralized location. Frames are thus structured objects (much like C++ classes) that can be organized in taxonomic hierarchies, allowing information stored in more abstract frames to be inherited by frames farther down the hierarchy. Additionally, procedural attachments called daemons can be hooked into various parts of a frame; these attachments become active whenever that part of the frame is accessed or changed, allowing the frame system to perform automatic inference about such changes.
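The comparison with C++ classes suggests how a frame system might be approximated in ordinary code. The sketch below (the slot names, taxonomy, and daemon are invented for illustration) shows slot values being inherited down a small hierarchy, and an ‘if-changed’ daemon firing when a slot is updated.

```python
# A minimal frame-like sketch: slots inherited down a taxonomy, plus an
# 'if-changed' daemon attached to one slot.

class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)
        self.daemons = {}              # slot name -> procedure to run on change

    def get(self, slot):
        # Inherit slot values from more abstract frames up the hierarchy.
        if slot in self.slots:
            return self.slots[slot]
        if self.parent is not None:
            return self.parent.get(slot)
        return None

    def set(self, slot, value):
        self.slots[slot] = value
        if slot in self.daemons:       # the daemon fires whenever the slot changes
            self.daemons[slot](self, value)


bird = Frame("bird", locomotion="flies", covering="feathers")
penguin = Frame("penguin", parent=bird)

print(penguin.get("covering"))         # 'feathers', inherited from the bird frame

# An if-changed daemon: updating one slot triggers an automatic inference.
penguin.daemons["habitat"] = lambda f, v: f.slots.update(locomotion="swims")
penguin.set("habitat", "antarctic")
print(penguin.get("locomotion"))       # 'swims', inferred by the daemon
```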

John Searle
Searle is a twentieth-century philosopher who is perhaps most famous for his vehement arguments denying the possibility of real machine intelligence, and for the enigmatic thought experiment, the Chinese Room argument, which he advances in defense of his position. The Chinese Room is an example of a philosophical proof by contradiction. It is the very antithesis of the Turing Test, being an experiment that attempts to demonstrate the hollowness of a behaviorist conception of intelligence by showing that an entity that passed such a test could still be woefully unintelligent with respect to its domain of discourse. It also represents a strong argument against the PSSH of Newell and Simon, purporting to show that ‘mere symbol pushing’ cannot possibly underlie human intelligence.