Shannon's work on AI
According to Shannon, games are quite significant because they test computers in something that is not a numerical problem. Programming computers to perform non-numerical tasks widens our understanding of the capabilities of computers. This wider use of computers suggests useful changes in their design. Also, research of game playing will lead to insights in the way the brain operates.
The chess playing machine
In 1949 Shannon built the first machine that played chess at MIT. In 1950 he wrote two papers explaining his reasoning. The question of a chess-playing machine was of interest to Shannon because of its theoretical value. Building a chess-playing machine was something new in that the entities dealt with were not numbers and that the procedure involved a general principle of trial and error and not a strict unchangeable computer process. A satisfactory solution to the problem would help in dealing with other, more significant, problems of AI, such as translations, logical deductions etc. Shannon started with chess because:
Shannon set out to prove that a computer could indeed play a good game of chess by the use of a suitable program.
- the problem had clear rules and a clear goal
- it was neither too simple nor too hard
- it required "thinking" to be played thus it could be used to test the ability of the machine to think
- its discrete structure suited the digital nature of computers
Building the machine
The first thing needed is a suitable code for the chessboard and pieces. Each square and each piece are uniquely numbered. White pieces have positive numbers and black pieces negative numbers. Each move is also numbered accordingly. For instance 01, 22 means that a piece moves from square 1 to 22.
Shannon then used an evaluation function f(p) which can be applied to a position P and whose value gives the category of the position, i.e. if it's a win, a lost or a draw. Using this function theoretically it is possible to build a machine that plays perfectly. For each position the machine would consider all possible moves to the end of the game. However, a move for white and then for black gives about 103 possibilities. If a game lasts for 40 moves, the machine would have to consider about 10120 variations and that would take years.
Shannon concluded that the aim was not to make a machine that played perfect chess, but a machine that played skillfully, like a human player.
Shannon used several principles of chess to design a good evaluation function. An example of such a formula:
K,Q,R are the number of White kings, queens, rooks etc.
K', Q', R' are the numbers of Black kings, queens, rooks etc.
The coefficients are estimated values for the worth of each piece.
Strategy type A using f(p)
Let M1, M2 etc be the moves that can be made from position p and let M1P, M2P etc be the resulting positions when M1, M2 etc are applied to p. From all the moves, Mn, the machine should choose the one that maximizes f(MnP). The same logic is applied when considering the opponent's moves. In this case, the machine must choose the move such that f is a maximum after the opponent has chosen his best reply. Following the figure (fig. 1): the point at the left is the starting position.
Fig. 1 - How the computer chooses its next best move|
It is assumed that there are three possible moves for the White and if any of these is made there will be three possible moves for Black. The possible positions after both a White and a Black move are the nine points on the right and the numbers are the evaluations of these positions. The White should choose the move that will leave it at the best position possible after Black makes its best move. In the figure the +0.1 is Whiteıs best choice.
Using this strategy, the machine can consider all variations up to a definite number of moves using the formula.
This strategy according to Shannon would be weak and slow. It would take a long time to evaluate each position (109 evaluations after three moves). It would be weak because it can only see a few moves deep. In fig. 1 it computes all variations to exactly three moves.
A human player examines only a few selected variations and carries these to a reasonable level. Thus, the machine must select the best variations, examine them as far as possible and evaluate only at reasonable positions where some stability has been established. This is strategy B.
For strategy B we need a function g(p) that given a position p it can determine whether approximate stability exists. A definition:
g(p) = 1 if a piece is attacked by a piece of lower value, or by more pieces or if a check exists.
g(p) = 0 otherwise
Variations are explored until g(p)=0.
We also need a function h(p,m) to decide whether a move m in position p is worth exploring. The function should be given large values for forceful moves (checks, captures, and attacks) and developing moves, medium values for defensive moves and low values for other moves. The function will choose the best variation. As it gets deeper into the variation, the requirements will increase, thus, less and less variations will be selected. Whenever, two or more moves are of the same value the machine will choose among them randomly.
The strategy is divided in nine subprograms and a masterprogram. Their functions are:
S0 - makes a move in position P
S1 - makes a list of the allowed moves of a pawn
S2 - S6 - same for knights, bishops, rooks, queen and king
S7 - makes list of all possible moves in a given position
S8 - calculates f(p) for a given position p
S9 - the masterprogram determines the proper move
The operation runs as follows: S9 calls S7. S7 makes a list of all possible moves by calling on previous subprograms to determine where the various pieces can move. S9 evaluates the resulting positions by calling S8. It then compares the results and picks the best move, which is typed out.
In playing chess the machine acts as though it were thinking since skillful chess play requires reason.
The game of Hex
Shannon along with E.F. Moore designed a machine that played a game of Hex. The game is played on a board laid out in a hexagonal pattern. Two players alternate in playing men on the hexagons. One player uses yellow men and the other blue. The yellow player tries to form a connected path of yellow men from the top of the board to the bottom. The blue player tries to form a connected path between the two sides of the board. Shannon and Moore represented the board by a resistance network using positive voltage to signify yellow men and negative for blue. Certain saddle points in thew field represented good moves. To test this theory an analog device was made consisting of a resistance network and equipment to locate the saddle points. Indeed, the machine that resulted played a pretty good game. When it had first move it would win 70% of the time. With second move its score was around 50%.
A popular definition of learning is "Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population" (Herbert Simon 1983). The key feature of this definition is skill refinement, which is achieved through the acquisition and application of knowledge. All intelligent organisms must be able to learn in order to adapt and survive. Consequently the ability to learn is a key indicator of intelligence and so a basic requirement of artificially intelligent systems.
The maze-solving machine
At the 8th Cybernetics Conference in 1952, Claude Shannon presented his maze-solving machine. This was a machine that could solve a maze by trial and error means. It could then remember the solution and also forget it in case the situation changed and the solution was no longer applicable.
The machine consisted of a maze at its top panel with a range of 5X5 squares. This maze could be changed by rearranging the partitions between the 25 squares. In the maze there was a sensing finger (mouse) that could feel the partitions of the maze as it came against them. This finger was moved by two motors, an east-west motor and a north-south motor (fig. 2).
Fig. 2 - How the maze-solving machine works|
The problem the machine had to solve was to move the finger through the maze to a specified goal. The goal would be mounted on a pin, which could be slipped into a jack in any of the 25 squares. The goal could be changed anytime.
The machine at work
Once the machine is turned on, the finger moves around exploring the maze. When it reaches the center of a square it has to decide on the next direction to follow. If it hits a partition, the motors reverse, taking the finger back to the center of the square where a new direction is chosen. The choices are based on previous knowledge. When the finger eventually reaches its goal, the motors stop. If at this point the finger is moved back to its original position, it remembers the solution it found so when it is turned on it goes directly to the goal without exploring. If the finger is moved to a part of the maze that it didnıt explore, it will move around until it reaches a known region. From there it will go directly to its goal.
How does it work?
There are two modes of operation in the machine: the exploration strategy and the goal strategy.
Exploration strategy: this is used when the finger is first trying to find the goal. Each square in the maze has a memory associated with it that consists of two relays. These are capable of remembering one of four possible directions: east, west, north or south. The direction that is remembered for each square is the direction that the finger followed the last time after visiting the square. This is the only piece of data the machine remembers about the finger's path and it is the only thing it needs to reproduce the path later on. In fact, each square is seen as a vector. For each of the 25 squares, the memory of the machine retains a vector field defined over the maze. As the finger moves through the maze, it continually revises the remembered vector field so that the vectors point along possible paths of the maze leading to the point currently occupied by the finger.
For instance, the finger leaves a square D in the eastern direction. This eastern direction is placed in the memory. When it returns back to square D it will rotate 90o counterclockwise to pick a new direction. In this case, it will try the northern direction next and this new direction will now enter the memory. If it hits a barrier and returns, it will again rotate to try the westerly direction etc. The machine can also remember the direction by which it came into the square at the current visit so as it rotates it skips that direction and avoids repetitions.
Goal strategy: When the finger reaches the goal, a relay locks the machine causing it to act according to the goal strategy. In the goal strategy, the finger takes as its first choice direction D, which is the direction by which it left the square on its last visit. This cancels out all blind alleys and circular paths. Since a blind alley must be left by way of the same square through which it was entered, the direction D retained for that square will necessarily lead to the goal directly rather than through a detour. If the machine follows a circular re-entrant path in exploring its way to the goal, the direction retained for the last fork in this path must be that going to the goal rather than somewhere else. Thus, the machine follows a direct path to the goal after it has first found it.
The feature of forgetting is achieved as follows: if after finding the goal the finger is moved to an unknown part of the maze, or if the maze is changed, the machine starts counting the number of moves it makes. If it does not reach the goal within 24 moves, it decides that the maze has been changed or that it is in a circular loop and that the previous solution is no longer relevant. The circuit then reverts to the exploration strategy. It explores around by trial and error until it reaches known territory or until it reaches the goal. If there is no goal the machine will keep exploring all squares again and again.
Shannon's machine can solve any maze whether it be simply or multiply connected. An interesting point about the machine is that even though the finger can go from the origin to the goal, it cannot remember how to get from the goal back to the origin. The vector field goes only in the direction of the vectors so that the finger does not know where it came from. It only focuses on the goal and thatıs how it can find it from anywhere. Itıs like a man who knows the town so he can go from any place to any other place but doesnıt always remember how he went.
Alternatively the finger could be a permanently magnetized mouse driven by an electromagnet found beneath the maze. The motion of the electromagnet is controlled by a relay circuit containing about 110 relays, organized into a memory and a computing circuit. Shannon devised such a machine (fig. 3) and he named the mouse Theseus, after the Greek hero that managed to find his way through the labyrinth of the king of Crete and kill the Minotaur.
Fig. 3 - Shannon with Theseus|
The mind-reading machine
In 1953 Shannon worked on the development of a mind-reading machine. His colleague D. W. Hagelbarger built a machine that played the game of matching pennies. On the front panel of the machine are a start button and two lights marked + and -. To play against the machine the player would have to guess out loud either "right" or "left". The center button of the machine would then be pressed and the machine would light up either the right or left light. If the machine matches the player, the machine wins otherwise the player wins. The player should then move the key switch in the direction corresponding to the choice he made. The machine would register a win for the appropriate party by shooting a ball into the proper glass tube. The overall score is shown on the two counters visible through the front panel.
The strategy of operation
The machine looks for certain types of patterns in the behavior of the human player. When it finds these patterns, it remembers them and assumes that the player will follow the same patterns the next time the same situation arises. The machine also contains a random element. Until patterns have been found, or if an assumed pattern is not repeated at least twice by the player, the machine makes a choice at random.
The types of patterns remembered involve the outcome of two successive plays and whether the player changed his choice between them and after them. There are 8 possible situations and two things the player can do at each one:
- player wins, plays same, wins. Then same or differently.
- player wins, plays same, loses. Then same or differently.
- player wins, plays differently, wins. Then same or differently.
- player wins, plays differently, loses.Then same or differently.
- player loses, plays same, wins.Then same or differently.
- player loses, plays same, loses.Then same or differently.
- player loses, plays differently, wins.Then same or differently.
- player loses, plays differently, loses.Then same or differently.
Each of these possibilities corresponds to a different cell in the memory of the machine. Within each cell two things are registered: 1. Whether the last time this situation arose the player played the same or differently, and 2. whether or not the behavior indicated in 1 was a repeat of the same behavior in the previous such situation.
Example: Consider the situation win, same, lose (2). If the last time this situation arose the player played "differently", then "differently" is recorded in the first part of the memory cell. If the preceding time this situation arose the player also chose "differently", the second part of the memory records this as a repetition. The machine will assume that when this situation arises again the player will choose "differently", following the same pattern. If there is no repetition of a choice the machine plays randomly. The memory cells are always kept up to date so that their content may change as different patterns are repeated.
The player can actually defeat the machine if he keeps track of the content of all the memory cells of the machine. He should repeat a pattern twice and then when the machine is ready to follow this pattern, he should alter it. This is very difficult to do, however. A lot of calculations and a lot of time are needed.
After a lot of discussion between Shannon and Hagelbarger as to which machine was the best, the machines were finally played against each other. A small third machine acted as the interface between the two machines. All three machines were plugged together and allowed to run for a few hours while everybody in the room placed bets and cheered. Shannon's machine won in a ratio of 55:45.
Fig. 4 - The mind-reading machine|