ALGORITHMS - MINIMAX

Search algorithms tend to rely on a cause-and-effect model--the search considers each possible action available from the current state, then the actions available from each of the resulting states, and so on, in an attempt to find terminal states that satisfy the goal conditions it was given. Upon finding a goal state, it then follows the sequence of actions it knows will lead to that state.
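
As a rough illustration, the sketch below performs this kind of goal-directed search with a simple depth-first strategy. The state interface (actions, result, and is_goal) is hypothetical and stands in for whatever representation a particular game or puzzle would use.

def depth_first_search(state, actions, result, is_goal, path=()):
    """Return a sequence of actions leading from `state` to a goal state, or None."""
    if is_goal(state):
        return list(path)
    for action in actions(state):
        found = depth_first_search(result(state, action), actions, result,
                                   is_goal, path + (action,))
        if found is not None:
            return found
    return None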

However, in a competitive multi-player game, where other agents are involved and have different goals in mind (goals which, in fact, usually directly oppose each other), things get more complicated. Even if a search algorithm can find a goal state, it usually cannot simply take a set of actions which will reach that state, since for every action our algorithm takes towards its goal, the opposing player can take an action which will alter the current state (presumably in a way unfavorable to our algorithm). This does not mean searches are useless in finding strategies for multi-player games; they simply require additional tactics to be effective.

For two-player games, the minimax algorithm is one such tactic. It uses the fact that the two players are working towards opposite goals to make predictions about which future states will be reached as the game progresses, and then proceeds accordingly to optimize its chance of victory. The theory behind minimax is that the algorithm's opponent will be trying to minimize whatever value the algorithm is trying to maximize (hence, "minimax"). Thus, the computer should make the move which leaves its opponent capable of doing the least damage.

In the ideal case (i.e., the computer has infinite time and infinite storage capacity), the computer will investigate every game outcome possible from the game's current state (as well as the paths taken to reach those states) and assign each outcome a value. For a win-or-lose game like chess or tic-tac-toe, there are only three possible values--win, lose, or draw (often assigned numeric values of 1, -1, and 0 respectively)--but for games like backgammon or poker with a "score" to maximize, the scores themselves are used. Then, starting from the bottom, the computer evaluates which possible outcome is best for the computer's opponent. It then assumes that, if that game stage is reached, its opponent will make the move which leads to the outcome best for the opponent (and worst for the computer). Thus, it can "know" what its opponent will do, and have a concrete idea of what the final game state will be if that second-to-last position is in fact reached. The computer can then treat that position as a terminal node with that value, even though it is not actually a terminal state. This process can be repeated a level higher, and so on. Ultimately, each option that the computer currently has available can be assigned a value, as if it were a terminal state, and the computer simply picks the highest value and takes that action. A relatively simple example of this is shown below:

Fig. 1 (picture from http://perso.wanadoo.fr/alemanni/page5_e.html)

In the above example (Fig. 1), assuming normally alternating turns, if the computer has the choice at level A, its opponent will have the choice at B, and the computer will again have the choice at C. Since the computer is trying to maximize its score, it can know ahead of time what it would choose should any given C node be reached. C1 thus effectively has a score of 5, C2 a score of 11, C3 a score of 8, and so on. When the opponent has a choice to make at B, however, they will choose the path that leads to the C node with the lowest effective score. Thus, the score of each B node can be thought of as the minimum of the effective scores of the C nodes it leads to. For example, B1's score would be 5 (the minimum of 5, 11, and 8, as calculated above). B2 and B3 can be calculated in a similar fashion. Finally, we are back to the current turn. The computer now knows what will come of choosing B1, B2, or B3; and, though the actual endgame is many turns away, it will choose the maximum of those three for the best possible result. Note that, if the opponent does not behave as predicted, the calculation can simply be re-run, taking the current state as the starting node, and a result as good as (or better than) what was predicted will still be achieved.
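
The backward-induction procedure just described is short enough to sketch directly. The tree below is illustrative rather than a transcription of Fig. 1 (only B1's effective C-node scores of 5, 11, and 8 are taken from the example above); leaves hold terminal scores from the computer's point of view.

def minimax(node, maximizing):
    """Exhaustive minimax over a game tree of nested lists with numeric leaves."""
    if not isinstance(node, list):              # a leaf: a terminal score
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# A is the computer's (maximizing) turn, B the opponent's (minimizing),
# and C the computer's again.
tree = [
    [[3, 5], [11, 2], [8, 1]],    # B1: effective C scores 5, 11, 8 -> B1 = 5
    [[6, 4], [7, 9]],             # B2: effective C scores 6, 9     -> B2 = 6
    [[1, 10], [2, 12]],           # B3: effective C scores 10, 12   -> B3 = 10
]
best = max(range(len(tree)), key=lambda i: minimax(tree[i], maximizing=False))
print("best move: B%d" % (best + 1))            # -> B3, the maximum of 5, 6, 10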

A major problem with this approach is how pessimistic it is. In a trivial example like the one above, minimax is useful because it is reasonable to expect that the computer's opponent can figure out what its best options are; in more complex games, however, this will not be so clear, and a computer running the minimax algorithm may sacrifice major winnings because it assumes its opponent will "see" a move or series of moves which could defeat it, even if those moves are actually quite counterintuitive--in short, the computer assumes it is playing an opponent as knowledgeable as itself.

In reality, however, an exhaustive use of the minimax algorithm, as shown above, tends to be hopelessly impractical--and, for many win-or-lose games, uninteresting. A computer can compute all possible outcomes for a relatively simple game like tic-tac-toe (ignoring symmetry and games that end early, there are at most 9! = 362,880 possible move sequences: the X player has a choice of 9 squares, then O has a choice of 8, and so on), but it won't help. As veteran tic-tac-toe players know, there is no opening move which guarantees victory; so, a computer running a minimax algorithm without any sort of enhancements will discover that, if both it and its opponent play optimally, the game will end in a draw no matter where it starts, and thus have no clue as to which opening play is the "best." Even in a more interesting win-or-lose game like chess, if a computer could play out every possible game situation (a hopelessly impossible task), this information alone would most likely lead it to the conclusion that the best it can ever do is draw (chess is widely believed, though not proven, to end in a draw when both players have absolutely perfect knowledge of all possible results of each move). It is only for games where an intelligent player is guaranteed victory by going first or second (such as Nim, in many forms) that minimax alone will prove sufficient.
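
To make the tic-tac-toe point concrete, the sketch below runs exhaustive minimax from each of the nine opening moves and reports a score of 0 (a forced draw with optimal play) for every one of them. The board representation and helper names are just one convenient choice, not taken from the text above.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Score of `board` with `player` to move, from X's point of view."""
    won = winner(board)
    if won == 'X':
        return 1
    if won == 'O':
        return -1
    moves = [i for i in range(9) if board[i] is None]
    if not moves:
        return 0                                # full board, no winner: draw
    scores = [minimax(board[:i] + (player,) + board[i + 1:],
                      'O' if player == 'X' else 'X') for i in moves]
    return max(scores) if player == 'X' else min(scores)

empty = (None,) * 9
for i in range(9):
    opening = empty[:i] + ('X',) + empty[i + 1:]
    print("opening square %d -> %d" % (i, minimax(opening, 'O')))   # 0 every time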

However, minimax is still useful when employed with heuristics which approximate the likely outcome from a certain point without searching all the way to the end of the game. For a game like chess, for instance, pieces are often assigned values to approximate their relative strengths and usefulness for winning the game (queen = 9, rook = 5, etc.). So, while a computer may not be able to envision all possible situations down to checkmate from the first move, it may be able to tell which move is most likely to lead to a material advantage.
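
One common way to apply this idea is depth-limited minimax: search only a fixed number of moves ahead and fall back on a heuristic evaluation (such as the material count described above) at the cutoff. The sketch below assumes hypothetical actions(), result(), is_terminal(), and evaluate() functions supplied by the game in question.

def minimax_value(state, depth, maximizing, actions, result, is_terminal, evaluate):
    """Depth-limited minimax; `evaluate` is the heuristic used at the cutoff."""
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    scores = [minimax_value(result(state, a), depth - 1, not maximizing,
                            actions, result, is_terminal, evaluate)
              for a in actions(state)]
    return max(scores) if maximizing else min(scores)

def best_move(state, depth, actions, result, is_terminal, evaluate):
    """Pick the move whose subtree looks best after a depth-limited search."""
    return max(actions(state),
               key=lambda a: minimax_value(result(state, a), depth - 1, False,
                                           actions, result, is_terminal, evaluate))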

Games with random elements (such as backgammon) also make life difficult for minimax algorithms. In these cases, an "expectimax" approach is taken. Using the probabilities of certain moves being available to each player (the odds of particular dice combinations, for example) in addition to their conflicting goals, a computer can compute an "expected value" for each node it can reach from the current game state. As actual game events take place, these expected values must be frequently recomputed; still, this approach allows intelligent moves to be made.
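
A sketch of this idea appears below. It assumes a hypothetical interface for a dice game: rolls(state) yields (probability, roll) pairs, actions(state, roll) lists the moves that roll permits (including a "pass" when nothing else is legal), result(state, roll, action) applies one, and evaluate() is a heuristic as before; none of these names come from a real library.

def expectimax(state, depth, player, rolls, actions, result, evaluate):
    """Expected score of `state` just before `player` ('max' or 'min') rolls."""
    if depth == 0:
        return evaluate(state)
    expected = 0.0
    for prob, roll in rolls(state):
        # For this particular roll, the player to move picks their best action.
        scores = [expectimax(result(state, roll, a), depth - 1,
                             'min' if player == 'max' else 'max',
                             rolls, actions, result, evaluate)
                  for a in actions(state, roll)]
        best = max(scores) if player == 'max' else min(scores)
        expected += prob * best                 # weight by the odds of the roll
    return expected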