Last time, I wrote about how I decided to revisit Tic Tac Toe. Like most programmers, I too wrote a program some years back that could play the game without losing. But that version was based simply on a set of rules. This time, I moved further and used implementations of Minimax and Negamax algorithms in addition to rule based play.
I had to do some reading on these algorithms and started with wikipedia. While the entry for minimax provided accurate pseudo code, the one for negamax was plain wrong. I have accumulated quite a few links on these algorithms and their variations (mostly minimax and alpha-beta pruning). However, from among them, An Introduction to Game Tree Algorithms by Hamed Ahmadi Nejad is the best one. It explains minimax, negamax and alpha-beta pruning in a clear and concise manner. Minimax Explained is another good source of information on these algorithms. But it does not cover negamax. In any case, negamax is nothing but minimax in disguise (in fact, it is a more elegant version of the same algorithm).
I am releasing the source code and binaries (the program has been written in C# 2.0 and requires the .net 2.0 framework to be installed) for my version of Tic Tac Toe under GNU GPL v3. Download it from box.net.
The source contains two projects – Tic Tac Toe and Tic Tac Toe Stats. While the first project is the game, the second one constructs a game tree and generates stats like how many legal games are there, who wins how many times, how many truly unique games are there if you exclude board rotations, and symmetrical positions etc etc.
The program uses one of the following strategies when the computer is playing –
- rule based play.
- a modified version of minimax that takes advantage of the static nature of the game.
- standard minimax.
- ply-restricted minimax search.
- standard negamax.
The best thing is that these strategies can be changed from a menu during game play to see how they perform comparatively. I have not implemented alpha-beta pruning based strategies of minimax and negamax because the optimisation is not as simple as it appears. For ABP to work, the depth first search should have discovered a leaf node with a value which makes searching the current node irrelevant. If that does not happen, in the worst case, time wise, ABP will give results equivalent to minimax or negamax.
Ply-restricted search and the evaluation function
If you have downloaded and tried the application, you would have noticed a ‘Minimax Play With Four Ply Look Ahead’ menu item in the Strategy menu. Tic Tac Toe is not a complex game and hence the entire game tree for the 255,168 legal games can be generated within a couple of seconds. But other games like Chess, Draughts and Reversi/ Othello are not that simple and generating the complete game tree is out of the question. So, on each move, a tree is generated with the current board state as the root and a depth cut-off of say 6 or 12 plies being provided, a ply being a move made by one player.
The four-ply strategy demonstrates the effect of a depth cut-off on tic tac toe if we limit the computer’s knowledge to only four moves from the current board state. The following four board positions reflect the progress of the game after moves 3, 4, 5 and 6 with the human player moving first (x) and the computer second (o).
The reader may ask why the computer did not prevent x from completing column three and instead capture the centre square? The real question that needs to be asked is why did the computer capture the square on row one-column two on its second move? For, that is where it lost the plot.
The computer is destined to lose the game because the evaluation function that it depends on to make the right moves which works wonderfully well when standard minimax or negamax is used, fails horribly when it comes to the ply restricted version of the algorithm. The function scans the board position at the leaf node and returns +1 if x is winning, -1 if o is winning and 0 if there is a draw or tie. While this behaviour is fine when it scans the ‘real’ leaf nodes in the full game tree, it hardly helps when it is doing it on a game tree only four plies deep. Since it may not find any one winning at that level, the function always returns 0, assuming that the game is ending in a draw.
And so, the computer looks at the values for each of the next moves, finds that all moves lead to a draw, and so picks the first move that it finds. It is only after x’s third move does it have access to the real terminal positions of the game and knows that it has erred badly. From here on, whatever it does, as long as x plays a flawless game, x is bound to win. So, again, since it hardly matters what move it makes, it picks the first move it finds, which is the one that captures the centre square. This shows why an evaluation function is super critical in minimax based algorithms, more so when there is a depth cut off involved.
Now that Tic Tac Toe is done to my satisfaction, I plan to return to Pallankuzhi. Or I might take a look at Draughts or Reversi.