Nathan J. Bloom - Homepage

Using ID3 for Game Playing AI

By Nathan J. Bloom
Written on May 2, 2010

To account for the large number of possible situations that can arise during even the simplest game involving two players, adversarial search techniques (such as the MIMMAX algorithm) have been devised to provide efficient AI-based opponents. However, at their core, an adversarial search algorithm often works off of a static heuristic. The decisions an adversarial search algorithm produces only exist within the domain established by the heuristic. Other categories of AI exist, based off decision tree learning. Algorithms in this category, such as ID3, work with a table of observational data and discrete result actions to produce dynamic (and hopefully short) decision trees to the discrete result action. While an adversarial search produces decisions within the domain of a heuristic, a decision tree learning algorithm's domain is limited only by the training data it receives by a human being. With the aim of providing an AI that behaves more like a human, this document will attempt to describe a game playing approach using the ID3 algorithm.

To keep things (relatively) simple move-wise, the game of checkers has been chosen. Taking place on an 8x8 grid with 12 pieces for both players (red and black), checkers has a potential of (5 x 10^20) different moves between both players (Sreedhar 1). However, each individual piece is limited to initially only moving forward to either a left or right diagonal. A piece's movement can be further limited by a "block", in which two of an opponent's pieces exist in a diagonal row that the player wishes to move in. Players take turns capturing pieces by "jumping" diagonally over a piece of the opposite color. The number of possible moves a player makes increases if one of their pieces reaches the opposite side of the board, which causes the piece to be "kinged". The King piece can then move forwards and backwards in a diagonal fashion. There are a number of possible strategies a heuristic can take into account, which can be grouped into attacking (importance placed on capturing an opponent's pieces through single or multiple jumps), or defense (importance placed on limiting an opponent's movements by increasing the number of blocks or controlling the center of the board). For an adversarial searching algorithm, a heuristic would need to account for different types of attack and defense strategies. However, an ID3-based algorithm can leave the strategy design up to the AI trainer.

While uncertainty is a factor in finding potential moves, ID3 can still be a suitable candidate for a checkers algorithm. As a decision tree algorithm, ID3 is suited for instances involving attribute-value pairs. A potential checkers algorithm could consider a number of attributes. One possible candidate is board configuration: either in its entirety, as a representation of the positions of both players, or as Booleans for individual squares. The number of pieces on the board in it its entirety, or divided among both players can also be considered. ID3 is suited towards discrete output values. In the case of a checkers algorithm, the discrete question would be whether or not a particular move should be made given the state of the board. Decision tree algorithms recognize that training data can contain errors. Certainly, in a game of checkers, disadvantageous moves can be made by players, causing them to loose pieces or a defensive advantage. These disadvantageous moves should be factored into training data as errors to better simulate a more human-like AI. Finally, ID3 as a decision tree algorithm recognizes that there will be missing values for attributes. An untrained ID3-based AI in checkers would naturally not know every possible move to make. It is even arguable that a trained and challenging checkers AI using ID3 would not be able to account for every single possibility. For the purposes of simulating a more human-like AI this is not a problem, as a human would not be able to visualize all 5 x 10^20 possible move combinations, only the moves more applicable to their current situation.

Some careful consideration will need to be taken for the ID3 approach. Considering that entropy and information gain values will need to be calculated, the first key decision will need to be in table design. Each row in the table will correspond to a possible board state configuration of opponent player and AI pieces. Each square in the 8x8 grid is given its own column. Two additional columns are added to the table: one for the number of opponent player pieces still on the board and one column for the number of AI pieces still on the board. A final column is added, corresponding to the discrete action: whether or not to make the move. This table design is not the only (or particularly best) choice. However, it should provide a relatively robust decision tree for the AI to work with.

In order to populate the table, the AI would behave in two different ways during play: either in game mode, in which it decides which move to make next, or in learning mode, in which it populates the table with game observations and discrete actions. Learning mode can be handled in at least two different ways. The player and AI can enter a separate game mode, called a "training mode", in which a normal game of checkers plays out. The AI plays with what it knows from its table, and the player can respond with positive or negative feedback based on its actions. When the AI receives feedback, it records the present board state according to the table design, and whether or not the feedback was positive or negative. If feedback is received on a board state that has already been recorded, then the discrete action is overwritten with the new discrete action received. While a training mode will allow a player more freedom to train an AI, it also isn't very natural. Therefore, the second way of handling learning is to allow the AI to learn as the game is played. As the game continues, triggers are used to signal the AI to record a board state and a positive or negative discrete action. For example, if the AI looses a piece, a board state is record with a discrete action of "No" in regards to making the move again. If the AI's move causes the player to loose a piece, the board state is record with a discrete action of "Yes."

During "game mode," the AI acts on the data it has collected. In an effort to create an AI that can continually adapt at all stages of the game, a search tree is generated through the ID3 algorithm at the beginning of each AI turn. This process starts with a table pruning. To remove any potential nodes that cannot be traversed to due to an insufficient number of pieces, the program runs through the table, removing all rows in which the recorded red and black pieces exceed the number of red and black pieces still on the board. After these rows are pruned, entropy and information gain are calculated for every column. Finally, the search tree is built according to which attributes have the most information gain. To make a move, the AI traverses the decision tree, finding the first node containing a positive discrete action. If no positive discrete action can be found, a valid move is chosen at random. This process cycles through until either the player or AI wins.

ID3 can be an attractive alternative to the exponential growth of the MINMAX algorithm. The MINMAX algorithm has a quadratic time complexity of O(b^m), where b = the number of legal moves at each point and m = the maximum depth of the tree (Russel and Norvig 165). Conversely, the ID3 algorithm has a more attractive linear time complexity of O(ba(p+n)), where b = the maximum number of positive values for an attribute, a = the number of attributes, p = the number of positive rows and n = the number of negative rows (Wu 35). At its worst, ID3's time complexity can reach Omega(ba^2(p+n)^2) (Wu 35) and approach the same quadratic growth in as the MINMAX algorithm. Therefore, in the interest of increasing performance, modifications of the "game mode" algorithm may need to be considered. One possible strategy is to do the entropy and information gain calculations once, construct a decision tree, and use that decision tree for the entirety of the game. However, as pieces on both the player and AI side are removed from the game, many pathways will be rendered invalid and need to be pruned. Furthermore, as ID3 prefers the construction of short trees, it is possible that the resultant decision tree would not produce positive discrete actions as the game progresses into states that may not be in the resultant decision tree. Another strategy is to keep generating decision trees during each round of play, but use fewer attributes to generate the tree. For example, the piece positions could be compressed into two columns: one for red pieces and one for black pieces. However, if there are too few attributes, the decision tree may not be as comprehensive. If the decision tree is to be generated only once, it might be worthwhile to traverse the tree using the MINMAX algorithm, using the positive (+1) or negative (-1) discrete actions as utility values. Time complexity would definitely be a concern, however, if the ID3 calculations produced a deep decision tree.

While the MINMAX algorithm is suited for adversarial search in game playing, it can be process intensive. When designed properly, ID3 can be a suitable choice over the MINMAX algorithm. If the search trees produced by ID3 are small enough, the MINMAX algorithm could be utilized to make the smartest move. The largest advantage to using the ID3 algorithm over pure MINMAX is the potential to create a more unique strategy among multiple AI. Two players may not train their AI the same way through game playing. From a game standpoint, it might even be interesting to pit two trained AI together to see which one is better trained. At the start, an ID3-based game playing algorithm may not pose much of a challenge to a player, but with time and training, an ID3-based AI could become more than a match for a human opponent, while utilizing more human strategies.

Works Cited

   Norvig, Peter and Russel, Stewart. Artificial Intelligence: A Modern Approach. 3rd ed. New Jersey: Pearson Education, Inc., 2010.

   Sreedhar, Suhas. "Checkers, Solved!" IEEE: Spectrum. July 2007 (http://spectrum.ieee.org/computing/software/checkers-solved)

   Wu, XIndong. Knowledge Acquisition from Databases. New Jersey: Ablex Publishing Co., 1995