David Carmel and Shaul Markovitch. Learning and Using Opponent Models in Adversary Search. Technical Report CIS9609, Technion, 1996.
While human players adjust their playing strategy according to their opponent, computer programs, which are based on the minimax algorithm, use the same playing strategy against a novice as against an expert. This is due to the assumption of minimax that the opponent uses the same strategy as the player. This work studies the problem of opponent modeling in game playing. M*, an algorithm for searching game-trees using an opponent model is described and analyzed. We demonstrate experimentally the benefit of using an opponent model and show the potential harm caused by the use of an inaccurate model. We then describe an algorithm, M*-epsilon, for using uncertain models when a bound on the model error is known. Pruning in M* is impossible in the general case. We prove a sufficient condition for pruning and present a pruning algorithm, alpha-beta*, that returns the M* value of a tree, searching only necessary subtrees. Finally, we present a method for acquiring a model for an unknown player. First, we describe a learning algorithm that acquires a model of the opponent's depth of search by using its past moves as examples. Then, an algorithm for acquiring a model of the player's strategy, both depth and function, is described and evaluated. Experiments with this algorithm show that when a superset of the set of features used by a fixed opponent is available to the learner, few examples are sufficient for learning a model that agrees almost perfectly with the opponent.
@techreport{Carmel:1996:LUO, Author = {David Carmel and Shaul Markovitch}, Title = {Learning and Using Opponent Models in Adversary Search}, Year = {1996}, Number = {CIS9609}, Type = {Technical Report}, Institution = {Technion}, Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Carmel-Markovitch-CIS9609.pdf}, Keywords = {Opponent Modeling, Learning in Games, Games, Heuristic Search, Multi-Agent Systems}, Secondary-keywords = {M*, Mstar, Mstar-epsilon, Pruning, Learning Evaluation Functions, OM Search, Adversary Search,}, Abstract = { While human players adjust their playing strategy according to their opponent, computer programs, which are based on the minimax algorithm, use the same playing strategy against a novice as against an expert. This is due to the assumption of minimax that the opponent uses the same strategy as the player. This work studies the problem of opponent modeling in game playing. M*, an algorithm for searching game-trees using an opponent model is described and analyzed. We demonstrate experimentally the benefit of using an opponent model and show the potential harm caused by the use of an inaccurate model. We then describe an algorithm, M*-epsilon, for using uncertain models when a bound on the model error is known. Pruning in M* is impossible in the general case. We prove a sufficient condition for pruning and present a pruning algorithm, alpha-beta*, that returns the M* value of a tree, searching only necessary subtrees. Finally, we present a method for acquiring a model for an unknown player. First, we describe a learning algorithm that acquires a model of the opponent's depth of search by using its past moves as examples. Then, an algorithm for acquiring a model of the player's strategy, both depth and function, is described and evaluated. Experiments with this algorithm show that when a superset of the set of features used by a fixed opponent is available to the learner, few examples are sufficient for learning a model that agrees almost perfectly with the opponent. } }