David Carmel and Shaul Markovitch. Learning and Using Opponent Models in Adversary Search. Technical Report CIS9609, Technion, 1996.
While human players adjust their playing strategy according to their opponent, computer programs, which are based on the minimax algorithm, use the same playing strategy against a novice as against an expert. This is due to the assumption of minimax that the opponent uses the same strategy as the player. This work studies the problem of opponent modeling in game playing. M*, an algorithm for searching game-trees using an opponent model is described and analyzed. We demonstrate experimentally the benefit of using an opponent model and show the potential harm caused by the use of an inaccurate model. We then describe an algorithm, M*-epsilon, for using uncertain models when a bound on the model error is known. Pruning in M* is impossible in the general case. We prove a sufficient condition for pruning and present a pruning algorithm, alpha-beta*, that returns the M* value of a tree, searching only necessary subtrees. Finally, we present a method for acquiring a model for an unknown player. First, we describe a learning algorithm that acquires a model of the opponent's depth of search by using its past moves as examples. Then, an algorithm for acquiring a model of the player's strategy, both depth and function, is described and evaluated. Experiments with this algorithm show that when a superset of the set of features used by a fixed opponent is available to the learner, few examples are sufficient for learning a model that agrees almost perfectly with the opponent.
@techreport{Carmel:1996:LUO,
Author = {David Carmel and Shaul Markovitch},
Title = {Learning and Using Opponent Models in Adversary Search},
Year = {1996},
Number = {CIS9609},
Type = {Technical Report},
Institution = {Technion},
Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Carmel-Markovitch-CIS9609.pdf},
Keywords = {Opponent Modeling, Learning in Games, Games, Heuristic Search, Multi-Agent Systems},
Secondary-keywords = {M*, Mstar, Mstar-epsilon, Pruning, Learning Evaluation Functions, OM Search, Adversary Search,},
Abstract = {
While human players adjust their playing strategy according to
their opponent, computer programs, which are based on the minimax
algorithm, use the same playing strategy against a novice as
against an expert. This is due to the assumption of minimax that
the opponent uses the same strategy as the player. This work
studies the problem of opponent modeling in game playing. M*, an
algorithm for searching game-trees using an opponent model is
described and analyzed. We demonstrate experimentally the benefit
of using an opponent model and show the potential harm caused by
the use of an inaccurate model. We then describe an algorithm,
M*-epsilon, for using uncertain models when a bound on the model
error is known. Pruning in M* is impossible in the general case.
We prove a sufficient condition for pruning and present a pruning
algorithm, alpha-beta*, that returns the M* value of a tree,
searching only necessary subtrees. Finally, we present a method
for acquiring a model for an unknown player. First, we describe a
learning algorithm that acquires a model of the opponent's depth
of search by using its past moves as examples. Then, an algorithm
for acquiring a model of the player's strategy, both depth and
function, is described and evaluated. Experiments with this
algorithm show that when a superset of the set of features used by
a fixed opponent is available to the learner, few examples are
sufficient for learning a model that agrees almost perfectly with
the opponent.
}
}