Learning and Using Opponent Models in Adversary Search

David Carmel and Shaul Markovitch. Learning and Using Opponent Models in Adversary Search. Technical Report CIS9609, Technion, 1996.

Abstract

While human players adjust their playing strategy according to their opponent, computer programs, which are based on the minimax algorithm, use the same playing strategy against a novice as against an expert. This is due to the assumption of minimax that the opponent uses the same strategy as the player. This work studies the problem of opponent modeling in game playing. M*, an algorithm for searching game-trees using an opponent model is described and analyzed. We demonstrate experimentally the benefit of using an opponent model and show the potential harm caused by the use of an inaccurate model. We then describe an algorithm, M*-epsilon, for using uncertain models when a bound on the model error is known. Pruning in M* is impossible in the general case. We prove a sufficient condition for pruning and present a pruning algorithm, alpha-beta*, that returns the M* value of a tree, searching only necessary subtrees. Finally, we present a method for acquiring a model for an unknown player. First, we describe a learning algorithm that acquires a model of the opponent's depth of search by using its past moves as examples. Then, an algorithm for acquiring a model of the player's strategy, both depth and function, is described and evaluated. Experiments with this algorithm show that when a superset of the set of features used by a fixed opponent is available to the learner, few examples are sufficient for learning a model that agrees almost perfectly with the opponent.

Keywords: Opponent Modeling, Learning in Games, Games, Heuristic Search, Multi-Agent Systems

Secondary Keywords:

Online version:

Bibtex entry:

 @techreport{Carmel:1996:LUO,
  Author = {David Carmel and Shaul Markovitch},
  Title = {Learning and Using Opponent Models in Adversary Search},
  Year = {1996},
  Number = {CIS9609},
  Type = {Technical Report},
  Institution = {Technion},
  Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Carmel-Markovitch-CIS9609.pdf},
  Keywords = {Opponent Modeling, Learning in Games, Games, Heuristic Search, Multi-Agent Systems},
  Secondary-keywords = {M*, Mstar, Mstar-epsilon, Pruning, Learning Evaluation Functions, OM Search, Adversary Search,},
  Abstract = {
    While human players adjust their playing strategy according to
    their opponent, computer programs, which are based on the minimax
    algorithm, use the same playing strategy against a novice as
    against an expert. This is due to the assumption of minimax that
    the opponent uses the same strategy as the player. This work
    studies the problem of opponent modeling in game playing. M*, an
    algorithm for searching game-trees using an opponent model is
    described and analyzed. We demonstrate experimentally the benefit
    of using an opponent model and show the potential harm caused by
    the use of an inaccurate model. We then describe an algorithm,
    M*-epsilon, for using uncertain models when a bound on the model
    error is known. Pruning in M* is impossible in the general case.
    We prove a sufficient condition for pruning and present a pruning
    algorithm, alpha-beta*, that returns the M* value of a tree,
    searching only necessary subtrees. Finally, we present a method
    for acquiring a model for an unknown player. First, we describe a
    learning algorithm that acquires a model of the opponent's depth
    of search by using its past moves as examples. Then, an algorithm
    for acquiring a model of the player's strategy, both depth and
    function, is described and evaluated. Experiments with this
    algorithm show that when a superset of the set of features used by
    a fixed opponent is available to the learner, few examples are
    sufficient for learning a model that agrees almost perfectly with
    the opponent.
  }

  }