Home | Publications | CS Home

Max-Prob: An Unbiased Rational Decision Making Procedure for Multiple-Adversary Environments


Anat Hashavit and Shaul Markovitch. Max-Prob: An Unbiased Rational Decision Making Procedure for Multiple-Adversary Environments. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, 222-227 Barcelona, Spain, 2011.


Abstract

In binary-utility games, an agent can have only two possible utility values for final states, 1 (win) and 0 (lose). An adversarial binary-utility game is one where for each final state there must be at least one winning and one losing agent. We define an unbiased rational agent as one that seeks to maximize its utility value, but is equally likely to choose between states with the same utility value. This induces a probability distribution over the outcomes of the game, from which an agent can infer its probability to win. A single adversary binary game is one where there are only two possible outcomes, so that the winning probabilities remain binary values. In this case, the rational action for an agent is to play minimax. In this work we focus on the more complex, multiple-adversary environment. %where an agent is met with at least two adversaries. We propose a new algorithmic framework where agents try to maximize their winning probabilities. We begin by theoretically analyzing why an unbiased rational agent should take our approach in an unbounded environment and not that of the existing paranoid or maxn algorithms. We then expand our framework to a resource-bounded environment, where winning probabilities are estimated, and show empirical results supporting our claims.


Keywords: Multi-Agent Systems, Games, Learning in Games, Heuristic Search
Secondary Keywords:
Online version:
Bibtex entry:
 @inproceedings{Hashavit:2011:MPU,
  Author = {Anat Hashavit and Shaul Markovitch},
  Title = {Max-Prob: An Unbiased Rational Decision Making Procedure for Multiple-Adversary Environments},
  Year = {2011},
  Booktitle = {Proceedings of the 22nd International Joint Conference on Artificial Intelligence},
  Pages = {222-227},
  Address = {Barcelona, Spain},
  Url = {http://ijcai.org/papers11/Papers/IJCAI11-048.pdf},
  Keywords = {Multi-Agent Systems, Games, Learning in Games, Heuristic Search},
  Abstract = {
    In binary-utility games, an agent can have only two possible
    utility values for final states, 1 (win) and 0 (lose). An
    adversarial binary-utility game is one where for each final state
    there must be at least one winning and one losing agent. We define
    an unbiased rational agent as one that seeks to maximize its
    utility value, but is equally likely to choose between states with
    the same utility value. This induces a probability distribution
    over the outcomes of the game, from which an agent can infer its
    probability to win. A single adversary binary game is one where
    there are only two possible outcomes, so that the winning
    probabilities remain binary values. In this case, the rational
    action for an agent is to play minimax. In this work we focus on
    the more complex, multiple-adversary environment. %where an agent
    is met with at least two adversaries. We propose a new algorithmic
    framework where agents try to maximize their winning
    probabilities. We begin by theoretically analyzing why an unbiased
    rational agent should take our approach in an unbounded
    environment and not that of the existing paranoid or maxn
    algorithms. We then expand our framework to a resource-bounded
    environment, where winning probabilities are estimated, and show
    empirical results supporting our claims.
  }

  }