When a Decision Tree Learner Has Plenty of Time

Saher Esmeir and Shaul Markovitch. When a Decision Tree Learner Has Plenty of Time. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, 1597-1600 Boston, MA, 2006.

Abstract

The majority of the existing algorithms for learning decision trees are greedy---a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Furthermore, the greedy algorithms require a fixed amount of time and are not able to generate a better tree if additional time is available. To overcome this problem, we present a lookahead-based algorithm for anytime induction of decision trees which allows trading computational speed for tree quality. The algorithm uses a novel strategy for evaluating candidate splits; a stochastic version of ID3 is repeatedly invoked to estimate the size of the tree in which each split results, and the one that minimizes the expected size is preferred. Experimental results indicate that for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available.

Keywords: Resource-Bounded Reasoning, Anytime Learning

Secondary Keywords:

Online version:

Bibtex entry:

 @inproceedings{Esmeir:2006:WDT,
  Author = {Saher Esmeir and Shaul Markovitch},
  Title = {When a Decision Tree Learner Has Plenty of Time},
  Year = {2006},
  Booktitle = {Proceedings of the Twenty-First National Conference on Artificial Intelligence},
  Pages = {1597--1600},
  Address = {Boston, MA},
  Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Esmeir-Markovitch-aaai2006-WDT.pdf},
  Keywords = {Resource-Bounded Reasoning, Anytime Learning},
  Secondary-keywords = {Decision Trees},
  Abstract = {
    The majority of the existing algorithms for learning decision
    trees are greedy---a tree is induced top-down, making locally
    optimal decisions at each node. In most cases, however, the
    constructed tree is not globally optimal. Furthermore, the greedy
    algorithms require a fixed amount of time and are not able to
    generate a better tree if additional time is available. To
    overcome this problem, we present a lookahead-based algorithm for
    anytime induction of decision trees which allows trading
    computational speed for tree quality. The algorithm uses a novel
    strategy for evaluating candidate splits; a stochastic version of
    ID3 is repeatedly invoked to estimate the size of the tree in
    which each split results, and the one that minimizes the expected
    size is preferred. Experimental results indicate that for several
    hard concepts, our proposed approach exhibits good anytime
    behavior and yields significantly better decision trees when more
    time is available.
  }

  }