Saher Esmeir and Shaul Markovitch. When a Decision Tree Learner Has Plenty of Time. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, pages 1597-1600, Boston, MA, 2006.
The majority of the existing algorithms for learning decision trees are greedy---a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Furthermore, the greedy algorithms require a fixed amount of time and are not able to generate a better tree if additional time is available. To overcome this problem, we present a lookahead-based algorithm for anytime induction of decision trees which allows trading computational speed for tree quality. The algorithm uses a novel strategy for evaluating candidate splits; a stochastic version of ID3 is repeatedly invoked to estimate the size of the tree in which each split results, and the one that minimizes the expected size is preferred. Experimental results indicate that for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available.
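The split-evaluation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the gain-proportional sampling rule, the use of leaf count as "tree size", and the plain averaging over samples are all assumptions made for illustration. Each candidate attribute is scored by growing several randomized ID3-style trees on the subsets it induces and averaging their sizes; the attribute with the smallest estimated size is chosen.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def partition(rows, labels, attr):
    """Group rows and labels by the value each row has at attribute index `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[attr], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return groups


def info_gain(rows, labels, attr):
    """Information gain obtained by splitting on `attr`."""
    remainder = sum(len(sub_labels) / len(labels) * entropy(sub_labels)
                    for _, sub_labels in partition(rows, labels, attr).values())
    return entropy(labels) - remainder


def stochastic_tree_size(rows, labels, attrs, rng):
    """Leaf count of one tree grown by a randomized ID3-style learner (hypothetical helper)."""
    if len(set(labels)) <= 1 or not attrs:
        return 1
    # Assumption: sample the split with probability proportional to its gain;
    # the small constant keeps zero-gain attributes selectable.
    weights = [max(info_gain(rows, labels, a), 0.0) + 1e-9 for a in attrs]
    attr = rng.choices(attrs, weights=weights, k=1)[0]
    rest = [a for a in attrs if a != attr]
    return sum(stochastic_tree_size(sub_rows, sub_labels, rest, rng)
               for sub_rows, sub_labels in partition(rows, labels, attr).values())


def estimated_size(rows, labels, attrs, attr, samples, rng):
    """Average sampled tree size below a split on `attr`, summed over its subsets."""
    rest = [a for a in attrs if a != attr]
    total = 0.0
    for sub_rows, sub_labels in partition(rows, labels, attr).values():
        sizes = [stochastic_tree_size(sub_rows, sub_labels, rest, rng)
                 for _ in range(samples)]
        total += sum(sizes) / samples
    return total


def choose_split(rows, labels, attrs, samples=5, seed=0):
    """Pick the attribute whose split leads to the smallest estimated tree."""
    rng = random.Random(seed)
    return min(attrs, key=lambda a: estimated_size(rows, labels, attrs, a, samples, rng))
```

In this sketch the `samples` parameter is the time-for-quality knob: more samples per candidate split cost more computation and give a steadier size estimate. On an XOR-style concept with one irrelevant attribute, every attribute has zero information gain at the root, so a greedy learner cannot tell them apart, while the lookahead estimate favors the two relevant attributes.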
@inproceedings{Esmeir:2006:WDT,
Author = {Saher Esmeir and Shaul Markovitch},
Title = {When a Decision Tree Learner Has Plenty of Time},
Year = {2006},
Booktitle = {Proceedings of the Twenty-First National Conference on Artificial Intelligence},
Pages = {1597--1600},
Address = {Boston, MA},
Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Esmeir-Markovitch-aaai2006-WDT.pdf},
Keywords = {Resource-Bounded Reasoning, Anytime Learning},
Secondary-keywords = {Decision Trees},
Abstract = {
The majority of the existing algorithms for learning decision
trees are greedy---a tree is induced top-down, making locally
optimal decisions at each node. In most cases, however, the
constructed tree is not globally optimal. Furthermore, the greedy
algorithms require a fixed amount of time and are not able to
generate a better tree if additional time is available. To
overcome this problem, we present a lookahead-based algorithm for
anytime induction of decision trees which allows trading
computational speed for tree quality. The algorithm uses a novel
strategy for evaluating candidate splits; a stochastic version of
ID3 is repeatedly invoked to estimate the size of the tree in
which each split results, and the one that minimizes the expected
size is preferred. Experimental results indicate that for several
hard concepts, our proposed approach exhibits good anytime
behavior and yields significantly better decision trees when more
time is available.
}
}