Saher Esmeir and Shaul Markovitch. When a Decision Tree Learner Has Plenty of Time. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, 1597-1600 Boston, MA, 2006.
The majority of the existing algorithms for learning decision trees are greedy---a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Furthermore, the greedy algorithms require a fixed amount of time and are not able to generate a better tree if additional time is available. To overcome this problem, we present a lookahead-based algorithm for anytime induction of decision trees which allows trading computational speed for tree quality. The algorithm uses a novel strategy for evaluating candidate splits; a stochastic version of ID3 is repeatedly invoked to estimate the size of the tree in which each split results, and the one that minimizes the expected size is preferred. Experimental results indicate that for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available.
@inproceedings{Esmeir:2006:WDT, Author = {Saher Esmeir and Shaul Markovitch}, Title = {When a Decision Tree Learner Has Plenty of Time}, Year = {2006}, Booktitle = {Proceedings of the Twenty-First National Conference on Artificial Intelligence}, Pages = {1597--1600}, Address = {Boston, MA}, Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Esmeir-Markovitch-aaai2006-WDT.pdf}, Keywords = {Resource-Bounded Reasoning, Anytime Learning}, Secondary-keywords = {Decision Trees}, Abstract = { The majority of the existing algorithms for learning decision trees are greedy---a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Furthermore, the greedy algorithms require a fixed amount of time and are not able to generate a better tree if additional time is available. To overcome this problem, we present a lookahead-based algorithm for anytime induction of decision trees which allows trading computational speed for tree quality. The algorithm uses a novel strategy for evaluating candidate splits; a stochastic version of ID3 is repeatedly invoked to estimate the size of the tree in which each split results, and the one that minimizes the expected size is preferred. Experimental results indicate that for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available. } }