Carmel Domshlak, Erez Karpas and Shaul Markovitch. To Max or Not to Max: Online Learning for Speeding Up Optimal Planning. In *Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence*, 1071-1076 Atlanta, Georgia, 2010.

It is well known that there cannot be a single "best" heuristic for optimal planning in general. One way of overcoming this is by combining admissible heuristics (e.g. by using their maximum), which requires computing numerous heuristic estimates at each state. However, there is a tradeoff between the time spent on computing these heuristic estimates for each state, and the time saved by reducing the number of expanded states. We present a novel method that reduces the cost of combining admissible heuristics for optimal search, while maintaining its benefits. Based on an idealized search space model, we formulate a decision rule for choosing the best heuristic to compute at each state. We then present an active online learning approach for that decision rule, and employ the learned model to decide which heuristic to compute at each state. We evaluate this technique empirically, and show that it substantially outperforms each of the individual heuristics that were used, as well as their regular maximum.

@inproceedings{Domshlak:2010:MNM, Author = {Carmel Domshlak and Erez Karpas and Shaul Markovitch}, Title = {To Max or Not to Max: Online Learning for Speeding Up Optimal Planning}, Year = {2010}, Booktitle = {Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence}, Pages = {1071--1076}, Address = {Atlanta, Georgia}, Url = {http://www.cs.technion.ac.il/~shaulm/papers/pdf/Domshlak-Karpas-Markovitch-AAAI2010.pdf}, Keywords = {Planning, Speedup Learning}, Abstract = { It is well known that there cannot be a single "best" heuristic for optimal planning in general. One way of overcoming this is by combining admissible heuristics (e.g. by using their maximum), which requires computing numerous heuristic estimates at each state. However, there is a tradeoff between the time spent on computing these heuristic estimates for each state, and the time saved by reducing the number of expanded states. We present a novel method that reduces the cost of combining admissible heuristics for optimal search, while maintaining its benefits. Based on an idealized search space model, we formulate a decision rule for choosing the best heuristic to compute at each state. We then present an active online learning approach for that decision rule, and employ the learned model to decide which heuristic to compute at each state. We evaluate this technique empirically, and show that it substantially outperforms each of the individual heuristics that were used, as well as their regular maximum. } }