Learning Club


Lecture Title:

Planning in Markov Decision Processes using Sparse Sampling Methods



Time and Place: Thursday, 18.5.2000, 11:30, Taub 337

Speaker: Yishay Mansour

Affiliation: Department of Computer Science, Tel-Aviv University

Abstract:

In this work we investigate planing in Markov Decision Processes (MDP) whose state space is exponential (or infinite). To describe the MDP, we assume a generative model that receives a state and action, and outputs the next state (according to the distribution given by the MDP).

We show that, given a generative model, one can compute efficiently a near-optimal policy (in time which is independent from the size of the underlying MDP). Our technique can be viewed as a form of sparse sampling in an MDP.

Similarly, we address the question of finding a near-optimal policy from a limited class of policies. We show that for a finite class of policies, it is sufficient that the number of queries to the generative model grows only logarithmically in the number of policies. We also show that for an infinite class of policies, one can use the VC-dimension to bound the number of queries to the generative model. The talk will not assume any prior knowledge about Markov Decision Processes.

This is a joint work with Michael Kearns (AT&T) and Andrew Ng (U. C. Berkeley)