Technical Report CS0187

Title: Semi-Markov Decision Processes with Polynomial Reward
Authors: Zvi Rosberg
Abstract: A semi-Markov decision process with a denumerable multidimensional state space is considered. At any given state only a finite number of actions can be taken to control the process. The immediate reward earned in one transition period is merely assumed to be bounded by a polynomial, and a bound is imposed on a weighted moment of the next state reached in one transition. It is shown that under an ergodicity assumption there is a stationary optimal policy for the long-run average reward criterion. A queueing network scheduling problem is given as an application.
