Markov decision processes with multiple objectives (Conference Paper)


Author(s): Chatterjee, Krishnendu; Majumdar, Ritankar S; Henzinger, Thomas A
Title: Markov decision processes with multiple objectives
Title Series: LNCS
Abstract: We consider Markov decision processes (MDPs) with multiple discounted reward objectives. Such MDPs occur in design problems where one wishes to simultaneously optimize several criteria, for example, latency and power. The possible trade-offs between the different objectives are characterized by the Pareto curve. We show that every Pareto-optimal point can be achieved by a memoryless strategy; however, unlike in the single-objective case, the memoryless strategy may require randomization. Moreover, we show that the Pareto curve can be approximated in polynomial time in the size of the MDP. Additionally, we study the problem of whether a given value vector is realizable by some strategy, and show that it can be decided in polynomial time; however, the question of whether it is realizable by a deterministic memoryless strategy is NP-complete. These results provide efficient algorithms for design exploration in MDP models with multiple objectives.
Conference Title: STACS: Theoretical Aspects of Computer Science
Volume: 3884
Conference Dates: February 23-25, 2006
Conference Location: Marseille, France
ISBN: 978-3-540-67141-1
Publisher: Springer
Location: Berlin, Heidelberg
Date Published: 2006-02-14
Start Page: 325
End Page: 336
Sponsor: This research was supported in part by the AFOSR MURI grant F49620-00-1-0327, and the NSF grants CCR-0225610, CCR-0234690, and CCR-0427202.
DOI: 10.1007/11672142_26
Open access: no
IST Austria Authors
  1. Thomas A. Henzinger