Markov decision processes with multiple long-run average objectives (Conference Paper)


Author(s): Chatterjee, Krishnendu
Title: Markov decision processes with multiple long-run average objectives
Title Series: LNCS
Abstract: We consider Markov decision processes (MDPs) with multiple long-run average objectives. Such MDPs occur in design problems where one wishes to simultaneously optimize several criteria, for example, latency and power. The possible trade-offs between the different objectives are characterized by the Pareto curve. We show that every Pareto optimal point can be epsilon-approximated by a memoryless strategy, for all epsilon > 0. In contrast to the single-objective case, the memoryless strategy may require randomization. We show that the Pareto curve can be approximated (a) in polynomial time in the size of the MDP for irreducible MDPs; and (b) in polynomial space in the size of the MDP for all MDPs. Additionally, we study the problem of whether a given value vector is realizable by any strategy, and show that it can be decided in polynomial time for irreducible MDPs and in NP for all MDPs. These results provide algorithms for design exploration in MDP models with multiple long-run average objectives.
Conference Title: FSTTCS: Foundations of Software Technology and Theoretical Computer Science
Volume: 4855
Conference Dates: December 12-14, 2007
Conference Location: New Delhi, India
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik  
Date Published: 2007-11-27
Start Page: 473
End Page: 484
Sponsor: This research was supported by the NSF grants CCR-0225610 and CCR-0234690
DOI: 10.1007/978-3-540-77050-3_39
Open access: no