Markov decision processes with multiple long-run average objectives Journal Article


Author(s): Brázdil, Tomáš; Brožek, Václav; Chatterjee, Krishnendu; Forejt, Vojtěch; Kučera, Antonín
Article Title: Markov decision processes with multiple long-run average objectives
Affiliation IST Austria
Abstract: We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives. Given an MDP with κ limit-average functions, in the expectation objective the goal is to maximize the expected limit-average value, and in the satisfaction objective the goal is to maximize the probability of runs such that the limit-average value stays above a given vector. We show that under the expectation objective, in contrast to the case of one limit-average function, both randomization and memory are necessary for strategies even for ε-approximation, and that finite-memory randomized strategies are sufficient for achieving Pareto optimal values. Under the satisfaction objective, in contrast to the case of one limit-average function, infinite memory is necessary for strategies achieving a specific value (i.e. randomized finite-memory strategies are not sufficient), whereas memoryless randomized strategies are sufficient for ε-approximation, for all ε > 0. We further prove that the decision problems for both expectation and satisfaction objectives can be solved in polynomial time and the trade-off curve (Pareto curve) can be ε-approximated in time polynomial in the size of the MDP and 1/ε, and exponential in the number of limit-average functions, for all ε > 0. Our analysis also reveals flaws in previous work for MDPs with multiple mean-payoff functions under the expectation objective, corrects the flaws, and allows us to obtain improved results.
Keywords: Formal verification; Markov Decision Processes; Mean-payoff reward; Multi-objective optimisation
Journal Title: Logical Methods in Computer Science
Volume: 10
Issue 1
ISSN: 1860-5974
Publisher: International Federation of Computational Logic  
Date Published: 2014-02-14
Start Page: paper 13
Copyright Statement: CC-BY
URL:
DOI: 10.2168/LMCS-10(1:13)2014
Notes: T. Brazdil is supported by the Czech Science Foundation, grant No P202/12/P612. K. Chatterjee is supported by the Austrian Science Fund (FWF) Grant No P 23499-N23; FWF NFN Grant No S11407-N23 (RiSE); ERC Start grant (279307: Graph Games); Microsoft faculty fellows award. V. Forejt is supported by a Royal Society Newton Fellowship and EPSRC project EP/J012564/1.
Open access: yes (OA journal)
IST Austria Authors
Related IST Austria Work