Trading performance for stability in Markov decision processes Journal Article

Author(s): Brázdil, Tomáš; Chatterjee, Krishnendu; Forejt, Vojtěch; Kučera, Antonín
Article Title: Trading performance for stability in Markov decision processes
Affiliation IST Austria
Abstract: We study controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize the expected mean-payoff performance and stability (also known as variability in the literature). We argue that the basic notion of expressing the stability using the statistical variance of the mean payoff is sometimes insufficient, and propose an alternative definition. We show that a strategy ensuring both the expected mean payoff and the variance below given bounds requires randomization and memory, under both the above definitions. We then show that the problem of finding such a strategy can be expressed as a set of constraints.
Keywords: Controller synthesis; Markov Decision Processes; Stability; mean payoff; Stochastic systems
Journal Title: Journal of Computer and System Sciences
Volume: 84
ISSN: 0022-0000
Publisher: Elsevier  
Date Published: 2017-03-01
Start Page: 144
End Page: 170
Copyright Statement: CC BY 4.0
DOI: 10.1016/j.jcss.2016.09.009
Notes: T. Brázdil and A. Kučera are supported by the Czech Science Foundation, grant No. 15-17564S. K. Chatterjee is supported by the Austrian Science Fund (FWF) Grant No. P 23499-N23; FWF NFN Grant No. S11407-N23 (RiSE); ERC Start grant (279307: Graph Games); Microsoft faculty fellows award. V. Forejt is supported by EPSRC project EP/M023656/1.
Open access: yes (OA journal)
IST Austria Authors