Value iteration for long run average reward in markov decision processes Conference Paper

Author(s): Ashok, Pranav; Chatterjee, Krishnendu; Daca, Przemysław; Křetínský, Jan; Meggendorfer, Tobias
Title: Value iteration for long run average reward in markov decision processes
Title Series: LNCS
Affiliation IST Austria
Abstract: Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long term performance. Value iteration (VI) is one of the simplest and most efficient algorithmic approaches to MDPs with other properties, such as reachability objectives. Unfortunately, a naive extension of VI does not work for MDPs with long-run average rewards, as there is no known stopping criterion. In this work our contributions are threefold. (1) We refute a conjecture related to stopping criteria for MDPs with long-run average rewards. (2) We present two practical algorithms for MDPs with long-run average rewards based on VI. First, we show that a combination of applying VI locally for each maximal end-component (MEC) and VI for reachability objectives can provide approximation guarantees. Second, extending the above approach with a simulation-guided on-demand variant of VI, we present an anytime algorithm that is able to deal with very large models. (3) Finally, we present experimental results showing that our methods significantly outperform the standard approaches on several benchmarks.
Conference Title: CAV: Computer Aided Verification
Volume: 10426
Conference Dates: July 24 - 28, 2017
Conference Location: Heidelberg, Germany
Publisher: Springer  
Date Published: 2017-01-01
Start Page: 201
End Page: 221
DOI: 10.1007/978-3-319-63387-9_10
Notes: This work is partially supported by the Vienna Science and Technology Fund (WWTF) ICT15-003, the Austrian Science Fund (FWF) NFN grant No. S11407-N23 (RiSE/SHiNE), the ERC Starting grant (279307: Graph Games), the German Research Foundation (DFG) project “Verified Model Checkers”, the TUM International Graduate School of Science and Engineering (IGSSE) project PARSEC, and the Czech Science Foundation grant No. 15-17564S.
Open access: yes (repository)
IST Austria Authors
  1. Przemysław, Daca
    12 Daca
Related IST Austria Work