Author(s):

Chatterjee, Krishnendu; Chmelík, Martin

Article Title: 
POMDPs under probabilistic semantics

Affiliation 
IST Austria 
Abstract: 
We consider partially observable Markov decision processes (POMDPs) with limitaverage payoff, where a reward value in the interval [0,1] is associated with every transition, and the payoff of an infinite path is the longrun average of the rewards. We consider two types of path constraints: (i) a quantitative constraint defines the set of paths where the payoff is at least a given threshold λ1ε(0,1]; and (ii) a qualitative constraint which is a special case of the quantitative constraint with λ1=1. We consider the computation of the almostsure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for qualitative path constraints are as follows: (i) the problem of deciding the existence of a finitememory controller is EXPTIMEcomplete; and (ii) the problem of deciding the existence of an infinitememory controller is undecidable. For quantitative path constraints we show that the problem of deciding the existence of a finitememory controller is undecidable. We also present a prototype implementation of our EXPTIME algorithm and experimental results on several examples.

Keywords: 
Almostsure winning; Limitaverage objectives; POMDPs

Journal Title:

Artificial Intelligence

Volume: 
221

ISSN:

00043702

Publisher:

Elsevier

Date Published:

20150401

Start Page: 
46

End Page:

72

URL: 

DOI: 
10.1016/j.artint.2014.12.009

Open access: 
yes (repository) 