|
Abstract : |
We consider the complexity of the decision problem for different types of partially-observable Markov decision processes (MDPs): given an MDP, does there exist a policy with performance? 0? Lower and upper bounds on the complexity of the decision problems are shown in terms of completeness for NL, P, NP, PSPACE, EXP, NEXP or EXPSPACE, dependent on the type of the Markov decision process. For several NP-complete types, we show that they are not even polynomial time "-approximable for any fixed ", unless P = NP. These results also reveal interesting trade-offs between the power of policies, observations, and rewards. Topics: computational and structural complexity, computational issues in A.I., |