Abstract
We consider a gradual-impulse control problem of continuous-time Markov decision processes, where the system performance is measured by the expectation of the exponential utility of the total cost. We show, under natural conditions on the system primitives, the existence of a deterministic stationary optimal policy out of a more general class of policies that allow multiple simultaneous impulses, randomized selection of impulses with random effects, and accumulation of jumps. After characterizing the value function using the optimality equation, we reduce the gradual-impulse control problem to an equivalent simple discrete-time Markov decision process, whose action space is the union of the sets of gradual and impulsive actions.
| Original language | English |
|---|---|
| Pages (from-to) | 301-334 |
| Number of pages | 34 |
| Journal | Advances in Applied Probability |
| Volume | 53 |
| Issue number | 2 |
| DOIs | |
| Publication status | Published - 1 Jul 2021 |
Bibliographical note
Funding Information: We thank the editors and referees for comments and remarks that significantly improved the readability of this paper. This work was supported by the Royal Society (grant number IE160503) and the Daiwa Anglo-Japanese Foundation (UK) (grant reference 4530/12801).
Publisher Copyright:
© 2021 Cambridge University Press. All rights reserved.
Keywords
- Continuous-time Markov decision processes
- dynamic programming
- gradual-impulse control
- optimality equation
ASJC Scopus subject areas
- Statistics and Probability
- Applied Mathematics