1st Edition

Constrained Markov Decision Processes

256 Pages

by Chapman & Hall

Description

This book provides a unified approach for the study of constrained Markov decision processes with a finite state space and unbounded costs. Unlike the single controller case considered in many other books, the author considers a single controller with several objectives, such as minimizing delays and loss, probabilities, and maximization of throughputs. It is desirable to design a controller that minimizes one cost objective, subject to inequality constraints on other cost objectives. This framework describes dynamic decision problems arising frequently in many engineering fields. A thorough overview of these applications is presented in the introduction.
The book is then divided into three sections that build upon each other.
The first part explains the theory for the finite state space. The author characterizes the set of achievable expected occupation measures as well as performance vectors, and identifies simple classes of policies among which optimal policies exist. This allows the reduction of the original dynamic into a linear program. A Lagranian approach is then used to derive the dual linear program using dynamic programming techniques.
In the second part, these results are extended to the infinite state space and action spaces. The author provides two frameworks: the case where costs are bounded below and the contracting framework.
The third part builds upon the results of the first two parts and examines asymptotical results of the convergence of both the value and the policies in the time horizon and in the discount factor. Finally, several state truncation algorithms that enable the approximation of the solution of the original control problem via finite linear programs are given.

INTRODUCTION
Examples of Constrained Dynamic Control Problems
On Solution Approaches for CMDPs with Expected Costs
Other Types of CMDPs
Cost Criteria and Assumptions
The Convex Analytical Approach and Occupation Measures
Linear Programming and Lagrangian Approach for CMDPs
About the Methodology
The Structure of the Book
PART ONE: FINITE MDPS
MARKOV DECISION PROCESSES
The Model
Cost Criteria and the Constrained Problem
Some Notation
The Dominance of Markov Policies
THE DISCOUNTED COST
Occupation Measure and the Primal LP
Dynamic Programming and Dual LP: the Unconstrained Case
Constrained Control: Lagrangian Approach
The Dual LP
Number of Randomizations
THE EXPECTED AVERAGE COST
Occupation Measure and the Primal LP
Equivalent Linear Program
The Dual Program
Number of Randomizations
FLOW AND SERVICE CONTROL IN A SINGLE-SERVER QUEUE
The Model
The Lagrangian
The Original Constrained Problem
Structure of Randomization and Implementation Issues
On Coordination Between Controllers
Open Questions
PART TWO: INFINITE MDPS
MDPS WITH INFINITE STATE AND ACTION SPACES
The Model
Cost Criteria
Mixed Policies, and Topologic Structures
The Dominance of Markov Policies
Aggregation of States
Extra Randomization in the Policies
Equivalent Quasi-Markov Model and Quasi-Markov Policies
THE TOTAL COST: CLASSIFICATION OF MDPS
Transient and Absorbing MDPs
MDPs With Uniform Lyapunov Functions
Equivalence of MDP With Unbounded and bounded costs
Properties of MDPs With Uniform Lyapunov Functions
Properties for Fixed Initial Distribution
Examples of Uniform Lyapunov Functions
Contracting MDPs
THE TOTAL COST: OCCUPATION MEASURES AND THE PRIMAL LP
Occupation Measure
Continuity of Occupation Measures
More Properties of MDPs
Characterization of Achievable Sets of Occupation Measure
Relation Between Cost and Occupation Measure
Dominating Classes of Policies
Equivalent Linear Program
The Dual Program
THE TOTAL COST: DYNAMIC AND LINEAR PROGRAMMING
Non-Constrained Control: Dynamic and Linear Programming
Superharmonic Functions and Linear Programming
Set of Achievable Costs
Constrained Control: Lagrangian Approach
The Dual LP
State Truncation
A Second LP Approach for Optimal Mixed Policies
More on Unbound Costs
THE DISCOUNTED COST
The Equivalent Total Cost Model
Occupation Measure and LP
Non-negative Immediate Cost
Weak Contracting Assumptions and Lyapunov Functions
Example: Flow and Service Control
THE EXPECTED AVERAGE COST
Occupation Measures
Completeness Properties of Stationary Policies
Relation Between Cost and Occupation Measure
Dominating Classes of Policies
Equivalent Linear Program
The Dual Program
The Contracting Framework
Other Conditions for the Uniform Integrability
The Case of Uniform Lyapunov Conditions
EXPECTED AVERAGE COST: DYNAMIC PROGRAMMING AND LP
The Non-Constrained Case: Optimality Inequality
Non-Constrained Control: Cost Bounded Below
Dynamic Programming and Uniform Lyapunov Function
Super-Harmonic Functions and Linear Programming
Set of Achievable Costs
Constrained Control: Lagrangian Approach
The Dual LP
A Second LP Approach for Optimal Mixed Policies
PART THREE: ASYMPTOTIC METHODS AND APPROXIMATIONS
SENSITIVITY ANALYSIS
Introduction
Approximation of the Values
Approximation and Robustness of the Policies
CONVERGENCE OF DISCOUNTED CONSTRAINED MDPS
Convergence in the Discount Factor
Convergence to the Expected Average Cost
The Case of Uniform Lyapunov Function
CONVERGENCE AS THE HORIZON TENDS TO INFINITY
The Discounted Cost
The Expected Average Cost: Stationary Policies
The Expected Average Cost: General Policies
STATE TRUNCATION AND APPROXIMATION
The Approximating sets of States
Scheme I: the Total Cost
Scheme II: the Total Cost
Scheme III: the Total Cost
The Expected Average Cost
Infinite MDPs: on the Number of Randomizations
APPENDIX: CONVERGENCE OF PROBABILITY MEASURES
REFERENCES
LIST OF SYMBOLS AND NOTATION
INDEX

Author(s)

Biography

Altman, Eitan

Critics' Reviews

"…an outstanding addition to the MDPs literature and it complements…".

tstanding addition to the MDPs literature and it complements…".

Add to Cart

Constrained Markov Decision Processes

Description

Table of Contents

Author(s)

Biography

Critics' Reviews