
Multi-objective Markov Model — Sequential Decision-Making Across Competing Objectives

Multi-objective Markov Decision Process Model · Also known as: MOMDP, Multi-objective MDP, Multi-criteria Markov Decision Process, MO-Markov Model

A Multi-objective Markov Model, formally a multi-objective Markov decision process (MOMDP), extends the classical Markov Decision Process to settings where an agent must optimize several reward signals simultaneously. Instead of a single optimal policy, the model produces a Pareto-optimal set of policies, so decision-makers can weigh trade-offs between competing goals such as cost, risk, and throughput over time.
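For readers who prefer notation, the definition can be sketched as follows. The symbols (S, A, T, R, γ, k), the infinite-horizon discounted criterion, and the convention that every objective is maximized are assumptions made for illustration; the page itself does not fix a notation.

```latex
% MOMDP as a tuple with a vector-valued reward (k >= 2 objectives)
\[
\mathcal{M} = \langle S, A, T, \mathbf{R}, \gamma \rangle, \qquad
T(s' \mid s, a) \in [0, 1], \qquad
\mathbf{R} : S \times A \to \mathbb{R}^{k}, \quad k \ge 2 .
\]
% Value of a policy is a vector, one entry per objective
\[
\mathbf{V}^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\,
\mathbf{R}(s_t, a_t) \;\middle|\; s_0 = s \right] \in \mathbb{R}^{k} .
\]
% Pareto dominance between two policies
\[
\pi \succ \pi' \iff
\forall i:\ V_i^{\pi} \ge V_i^{\pi'} \ \text{ and } \ \exists j:\ V_j^{\pi} > V_j^{\pi'} .
\]
```

Under these assumptions, the Pareto-optimal set consists of the policies that no other policy dominates.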


When to use it

Use a Multi-objective Markov Model when decisions unfold over multiple time steps, outcomes are stochastic, and at least two objectives conflict: for example, reliability vs. cost in infrastructure planning, or efficacy vs. side-effects in treatment scheduling. It is appropriate when transition dynamics can be estimated from historical data or expert elicitation and when a Pareto front is more informative than a single aggregated score.

Do not use it when objectives can be collapsed into a single metric without loss of stakeholder information, when the state space is too large for tractable computation, or when no reliable transition probability estimates exist. It is also unnecessary when a one-shot (non-sequential) decision suffices; in that case prefer simpler multi-criteria methods such as TOPSIS or the weighted-sum method.

Strengths & limitations

Strengths

Explicitly models stochastic dynamics, capturing how uncertainty propagates over sequential decisions rather than treating it as noise.
Produces a Pareto-optimal policy set, preserving the trade-off structure so decision-makers can apply preferences after, not before, solving the model.
Scales to long planning horizons through dynamic programming, avoiding combinatorial explosion over time steps.
Supports heterogeneous objectives (cost in dollars, risk in probability, throughput in units) without requiring a common scale.
Compatible with sensitivity analysis on preference weights, making it transparent for stakeholder deliberation.

Limitations

State space explosion: even moderate systems can have millions of states, making exact Pareto Value Iteration computationally prohibitive without approximation.
Requires accurate transition probability estimates, which are difficult to obtain for novel or data-sparse systems.
The Pareto front can be exponentially large in the number of objectives, making it hard to communicate to non-technical stakeholders.
Assumes the Markov property (future is independent of past given the present), which may not hold for systems with memory or latent states.
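To put the state-space limitation in numbers, here is a small back-of-the-envelope sketch. The scenario (a system of independent components, each with a few condition levels) and the function names are hypothetical, chosen only to show how quickly joint state counts and deterministic policy counts grow.

```python
# Hypothetical scenario: each of n components has m condition levels,
# so the joint state space has m**n states.
def joint_state_count(levels_per_component: int, n_components: int) -> int:
    return levels_per_component ** n_components


# A deterministic stationary policy picks one action per state,
# so there are n_actions**n_states such policies.
def deterministic_policy_count(n_states: int, n_actions: int) -> int:
    return n_actions ** n_states


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(f"{n} components x 4 levels -> {joint_state_count(4, n):,} states")
    print("policies for 20 states, 3 actions:",
          f"{deterministic_policy_count(20, 3):,}")
```

Ten components with four levels each already exceed one million joint states, which is why approximation or state aggregation is usually needed.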

Frequently asked

How does a Multi-objective Markov Model differ from a standard Markov Decision Process?

A standard MDP optimizes a single scalar reward signal, producing one optimal policy. A Multi-objective Markov Model optimizes a vector reward, producing a Pareto-optimal set of policies. The decision-maker applies preferences after seeing the trade-offs rather than embedding them in the reward function upfront.
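The difference can be made concrete with a toy model. The sketch below evaluates every deterministic stationary policy of a two-state, two-action MOMDP with a two-dimensional reward, then keeps the non-dominated ones. All numbers, action labels, and function names are invented for illustration, and restricting attention to deterministic stationary policies is a simplifying assumption: in general the Pareto front may also contain stochastic or non-stationary policies.

```python
import itertools
import numpy as np

GAMMA = 0.9  # discount factor (assumed)

# T[a, s, s']: transition probabilities (made-up numbers).
T = np.array([
    [[0.9, 0.1], [0.6, 0.4]],   # action 0: "maintain"
    [[0.5, 0.5], [0.1, 0.9]],   # action 1: "run"
])
# R[a, s, :]: reward vector per (action, state);
# objective 0 = negative cost, objective 1 = throughput.
R = np.array([
    [[-2.0, 1.0], [-3.0, 0.0]],
    [[0.0, 2.0], [0.0, 0.5]],
])


def evaluate(policy):
    """Vector value V[s, objective] of a deterministic stationary policy.

    Solves the linear system (I - gamma * P_pi) V = R_pi, once for all
    objectives, exactly as in single-objective policy evaluation.
    """
    n_states = T.shape[1]
    P = np.array([T[policy[s], s] for s in range(n_states)])
    r = np.array([R[policy[s], s] for s in range(n_states)])
    return np.linalg.solve(np.eye(n_states) - GAMMA * P, r)


def dominates(u, v):
    """True if u is at least as good as v everywhere and better somewhere."""
    return bool(np.all(u >= v) and np.any(u > v))


def pareto_policies(start_state=0):
    """Non-dominated deterministic policies, compared at the start state."""
    values = {p: evaluate(p)[start_state]
              for p in itertools.product(range(2), repeat=2)}
    return {p: v for p, v in values.items()
            if not any(dominates(w, v) for w in values.values())}


if __name__ == "__main__":
    for policy, value in pareto_policies().items():
        print(policy, np.round(value, 2))
```

A single-objective MDP solver would return one of these policies; here the whole non-dominated set is kept, and the choice among them is left to the decision-maker.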

What is the coverage set and why does it matter?

A coverage set is a set of policies such that, for every utility function in a specified family (e.g., linear), at least one policy in the set is optimal. It matters because it can be much smaller than the full Pareto front: decision-makers get a compact representation of the relevant trade-offs without having to enumerate every Pareto-optimal policy.
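A minimal sketch of the idea for two objectives and linear utilities follows. The five value vectors are hypothetical, and the weight sweep over a fixed grid is a simple approximation chosen for readability (it can miss policies that are optimal only on a very narrow weight interval); it is not the algorithm used in the cited literature.

```python
import numpy as np


def pareto_front(points):
    """Indices of points that no other point dominates (maximization)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = any(np.all(q >= p) and np.any(q > p) for q in pts)
        if not dominated:
            keep.append(i)
    return keep


def convex_coverage_set(points, n_weights=101):
    """Indices of points that are optimal for some linear utility
    w0 * v[0] + (1 - w0) * v[1], with w0 swept over a grid on [0, 1]."""
    pts = np.asarray(points, dtype=float)
    winners = set()
    for w0 in np.linspace(0.0, 1.0, n_weights):
        scores = pts @ np.array([w0, 1.0 - w0])
        winners.add(int(np.argmax(scores)))
    return sorted(winners)


# Hypothetical policy values (objective 0, objective 1).
values = {"A": (0, 10), "B": (3, 6), "C": (6, 5), "D": (10, 0), "E": (2, 5)}
names = list(values)
pts = list(values.values())

if __name__ == "__main__":
    print("Pareto front:", [names[i] for i in pareto_front(pts)])
    print("Coverage set (linear):", [names[i] for i in convex_coverage_set(pts)])
```

In this example, policy B is Pareto-optimal but never the best choice under any linear weighting, so it drops out of the coverage set for linear utilities while remaining on the Pareto front.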

Can I use this method if I only have two objectives?

Yes — two objectives are the simplest multi-objective case and yield a one-dimensional Pareto front (a curve), which is easy to visualize and communicate. The method works for k >= 2 objectives, with computational cost growing with k.

What if transition probabilities are unknown?

If historical data are available, probabilities can be estimated via maximum likelihood or Bayesian methods. If data are scarce, robust or interval Markov models can bound the uncertainty. Without any probability estimates, scenario analysis or agent-based modeling may be more appropriate.
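As a rough sketch of the first option, the function below estimates per-action transition probabilities from logged (state, action, next state) triples. With `prior=0` it returns the maximum likelihood estimate; with `prior>0` it returns the posterior mean under a symmetric Dirichlet prior, which is one common Bayesian choice. The function name, the data layout, and the uniform fallback for unobserved state-action pairs are assumptions made for this example.

```python
import numpy as np


def estimate_transitions(transitions, n_states, n_actions, prior=0.0):
    """Estimate T[a, s, s'] from an iterable of (s, a, s_next) triples.

    prior = 0.0 gives the maximum likelihood estimate (relative frequencies).
    prior > 0.0 adds that pseudo-count to every cell, i.e. the posterior
    mean under a symmetric Dirichlet(prior) prior on each row.
    """
    counts = np.full((n_actions, n_states, n_states), float(prior))
    for s, a, s_next in transitions:
        counts[a, s, s_next] += 1.0
    totals = counts.sum(axis=2, keepdims=True)
    # State-action pairs with no data and no prior fall back to uniform.
    fallback = np.full_like(counts, 1.0 / n_states)
    return np.divide(counts, totals, out=fallback, where=totals > 0)


if __name__ == "__main__":
    log = [(0, 0, 0), (0, 0, 1), (0, 0, 0), (1, 1, 1)]
    print(estimate_transitions(log, n_states=2, n_actions=2))
    print(estimate_transitions(log, n_states=2, n_actions=2, prior=1.0))
```

The smoothed estimate avoids assigning probability zero to transitions that simply have not been observed yet, which matters when the data are sparse.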

Is this the same as multi-objective reinforcement learning?

They are closely related but not identical. Multi-objective RL learns policies through interaction with an environment when the model is unknown, while a Multi-objective Markov Model assumes the transition and reward structure is specified in advance. MORL can be viewed as solving an MOMDP when the model must be learned.

Sources

Roijers, D. M., Vamplew, P., Whiteson, S., & Dazeley, R. (2013). A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research, 48, 67–113. DOI: 10.1613/jair.3987
Chatterjee, K., Majumdar, R., & Henzinger, T. A. (2006). Markov decision processes with multiple objectives. In Proceedings of STACS 2006, Lecture Notes in Computer Science, vol. 3884, pp. 325–336. Springer, Berlin. DOI: 10.1007/11672142_26

How to cite this page

ScholarGate. (2026, June 3). Multi-objective Markov Decision Process Model. ScholarGate. https://scholargate.app/en/simulation/multi-objective-markov-model

Similar methods

Markov Model
Multi-objective dynamic programming
Multi-Objective Optimization
Stochastic Dynamic Programming
Stochastic Markov Model

Related reference concepts

Sequential Decision Making (MDPs)
Markov Decision Processes
Reinforcement Learning
Decision Theory and Utility
Policy Gradient Methods
Hidden Markov Models

