
Multi-objective Markov Model: Sequential Decision-Making Across Competing Objectives

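The multi-objective Markov model (MOMDP) extends the classical Markov decision process (MDP) to settings in which an agent must optimize several reward signals at once. Rather than producing a single optimal policy, the model yields a set of Pareto-optimal policies, so that decision-makers can trade off competing objectives such as cost, risk, and throughput over time.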
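As a rough illustration of a Pareto-optimal policy set, the sketch below builds a tiny two-state, two-action model with a two-component reward vector (throughput, negative cost), evaluates every deterministic stationary policy from the start state, and keeps the policies whose value vectors are not dominated. The transition table `P`, the reward table `R`, the discount `GAMMA`, and all helper names are hypothetical and chosen only for this example. Restricting the search to deterministic stationary policies is also a simplifying assumption, since in general it may miss parts of the full Pareto front.

```python
from itertools import product

# Hypothetical toy model: every number below is illustrative only.
GAMMA = 0.9          # discount factor (assumed)
STATES = (0, 1)
ACTIONS = (0, 1)
N_OBJ = 2            # objectives: (throughput, negative cost)

# P[s][a] -> list of (next_state, probability)
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
# R[s][a] -> reward vector with one entry per objective
R = {
    0: {0: (1.0, 0.0), 1: (0.0, -1.0)},
    1: {0: (0.0, 0.0), 1: (3.0, -2.0)},
}


def evaluate(policy, sweeps=1000):
    """Iterative policy evaluation, one value per objective.

    `policy` is a tuple giving the action taken in each state.
    Returns the discounted value vector of start state 0.
    """
    V = {s: [0.0] * N_OBJ for s in STATES}
    for _ in range(sweeps):
        V = {
            s: [
                R[s][policy[s]][k]
                + GAMMA * sum(p * V[s2][k] for s2, p in P[s][policy[s]])
                for k in range(N_OBJ)
            ]
            for s in STATES
        }
    return tuple(round(v, 6) for v in V[0])


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(values):
    """Keep the policies whose value vector no other policy dominates."""
    return {
        pi: v
        for pi, v in values.items()
        if not any(dominates(w, v) for w in values.values())
    }


values = {pi: evaluate(pi) for pi in product(ACTIONS, repeat=len(STATES))}
front = pareto_front(values)

for pi, v in front.items():
    print(pi, v)
```

Enumerating policies is only workable for very small models, because the number of deterministic policies grows exponentially with the number of states. The point here is the shape of the output: a set of non-dominated trade-offs rather than a single optimal policy.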


Related methods

Markov model; multi-objective dynamic programming; multi-objective optimization; stochastic dynamic programming; stochastic Markov model

Sources

Roijers, D. M., Vamplew, P., Whiteson, S., & Dazeley, R. (2013). A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research, 48, 67–113. DOI: 10.1613/jair.3987
Chatterjee, K., Majumdar, R., & Henzinger, T. A. (2006). Markov decision processes with multiple objectives. In Proceedings of STACS 2006, Lecture Notes in Computer Science, vol. 3884, pp. 325–336. Springer, Berlin. DOI: 10.1007/11672142_26

How to cite this page

ScholarGate. (2026, June 3). Multi-objective Markov Decision Process Model. ScholarGate. https://scholargate.app/zh/simulation/multi-objective-markov-model
