
Bayesian Dynamic Programming: Sequential decision optimization with Bayesian belief updating

Bayesian dynamic programming (BDP) combines Bellman's dynamic programming framework with Bayesian inference to optimize sequential decisions when transition probabilities or reward structures are unknown. At each step, the agent updates its belief about the environment from the outcomes it has observed, then computes an optimal policy that explicitly accounts for both the immediate reward and the value of the information gained through exploration.
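To make the belief-update-then-plan loop concrete, here is a minimal sketch in Python. It is not taken from the page or its sources. It assumes a deliberately small problem: a two-armed Bernoulli bandit with independent Beta priors and a finite horizon, so the belief state is just the pair of Beta parameters per arm. The function name `bayes_adaptive_value` and its parameters are hypothetical, chosen for illustration only.

```python
from functools import lru_cache


def bayes_adaptive_value(horizon, prior=((1, 1), (1, 1))):
    """Backward induction over belief states for a Bernoulli bandit.

    Assumption (illustrative): each arm pays 1 with an unknown probability
    whose belief is Beta(a, b). `prior` holds one (a, b) pair per arm.
    Returns (expected total reward, best first arm) under the
    Bayes-optimal policy for the given horizon.
    """
    start = tuple(tuple(params) for params in prior)  # hashable belief state

    @lru_cache(maxsize=None)
    def value(belief, steps_left):
        if steps_left == 0:
            return 0.0, None
        best_q, best_arm = float("-inf"), None
        for arm, (a, b) in enumerate(belief):
            p_success = a / (a + b)  # posterior mean of this arm
            # Bayesian belief update: a success adds 1 to a, a failure to b.
            after_win = belief[:arm] + ((a + 1, b),) + belief[arm + 1:]
            after_loss = belief[:arm] + ((a, b + 1),) + belief[arm + 1:]
            # Bellman backup over the belief state: immediate reward plus
            # the value of continuing from the updated belief.
            q = (p_success * (1.0 + value(after_win, steps_left - 1)[0])
                 + (1.0 - p_success) * value(after_loss, steps_left - 1)[0])
            if q > best_q:
                best_q, best_arm = q, arm
        return best_q, best_arm

    return value(start, horizon)


if __name__ == "__main__":
    total, first_arm = bayes_adaptive_value(horizon=2)
    print(f"expected reward over 2 steps: {total:.4f}, first arm: {first_arm}")
```

With uniform Beta(1, 1) priors and a horizon of 2, the sketch returns an expected reward of 13/12 (about 1.0833). An agent that never updated its belief would expect 1.0, so the difference is the value of information that the backup accounts for. The state space grows with the horizon and the number of arms, so this exact recursion is only practical for small problems.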


Sources

  1. Bertsekas, D. P. (1995). Dynamic Programming and Optimal Control. Athena Scientific, Belmont, MA. ISBN: 9781886529267
  2. Duff, M. O. (2002). Optimal Learning: Computational procedures for Bayes-adaptive Markov decision processes. PhD Dissertation, University of Massachusetts Amherst.

How to cite this page

ScholarGate. (2026, June 3). Bayesian Dynamic Programming — Sequential decision optimization under uncertainty with Bayesian belief updating. ScholarGate. https://scholargate.app/da/simulation/bayesian-dynamic-programming


Referenced by

ScholarGate. Bayesian Dynamic Programming (Bayesian Dynamic Programming — Sequential decision optimization under uncertainty with Bayesian belief updating). Retrieved 2026-06-15 from https://scholargate.app/da/simulation/bayesian-dynamic-programming · Dataset: https://doi.org/10.5281/zenodo.20539026