ScholarGate

Q-Learning

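Q-learning, introduced by Christopher Watkins in his 1989 thesis and given a convergence proof by Watkins and Peter Dayan in 1992, is a model-free reinforcement learning algorithm. It learns the value of taking each action in each state (the Q-function) from experience alone, without a model of the environment. It is off-policy: it learns the optimal action values while following an exploratory behaviour policy. Under standard conditions (every state-action pair keeps being visited and the step sizes decay appropriately), its estimates are proven to converge to the optimal action values, from which an optimal policy follows by acting greedily.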
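The off-policy idea can be made concrete with the usual tabular update, Q(s, a) ← Q(s, a) + α [r + γ max over a' of Q(s', a') − Q(s, a)]. The following is a minimal sketch rather than a reference implementation. The five-state chain environment, the function name q_learning, and the parameter values (alpha, gamma, epsilon, episodes) are illustrative assumptions and do not come from this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustrative only).

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Entering the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Behaviour policy: epsilon-greedy, so the agent keeps exploring.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                # Break ties at random so the untrained agent does not get stuck.
                a = rng.choice([i for i in range(2) if q[s][i] == best])

            # Deterministic transition; the left wall reflects the agent.
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Off-policy target: greedy value of the next state, regardless of
            # which action the behaviour policy will actually take there.
            # The terminal state has value 0.
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (r + future - q[s][a])
            s = s_next
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states (1 means "move right").
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(len(q_table) - 1)]
```

The behaviour policy (epsilon-greedy) and the learned target (the max over next actions) are deliberately different here, which is what makes the method off-policy. On this assumed chain, the greedy policy read from q_table moves right in every non-terminal state.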

Sources

  1. Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3–4), 279–292. DOI: 10.1007/BF00992698
  2. Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd ed.). MIT Press. ISBN: 978-0-262-03924-6

How to cite this page

ScholarGate. (2026, June 2). Q-Learning (Off-Policy Temporal-Difference Control). ScholarGate. https://scholargate.app/sw/machine-learning/q-learning

Referenced by

ScholarGate. Q-Learning (Off-Policy Temporal-Difference Control). Retrieved 2026-06-15 from https://scholargate.app/sw/machine-learning/q-learning · Dataset: https://doi.org/10.5281/zenodo.20539026