Compare methods
View the selected methods side by side; rows that differ are highlighted.
| | Explainable reinforcement learning | Reinforcement learning |
|---|---|---|
| Field | Deep learning | Deep learning |
| Family | Machine learning | Machine learning |
| Year of origin ≠ | 2018–2020 | 1950s–1998 |
| Creator ≠ | Puiutta, E. & Veith, E. M. S. P. (survey); broader XAI community | Sutton, R. S. & Barto, A. G. (formalised); Bellman, R. (foundations) |
| Type ≠ | Hybrid approach (RL + explainability methods) | Sequential decision-making framework |
| Original source ≠ | Puiutta, E., & Veith, E. M. S. P. (2020). Explainable Reinforcement Learning: A Survey. In Machine Learning and Knowledge Extraction (CD-MAKE 2020), Lecture Notes in Computer Science, vol. 12279, pp. 77–95. Springer. | Sutton, R. S. & Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd ed.). MIT Press. ISBN: 978-0-262-03924-6 |
| Alternative names | XRL, interpretable reinforcement learning, transparent RL, explainable RL | RL, reward-based learning, trial-and-error learning, policy optimization |
| Related ≠ | 3 | 2 |
| Summary ≠ | Explainable Reinforcement Learning (XRL) augments standard reinforcement learning agents with methods that make their policies, decisions, and learned behaviors interpretable to humans. Rather than treating the policy as a black box, XRL produces post-hoc explanations or builds inherently transparent policies, enabling trust verification, debugging, and accountability in high-stakes automated decision-making. | Reinforcement Learning (RL) is a framework in which an agent learns to make sequential decisions by interacting with an environment, receiving scalar reward signals, and updating a policy to maximise cumulative future reward. Unlike supervised learning, no labeled examples are provided; the agent discovers optimal behavior entirely through experience and delayed feedback. |
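
The reinforcement learning summary above (an agent interacts with an environment, receives scalar rewards, and updates its behaviour to maximise cumulative future reward) can be illustrated with a minimal sketch. The example uses tabular Q-learning, which is one classic RL algorithm chosen here for illustration and is not named in the table. The five-state chain environment, the `step` and `train` functions, and the values of `GAMMA` and `ALPHA` are all hypothetical.

```python
import random

N_STATES = 5            # states 0..4; state 4 is the rewarding terminal state (assumed toy setup)
LEFT, RIGHT = 0, 1
GAMMA, ALPHA = 0.9, 0.5  # discount factor and learning rate (illustrative values)


def step(state, action):
    """Hypothetical chain environment: returns (next_state, reward, done)."""
    if action == RIGHT:
        nxt = min(state + 1, N_STATES - 1)
    else:
        nxt = max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice((LEFT, RIGHT))  # explore by trial and error
            nxt, reward, done = step(state, action)
            # No labels: the target is built from the reward and the agent's own estimate.
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy policy for the non-terminal states: pick the action with the higher Q-value.
greedy_policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
```

Because the only reward sits at the right end of the chain, the learned greedy policy should prefer `RIGHT` in every non-terminal state. An explainable RL method would then add something on top of such a policy, for example a human-readable account of why `RIGHT` is preferred in a given state.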