SGD with Momentum / Adam Optimizer
Stochastic Gradient Descent (SGD) with momentum and its adaptive successor Adam are core parameter-update algorithms used to train nearly every modern deep learning model. Momentum SGD was formalized by Polyak (1964) and brought into neural network training by Rumelhart, Hinton, and Williams (1986). Adam, introduced by Kingma and Ba at ICLR 2015, extends the momentum idea by also maintaining a running average of squared gradients. That second average yields a separate adaptive learning rate for each parameter, which has made Adam a default optimizer in current deep learning practice.
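The two update rules described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (`sgd_momentum_step`, `adam_step`, `quadratic_grad`, `minimize_with_momentum`) and the toy objective f(x, y) = x^2 + 10y^2 are assumptions made for this example. The momentum step uses one common convention (velocity = beta * velocity + gradient), and the Adam defaults follow the values suggested by Kingma and Ba (2015).

```python
import math


def sgd_momentum_step(params, grads, velocity, lr=0.01, beta=0.9):
    """One SGD-with-momentum update. Returns (new_params, new_velocity)."""
    # Velocity is a decaying sum of past gradients.
    velocity = [beta * v + g for v, g in zip(velocity, grads)]
    # Step against the accumulated velocity, not the raw gradient.
    params = [p - lr * v for p, v in zip(params, velocity)]
    return params, velocity


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count. Returns (new_params, new_m, new_v)."""
    # First moment: running average of gradients (the momentum part).
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grads)]
    # Second moment: running average of squared gradients.
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grads)]
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Each parameter gets its own effective step size lr / (sqrt(v_hat) + eps).
    params = [p - lr * mh / (math.sqrt(vh) + eps)
              for p, mh, vh in zip(params, m_hat, v_hat)]
    return params, m, v


def quadratic_grad(params):
    """Gradient of the toy objective f(x, y) = x^2 + 10 * y^2 (hypothetical example)."""
    x, y = params
    return [2.0 * x, 20.0 * y]


def minimize_with_momentum(start, steps=500, lr=0.01, beta=0.9):
    """Run repeated momentum steps on the toy objective from a starting point."""
    params, velocity = list(start), [0.0] * len(start)
    for _ in range(steps):
        grads = quadratic_grad(params)
        params, velocity = sgd_momentum_step(params, grads, velocity, lr, beta)
    return params
```

On the first Adam step the bias-corrected ratio m_hat / sqrt(v_hat) has magnitude close to 1 for every parameter with a nonzero gradient, so each parameter moves by roughly `lr` regardless of how large its gradient is. This is the per-parameter scaling that distinguishes Adam from plain momentum, where the step is proportional to the gradient itself.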
Sources
- Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. International Conference on Learning Representations (ICLR 2015). arXiv:1412.6980.
- Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533–536. DOI: 10.1038/323533a0
- Polyak, B. T. (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5), 1–17. DOI: 10.1016/0041-5553(64)90137-5
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning (Ch. 8: Optimization for Training Deep Models). MIT Press. ISBN: 978-0-262-03561-3
How to cite this page
ScholarGate. (2026, June 3). Stochastic Gradient Descent with Momentum and Adaptive Moment Estimation (Adam). ScholarGate. https://scholargate.app/sw/deep-learning/stochastic-gradient-descent-with-momentum-adam-optimizer