Stochastic Optimization — SGD and Variants

Stochastic optimization is a family of iterative methods that minimize an objective function by computing gradients on randomly sampled subsets of data — mini-batches — rather than on the entire dataset at once. Pioneered by Robbins and Monro in 1951 as stochastic approximation, the approach became the standard engine for training large-scale machine-learning models through variants such as SGD with momentum, AdaGrad, RMSProp, and Adam.
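The core idea of taking each gradient step on a random mini-batch can be sketched as follows. This is a minimal illustration, not a reference implementation: the linear least-squares objective, the synthetic data, the function name `minibatch_sgd`, and the hyperparameter values (`lr`, `batch_size`, `epochs`) are all assumptions chosen for the example.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.1, batch_size=32, epochs=50, seed=0):
    """Minimise the mean squared error of a linear model with mini-batch SGD.

    Hypothetical helper for illustration; all defaults are assumed values.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(epochs):
        # Reshuffle every epoch so each mini-batch is a random subset of the data.
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            residual = X[idx] @ w - y[idx]
            # Gradient of the mean squared error on this mini-batch only,
            # used as a cheap estimate of the full-dataset gradient.
            grad = 2.0 * X[idx].T @ residual / len(idx)
            w -= lr * grad
    return w

# Synthetic, noise-free regression data (an illustrative assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(512, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w_hat = minibatch_sgd(X, y)
print(w_hat)  # should be close to true_w
```

Variants such as momentum, AdaGrad, RMSProp, and Adam keep this sampling loop and change only how `grad` is turned into a parameter update.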

Sources

  1. Robbins, H. & Monro, S. (1951). A Stochastic Approximation Method. Annals of Mathematical Statistics, 22(3), 400-407. DOI: 10.1214/aoms/1177729586
  2. Kingma, D.P. & Ba, J. (2015). Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (ICLR 2015).

ScholarGate. Stochastic Optimization (SGD and Variants). Retrieved 2026-06-04 from https://scholargate.app/en/optimization/stochastic-optimization