Adversarial Training
Adversarial training is a robust optimization procedure for deep neural networks in which the model is trained not on clean data alone but on worst-case perturbed inputs crafted during training. Madry et al. (2018) formalized it as a min-max saddle-point problem: an inner maximization searches a bounded Lp-norm perturbation set for the input that maximizes the loss, and an outer minimization updates the weights on that input. The method uses Projected Gradient Descent (PGD) to generate strong adversarial examples before each gradient update, forcing the network to learn decision boundaries that are stable under such perturbations.
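The two nested steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the setup from Madry et al.: it assumes a logistic-regression model in place of a deep network, an L-infinity perturbation set, full-batch updates, and hypothetical names and defaults (`pgd_attack`, `adversarial_train`, `eps`, `step`, `n_steps`, `lr`) chosen only for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_x(w, b, x, y):
    """Gradient of the per-example logistic loss w.r.t. the inputs x (labels in {0, 1})."""
    p = sigmoid(x @ w + b)                    # predicted probabilities, shape (n,)
    return (p - y)[:, None] * w[None, :]      # shape (n, d)

def pgd_attack(w, b, x, y, eps=0.3, step=0.1, n_steps=10, rng=None):
    """Inner maximization: L-infinity PGD with a random start inside the eps-ball."""
    rng = np.random.default_rng(0) if rng is None else rng
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(n_steps):
        g = loss_grad_x(w, b, x + delta, y)
        # Signed gradient ascent step, then projection back onto the eps-ball.
        delta = np.clip(delta + step * np.sign(g), -eps, eps)
    return x + delta

def adversarial_train(x, y, eps=0.3, lr=0.5, epochs=200, seed=0):
    """Outer minimization: gradient descent on the loss at the PGD examples."""
    rng = np.random.default_rng(seed)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Adversarial examples are regenerated against the current weights.
        x_adv = pgd_attack(w, b, x, y, eps=eps, rng=rng)
        p = sigmoid(x_adv @ w + b)
        w -= lr * (x_adv.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b
```

Calling `adversarial_train(x, y)` on an `(n, d)` feature array and a 0/1 label vector returns weights fitted to the perturbed inputs; running `pgd_attack` again with those weights gives inputs for estimating robust accuracy. For a deep network the same loop applies, with the input gradient obtained by automatic differentiation rather than the closed form used here.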
Sources
- Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2018). Towards deep learning models resistant to adversarial attacks. International Conference on Learning Representations (ICLR).