Convergence Analysis of Stochastic Gradient Descent with Adaptive Learning Rates: A Mathematical Framework

Authors

  • Sohail Ahmed Memon* Department of Mathematics, Shah Abdul Latif University, Khairpur Mirs. Email: suhail.memon@salu.edu.pk
  • Imtiaz Ahmed Shar Department of Mathematics, Shah Abdul Latif University, Khairpur Mirs. Email: sharimtiaz2014@gmail.com
  • Ghulam Muhammad Department of Mathematics, Shah Abdul Latif University, Khairpur Mirs. Email: gm.bhangu@gmail.com

DOI:

https://doi.org/10.63163/jpehss.v4i1.1072

Keywords:

Stochastic Gradient Descent, Adaptive Learning Rates, Neural Networks, Machine Learning, Deep Learning, Non-Convex Optimization

Abstract

Neural networks are rapidly growing and reshaping the technology industry; deep neural networks in particular have been employed in a wide variety of AI applications. Stochastic Gradient Descent (SGD) is one of the core algorithms for training deep neural networks, and SGD with adaptive learning rates is the optimization method of choice in practice. In spite of its widespread popularity, a deep understanding of the convergence properties of adaptive methods remains incomplete. This study provides a mathematical framework for analyzing the convergence of adaptive variants of SGD, including RMSprop, AdaGrad, and Adam. The work focuses on establishing convergence rates under various assumptions on the objective function, covering the non-convex settings typical of deep learning. Our analysis discloses the importance of second-moment accumulation in variance reduction and derives explicit error bounds. We show that under suitable conditions, adaptive methods attain an O(1/√T) convergence rate for non-convex objectives and O(1/T) for strongly convex functions. Alongside the proofs, we implement numerical experiments that validate our theoretical findings. The results provide a rigorous mathematical justification for design choices in adaptive optimizers and suggest better approaches to hyperparameter tuning.
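To make the role of second-moment accumulation concrete, the sketch below implements the standard Adam update rule on a toy strongly convex objective with noisy gradients. This is an illustrative example, not the paper's experimental code; the hyperparameter values and the test function are assumptions chosen only to mimic the stochastic, strongly convex setting the abstract mentions.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; hyperparameters are common defaults, not tuned values."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment accumulation
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy strongly convex objective f(theta) = ||theta||^2, whose minimizer is 0.
# Gaussian noise on the gradient mimics the stochastic setting analyzed here.
rng = np.random.default_rng(0)
theta = np.array([5.0, -3.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * theta + rng.normal(scale=0.1, size=theta.shape)  # stochastic gradient
    theta, m, v = adam_step(theta, grad, m, v, t)

print(np.linalg.norm(theta))  # distance to the minimizer after 2000 steps
```

The per-coordinate division by √v̂ is what normalizes the step size against gradient variance, which is the mechanism the analysis credits for variance reduction.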

Published

2025-09-30

How to Cite

Convergence Analysis of Stochastic Gradient Descent with Adaptive Learning Rates: A Mathematical Framework. (2025). Physical Education, Health and Social Sciences, 3(3), 133-142. https://doi.org/10.63163/jpehss.v4i1.1072
