AI Glossary: What Is Minibatch SGD? Definition & Meaning

Minibatch Stochastic Gradient Descent (SGD)

Minibatch stochastic gradient descent (SGD) is an optimization algorithm used to train machine learning models. It is a variant of the traditional gradient descent method, which aims to minimize the loss function by iteratively updating the model parameters based on the gradient of the loss.
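As a rough sketch, the iterative update that gradient descent performs can be written as follows. The symbols are assumptions introduced here for illustration: \theta_t denotes the model parameters at step t, \eta the learning rate (step size), and L the loss function.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)
```

The variants of gradient descent differ only in how much of the training data is used to compute, or estimate, the gradient term.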

In standard (full-batch) gradient descent, each parameter update uses the entire training dataset, which can be computationally expensive and slow, especially for large datasets. In contrast, pure SGD updates the parameters using only a single data point at a time, leading to faster updates but with high variance. To strike a balance between these two extremes, minibatch SGD uses a small random subset (a ‘minibatch’) of the training data for each update.
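The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes a linear regression model with a mean squared error loss, and the function name `minibatch_sgd`, the synthetic data, and the values chosen for `batch_size`, `lr`, and `epochs` are all hypothetical choices made for the example.

```python
import numpy as np

def minibatch_sgd(X, y, batch_size=32, lr=0.05, epochs=50, seed=0):
    """Fit linear regression weights with minibatch SGD on mean squared error."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(epochs):
        # Reshuffle once per epoch so each minibatch is a random subset.
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_b, y_b = X[idx], y[idx]
            # Gradient of mean((X_b @ w - y_b) ** 2) with respect to w.
            grad = 2.0 * X_b.T @ (X_b @ w - y_b) / len(idx)
            w -= lr * grad
    return w

# Synthetic data (an assumption for illustration): y is a noisy linear function of X.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * data_rng.normal(size=1000)

w_hat = minibatch_sgd(X, y)
print("estimated weights:", w_hat)
```

Setting `batch_size=1` in this sketch gives single-sample SGD, and setting it to the number of samples gives full-batch gradient descent, so the same loop covers both extremes.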

The key advantages of minibatch SGD are faster convergence in practice and reduced computation time per update: each step is far cheaper than a full pass over the dataset, while its gradient estimate is less noisy than one computed from a single sample. In this way, the algorithm combines the benefits of both full-batch and stochastic gradient descent. The minibatch size is a hyperparameter that can be tuned; common sizes range from 32 to 256 samples, depending on the dataset and model architecture.

Minibatch SGD also introduces some noise into the gradient estimate, which can help the optimization escape local minima and potentially lead to better overall solutions. However, the minibatch size must be chosen with care: too small a size leads to very noisy updates, while too large a size may negate the benefits of stochasticity.
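The link between minibatch size and gradient noise can be checked empirically. The sketch below is an illustration under stated assumptions: it uses synthetic linear regression data with a mean squared error loss, and the helper names `mse_gradient` and `mean_gradient_error` are hypothetical. It measures how far a minibatch gradient typically lands from the full-batch gradient at a fixed parameter vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (an assumption for illustration).
X = rng.normal(size=(10_000, 5))
y = X @ np.ones(5) + rng.normal(size=10_000)
w = np.zeros(5)  # fixed parameters at which gradients are compared

def mse_gradient(X_b, y_b, w):
    """Gradient of the mean squared error over the given batch."""
    return 2.0 * X_b.T @ (X_b @ w - y_b) / len(y_b)

full_grad = mse_gradient(X, y, w)

def mean_gradient_error(batch_size, trials=200):
    """Average distance between a minibatch gradient and the full-batch gradient."""
    errors = []
    for _ in range(trials):
        idx = rng.choice(len(y), size=batch_size, replace=False)
        errors.append(np.linalg.norm(mse_gradient(X[idx], y[idx], w) - full_grad))
    return float(np.mean(errors))

noise = {b: mean_gradient_error(b) for b in (1, 32, 1024)}
for b, err in noise.items():
    print(f"batch_size={b:5d}  mean gradient error={err:.3f}")
```

In this setup the error shrinks as the batch grows, which is the trade-off described above: small batches give cheap but noisy steps, while large batches give accurate but more expensive ones.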

Overall, minibatch SGD is a cornerstone technique for training deep learning models and is widely used in applications ranging from image recognition to natural language processing.