AI Glossary: What Is Gradient Variance (GV)? Definition & Meaning

Variance du gradient refers to the variability or spread of the gradients computed during the training of apprentissage automatique models, particularly in the context of les algorithmes d'optimisation such as stochastic algorithme de descente de gradient (SGD).

In apprentissage profond, the training process involves adjusting the weights of the model to minimize the fonction de perte. This is done by calculating the gradients of the loss with respect to the model parameters. However, when using mini-batches of data, the gradients can vary significantly from one batch to another due to the randomness in the data selection. This variability is referred to as gradient variance.

Une variance élevée du gradient peut entraîner une instabilité dans le processus d'entraînement. Par exemple, si les gradients fluctuent largement, le modèle peut ne pas converger correctement, ce qui conduit à une performance sous-optimale. En revanche, une faible variance du gradient indique des mises à jour plus cohérentes des paramètres du modèle, ce qui peut aider à une convergence plus rapide.

To address the issues caused by high gradient variance, practitioners often implement techniques such as coupure du gradient, which limits the size of the gradients, or use more advanced optimizers that adapt the learning rate based on the variance of the gradients. Understanding and managing gradient variance is crucial for effectively training deep learning models and achieving desired performance levels.