AI Glossary: What Is AdaMax? Definition & Meaning

AdaMax est un algorithme d'optimisation that is an extension of the optimiseur Adam, which is widely used in training apprentissage profond models. It is particularly effective for handling sparse gradients, making it suitable for a range of tasks in apprentissage automatique.

The key innovation of AdaMax lies in its use of the infinity norm (or max norm) rather than the L2 norm (Euclidean norm) used in Adam. This change allows AdaMax to stabilize the updates of model weights, which can be especially beneficial in scenarios where gradients may vary significantly, such as in tâches de traitement du langage naturel ou lors de la manipulation de données de haute dimension.

AdaMax maintains the adaptive learning rate feature of Adam, which adjusts the learning rate for each parameter based on the historical gradients. This adaptive mechanism helps in achieving faster convergence and can lead to better performance in training réseaux neuronaux. The algorithm computes first and second moments of the gradients, using them to update the parameters iteratively.

In practice, AdaMax can be particularly advantageous when the loss landscape is complex, as it helps to avoid oscillations that might occur with other les algorithmes d'optimisation. It’s implemented in many popular machine learning frameworks, making it easily accessible for practitioners looking to improve their model training processes.