AI Glossary: What Is Gradient Masking (GM)? Definition & Meaning

Gradient masking is a defensive technique employed in machine learning to reduce the vulnerability of models to adversarial attacks. Adversarial attacks involve making small, often imperceptible perturbations to input data that can drastically alter the model’s predictions. These attacks exploit the gradients, that is, the derivatives of the loss function with respect to the input, which indicate how sensitive the model’s output is to changes in that input.
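To make the role of gradients concrete, here is a minimal sketch of a one-step sign-gradient attack (in the style of the fast gradient sign method) against a tiny logistic model. The weights, the input, and the step size eps are hypothetical values chosen only for illustration; a real attack would target a trained network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Probability of class 1 under a logistic model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_gradient_wrt_input(w, b, x, y):
    # For binary cross-entropy on a logistic model, dL/dx = (p - y) * w.
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def sign_gradient_attack(w, b, x, y, eps):
    # Move each feature by eps in the direction that increases the loss.
    g = loss_gradient_wrt_input(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

# Hypothetical model and input (assumptions, not from any real system).
w, b = [2.0, -3.0], 0.0
x, y = [0.3, 0.1], 1

x_adv = sign_gradient_attack(w, b, x, y, eps=0.2)
print(predict(w, b, x) > 0.5)      # clean input is classified as class 1
print(predict(w, b, x_adv) > 0.5)  # perturbed input is no longer class 1
```

The attacker needs nothing beyond the sign of each gradient component, which is why defenses in this family try to make that signal unavailable or unreliable.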

In gradient masking, the model is designed so that its gradients become less informative or misleading to potential attackers. This can be achieved through various methods, including:

Adding noise: Introducing random noise to the gradients can obscure the true direction and magnitude of the updates that an adversary might use to generate adversarial examples.
Using non-differentiable functions: Implementing components that are not differentiable can make it difficult for attackers to compute gradients accurately.
Gradient obfuscation: Modifying the loss function in a way that hides the true gradients makes gradient-based attacks harder to run, but it can also create a false sense of security.
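As a rough illustration of the second item in the list above, the sketch below places a non-differentiable quantization step in front of a small logistic model and compares numerically estimated input gradients with and without it. The weights, the grid size (levels), and the finite-difference step h are assumptions made for this example.

```python
import math

W = [2.0, -3.0]  # hypothetical model weights

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def smooth_model(x):
    return sigmoid(sum(w * xi for w, xi in zip(W, x)))

def quantized_model(x, levels=4):
    # Non-differentiable preprocessing: snap each feature to a coarse grid.
    xq = [round(xi * levels) / levels for xi in x]
    return smooth_model(xq)

def numeric_gradient(f, x, h=1e-4):
    # Central finite differences, one input feature at a time.
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

x = [0.3, 0.1]
g_smooth = numeric_gradient(smooth_model, x)
g_masked = numeric_gradient(quantized_model, x)
print(g_smooth)  # informative: tells the attacker which way to push x
print(g_masked)  # flat: the quantizer hides the local slope
```

The quantized model is piecewise constant, so small probes see no change in the output. The decision boundary itself has not moved; only the gradient signal has been hidden.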

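While gradient masking can provide a temporary shield against certain types of attacks, it is not a foolproof solution. Skilled adversaries can often work around masked gradients, for example by estimating gradients numerically, by attacking a differentiable substitute model and transferring the resulting examples, or by using attacks that need no gradients at all. Relying on masking alone can therefore create a false sense of security. A more direct defense is adversarial training, where models are trained on both clean and adversarial examples to improve their robustness.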
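The following sketch shows one way an adversary might get around a masked gradient: the attacker treats the non-differentiable step as the identity when computing gradients, in the spirit of backward-pass approximation attacks, and then applies the perturbation to the real masked model. The model, weights, quantizer, and eps are all hypothetical and chosen for illustration only.

```python
import math

W = [2.0, -3.0]  # hypothetical model weights

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def smooth_model(x):
    return sigmoid(sum(w * xi for w, xi in zip(W, x)))

def quantize(x, levels=4):
    # The "masking" component: snaps features to a coarse grid.
    return [round(xi * levels) / levels for xi in x]

def masked_model(x):
    return smooth_model(quantize(x))

def surrogate_gradient(x, y):
    # Forward pass uses the real quantizer; the backward pass pretends
    # quantize() is the identity, so dL/dx is approximated by (p - y) * W.
    p = smooth_model(quantize(x))
    return [(p - y) * w for w in W]

def sign(v):
    return (v > 0) - (v < 0)

def attack(x, y, eps):
    g = surrogate_gradient(x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

x, y = [0.3, 0.1], 1
x_adv = attack(x, y, eps=0.2)
print(masked_model(x) > 0.5)      # clean input: class 1
print(masked_model(x_adv) > 0.5)  # adversarial input: no longer class 1
```

In this toy setting the masked model is fooled even though its own gradients are zero almost everywhere, which is the reason masking is usually paired with defenses such as adversarial training.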

Overall, while gradient masking is a valuable tool in the arsenal of defenses against adversarial attacks, it should be used in conjunction with other strategies to ensure a more comprehensive approach to model security.