Équivariance is a mathematical property often encountered in the fields of apprentissage automatique and vision par ordinateur, particularly in the context of réseaux neuronaux and transformations of data. A function is said to be equivariant if it commutes with a transformation, meaning that applying a transformation to the input and then passing it through the function yields the same result as first passing the input through the function and then applying the transformation to the output.
Plus formellement, si nous avons une fonction f and a transformation T, we say that f is equivariant to T si :
f(T(x)) = T(f(x))
pour tout x en entrée x. This property is particularly useful in scenarios where the input data can be transformed in various ways, such as rotating, scaling, or translating images. In such cases, maintaining equivariance ensures that the function’s output remains consistent and predictable despite changes to the input. This is critical in applications like image recognition, where the position or orientation of an object should not affect the ability of the model to recognize it.
Dans le contexte des réseaux neuronaux, l'équivariance est souvent intégrée dans la architecture of réseaux de neurones convolutifs (CNNs), where the opération de convolution is designed to be equivariant to translations. This means that if an image is shifted, the feature maps produced by the CNN will shift accordingly, preserving the spatial information necessary for effective learning and inference.
Equivariance can also extend to other transformations, such as rotations and reflections, and is a foundational concept in various areas of AI research, including invariant extraction de caractéristiques et la symétrie dans les modèles d'apprentissage.