AI Glossary: What Is SELU Activation? Definition & Meaning

The Scaled Exponential Linear Unit (SELU) is an activation function used in neural networks, particularly in deep learning models. It was introduced to help address the vanishing and exploding gradients that can occur during training. The SELU function is defined mathematically as follows:

For an input x, the output f(x) is:

f(x) = λ * (x if x > 0 else α * (exp(x) - 1))

where:

λ (lambda) is a scaling factor, typically fixed at approximately 1.0507.
α (alpha) is a parameter, usually around 1.6733.
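
The definition translates almost directly into code. The following is a minimal sketch in plain Python for a single scalar input, assuming the rounded constants quoted above; the names LAMBDA, ALPHA, and selu are chosen here for illustration and do not come from any particular library.

```python
import math

# Constants from the definition above, rounded to four decimals.
LAMBDA = 1.0507
ALPHA = 1.6733


def selu(x: float) -> float:
    """Scaled Exponential Linear Unit for a single scalar input."""
    if x > 0:
        # Positive inputs are passed through and scaled by lambda.
        return LAMBDA * x
    # math.expm1(x) computes exp(x) - 1 accurately for x near zero.
    return LAMBDA * ALPHA * math.expm1(x)


if __name__ == "__main__":
    for value in (-2.0, 0.0, 2.0):
        print(value, selu(value))
```

For large negative inputs the output saturates near -λ * α, while positive inputs grow linearly with slope λ.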

SELU has the distinctive property of self-normalization: when used appropriately in a network, it keeps the mean and variance of the activations close to zero and one, respectively. This property supports faster convergence during training and can improve overall model performance.
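
One way to see where this property comes from is to feed SELU inputs that already have mean zero and variance one and check that the outputs keep roughly the same statistics. The sketch below does this with random samples in plain Python. It assumes standard normal inputs and the rounded constants 1.0507 and 1.6733; the sample count and seed are arbitrary choices for illustration, and this is a sanity check rather than a proof.

```python
import math
import random

LAMBDA = 1.0507
ALPHA = 1.6733


def selu(x: float) -> float:
    """Scaled Exponential Linear Unit for a single scalar input."""
    if x > 0:
        return LAMBDA * x
    return LAMBDA * ALPHA * math.expm1(x)


# Draw inputs with mean 0 and variance 1, then apply SELU to each one.
rng = random.Random(42)  # fixed seed so the run is repeatable
outputs = [selu(rng.gauss(0.0, 1.0)) for _ in range(200_000)]

# Empirical mean and variance of the activations.
mean = sum(outputs) / len(outputs)
variance = sum((y - mean) ** 2 for y in outputs) / len(outputs)

print(f"mean={mean:.3f} variance={variance:.3f}")
```

Both printed values should land close to zero and one, which is the fixed point that the two constants are chosen to produce.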

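To use SELU effectively, initialize the weights of the neural network with the LeCun normal initialization method and avoid standard dropout layers, which disturb the mean and variance that SELU is meant to preserve. SELU is designed to work best in fully connected architectures; where dropout-style regularization is needed, the alpha dropout variant is intended to keep the self-normalizing property intact.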
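
LeCun normal initialization draws each weight from a zero-mean normal distribution whose variance is 1 / fan_in, where fan_in is the number of inputs to the layer. The sketch below shows the idea in plain Python. The function name lecun_normal and the layer sizes are hypothetical, and it samples from an untruncated normal for simplicity, which may differ from what a given framework does.

```python
import math
import random


def lecun_normal(fan_in: int, fan_out: int, rng: random.Random) -> list[list[float]]:
    """Return a fan_out x fan_in weight matrix drawn from N(0, 1 / fan_in)."""
    std = math.sqrt(1.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = lecun_normal(fan_in=256, fan_out=256, rng=rng)

# Check the empirical statistics of the sampled weights.
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
std = math.sqrt(sum((w - mean) ** 2 for w in flat) / len(flat))

print(f"mean={mean:.4f} std={std:.4f} target_std={math.sqrt(1.0 / 256):.4f}")
```

Scaling the variance by 1 / fan_in keeps the pre-activation variance near one when the layer inputs themselves have unit variance, which is the regime in which SELU's self-normalization applies.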

Overall, the SELU activation function is particularly beneficial for deep networks, as it helps stabilize the training process and can lead to better generalization on unseen data.