AI Glossary: What Is SELU Activation? Definition & Meaning

Die skalierte exponentielle lineare Einheit (SELU) ist eine Aktivierungsfunktion used in neuronale Netze, particularly in Deep Learning models. It was introduced to help address issues of vanishing and explodierenden Gradienten zu beheben that can occur during training. The SELU function is defined mathematically as follows:

Für eine Eingabe x, the output f(x) ist:

f(x) = λ * (x if x > 0 else α * (exp(x) – 1))

wobei:

λ (lambda) ist ein Skalierungsfaktor, der typischerweise auf etwa 1,0507 eingestellt ist.
α (alpha) ist ein Parameter, der normalerweise bei etwa 1,6733 liegt.

SELU has a unique property of self-normalization, meaning that when used appropriately in a network, it helps maintain the mean and variance of the activations close to zero and one, respectively. This property facilitates faster convergence during training and can improve overall Modellleistung.

To effectively use SELU, it is recommended to initialize the weights of the neural network using the LeCun normal initialization method and to avoid dropout layers, as SELU is designed to work best in fully connected architectures without such Regularisierungstechniken.

Overall, the SELU activation function is particularly beneficial for deep networks, as it helps stabilize the training process and can lead to better generalization auf ungesehene Daten.