AI Glossary: What Is SELU Activation? Definition & Meaning

A Unidade Linear Exponencial Escalada (SELU) é uma função de ativação used in redes neurais, particularly in aprendizado profundo models. It was introduced to help address issues of vanishing and gradientes que explodem that can occur during training. The SELU function is defined mathematically as follows:

Para uma entrada x, the output f(x) é:

f(x) = λ * (x if x > 0 else α * (exp(x) – 1))

onde:

λ (lambda) é um fator de escala, geralmente definido em aproximadamente 1,0507.
α (alpha) é um parâmetro, normalmente em torno de 1,6733.

SELU has a unique property of self-normalization, meaning that when used appropriately in a network, it helps maintain the mean and variance of the activations close to zero and one, respectively. This property facilitates faster convergence during training and can improve overall desempenho do modelo.

To effectively use SELU, it is recommended to initialize the weights of the neural network using the LeCun normal initialization method and to avoid dropout layers, as SELU is designed to work best in fully connected architectures without such técnicas de regularização.

Overall, the SELU activation function is particularly beneficial for deep networks, as it helps stabilize the training process and can lead to better generalization em dados não vistos.