Leaky Rectified Linear Unit (Leaky ReLU) is an activation function commonly used in neural networks, particularly in deep learning. It is designed to address the dying ReLU problem: with standard ReLU (Rectified Linear Unit), a neuron whose input is negative has zero output and zero gradient, so a neuron that receives negative inputs for every training example stops updating its weights. Leaky ReLU allows a small, non-zero gradient when the input is negative, so such neurons can still learn.
Leaky ReLU is defined as:
f(x) = x if x > 0, else α * x
Here, α (alpha) is a small positive constant (typically 0.01) that sets the slope of the function for negative inputs. For a negative input, standard ReLU outputs zero, whereas Leaky ReLU outputs α * x, a negative value whose magnitude is a small fraction of the input's. The function therefore keeps a non-zero slope, and hence a non-zero gradient, for negative inputs.
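The definition can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name leaky_relu, the default alpha=0.01, and the sample inputs are choices made here for the example.

```python
def leaky_relu(x, alpha=0.01):
    """Return x for positive inputs and alpha * x otherwise."""
    return x if x > 0 else alpha * x


# Apply the function to a few sample inputs (values chosen for illustration).
inputs = [-100.0, -1.0, 0.0, 2.5]
outputs = [leaky_relu(v) for v in inputs]
```

Positive inputs pass through unchanged, while negative inputs are scaled by alpha instead of being set to zero.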
Like standard ReLU, Leaky ReLU does not saturate for positive inputs, which helps to mitigate the vanishing gradient problem that affects saturating activation functions such as sigmoid and tanh. In addition, its non-zero slope for negative inputs keeps gradients flowing through neurons that standard ReLU would switch off. These properties can lead to better performance in deep networks, although the benefit depends on the task and architecture. The choice of alpha affects the training dynamics, so it is often recommended to experiment with this parameter during model tuning.
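The effect of alpha on the gradient for negative inputs can be sketched as follows. The helper names relu_grad and leaky_relu_grad, the upstream gradient value, and the candidate alpha values are all hypothetical, chosen only to illustrate the idea; the derivative at exactly x = 0 is set by convention.

```python
def relu_grad(x):
    """Derivative of standard ReLU (taken as 0 at x = 0 by convention here)."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x, alpha=0.01):
    """Derivative of Leaky ReLU (taken as alpha at x = 0 by convention here)."""
    return 1.0 if x > 0 else alpha


# For a negative pre-activation, ReLU passes no gradient back,
# while Leaky ReLU passes a gradient scaled by alpha.
x = -3.0
upstream = 0.5  # hypothetical gradient arriving from the next layer
relu_signal = upstream * relu_grad(x)
leaky_signals = {a: upstream * leaky_relu_grad(x, a) for a in (0.01, 0.1, 0.3)}
```

With standard ReLU the backpropagated signal for this input is zero, whereas with Leaky ReLU it is proportional to alpha, which is one reason the parameter is worth tuning.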
In summary, Leaky ReLU is a simple yet effective activation function: by allowing a small gradient for negative inputs, it keeps neurons trainable and can improve overall model performance.