A residual connection is a technique used in deep learning to make deep neural networks easier to train and to improve their performance. The concept was popularized by the ResNet (Residual Network) architecture, which won the ImageNet competition (ILSVRC) in 2015.
In a typical neural network, data flows sequentially through layers, and each layer applies a transformation to its input. As networks become deeper (with more layers), however, they can suffer from vanishing gradients: the gradients used to update the weights during training become very small as they are propagated back through many layers, which hinders learning.
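The vanishing-gradient effect can be illustrated with a small sketch. It assumes a hypothetical chain of scalar layers, each computing sigmoid(w * h) with every weight set to 1.0; the name chain_gradient and the choice of sigmoid are illustrative assumptions, not taken from any particular network. Because the sigmoid's derivative never exceeds 0.25, the product of per-layer derivatives shrinks quickly as depth grows.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def chain_gradient(x, weights):
    """Return d(output)/d(input) for stacked scalar layers h -> sigmoid(w * h)."""
    grad = 1.0
    h = x
    for w in weights:
        a = sigmoid(w * h)
        # Chain rule: the local derivative of sigmoid(w * h) with respect to h.
        grad *= w * a * (1.0 - a)
        h = a
    return grad


# Illustrative setup (an assumption): all weights equal to 1.0, input 0.5.
shallow = chain_gradient(0.5, [1.0] * 3)   # 3 layers
deep = chain_gradient(0.5, [1.0] * 30)     # 30 layers

print(f"gradient through 3 layers:  {shallow:.3e}")
print(f"gradient through 30 layers: {deep:.3e}")
```

In this toy setting the gradient reaching the input of the 30-layer chain is many orders of magnitude smaller than in the 3-layer chain, which is the behavior described above.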
Residual connections address this problem by allowing the input to a layer to bypass one or more layers and be added directly to the output of those layers. Mathematically, this is written as:
Output = F(Input) + Input
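Here, F(Input) represents the transformation applied by the layers being bypassed. Because the original input is added to the output, the bypassed layers only need to learn the difference (the residual) between the desired output and the input, and gradients have a direct path back through the addition. This helps maintain the flow of information and gradients, making it easier for the network to learn complex patterns.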
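A minimal sketch of this formula in plain Python follows. The names residual_block and skip_chain_gradient are hypothetical, and the scalar layers with F(h) = sigmoid(h) are an assumption chosen to keep the arithmetic visible; real networks use weight matrices, where the per-layer factor is not guaranteed to stay at or above 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def residual_block(x, f):
    """Compute Output = F(Input) + Input element-wise on a list of floats."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("F must preserve the input shape for the skip addition")
    return [fi + xi for fi, xi in zip(fx, x)]


def skip_chain_gradient(x, depth):
    """d(output)/d(input) through `depth` stacked scalar blocks h -> sigmoid(h) + h."""
    grad = 1.0
    h = x
    for _ in range(depth):
        a = sigmoid(h)
        # Derivative of F(h) + h is F'(h) + 1; the "+ 1" comes from the skip path.
        grad *= a * (1.0 - a) + 1.0
        h = a + h
    return grad


# If F outputs zeros, the block reduces to the identity mapping.
identity_out = residual_block([1.0, -2.0], lambda v: [0.0 for _ in v])
print(identity_out)  # [1.0, -2.0]

# Each factor is at least 1 here, so the gradient does not shrink with depth.
skip_grad = skip_chain_gradient(0.5, depth=30)
print(f"gradient through 30 residual blocks: {skip_grad:.3f}")
```

The shape check reflects a practical constraint of the formula: the addition only makes sense when F(Input) and Input have the same shape.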
These connections also make it possible to train much deeper networks, leading to better performance on tasks such as image recognition and natural language processing. Overall, residual connections are a crucial innovation in modern deep learning, facilitating the development of more sophisticated AI models.