AI Glossary: What Is Double Descent (DD)? Definition & Meaning

Double Descent

Double descent is a phenomenon in machine learning in which a model's validation error, plotted against model complexity, falls, rises, and then falls again. The classical view held that as model complexity increases, training error keeps decreasing while validation error first decreases and then increases, tracing a U-shaped curve. That picture follows from the bias-variance tradeoff: models that are too simple underfit (high bias), while models that are too complex overfit (high variance).

However, more recent research has revealed a more nuanced picture known as double descent. Validation error still rises as the model begins to overfit, and it peaks near the interpolation threshold, the point at which the model has just enough capacity to fit the training data exactly. As complexity continues to grow past that point, validation error can decrease again. This second descent gives the ‘double descent’ curve its name. For certain models, particularly deep neural networks, adding parameters can therefore improve generalization even after the model appears to be overfitting.
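The curve described above can often be reproduced in a small synthetic experiment. The following is a minimal sketch, not a reference implementation: it assumes a noisy linear target, fixed random ReLU features, and a minimum-norm least-squares fit, with the number of random features standing in for model complexity. The function names (fit_random_features, sweep) and all parameter values are illustrative choices, not taken from any particular paper or library.

```python
import numpy as np


def fit_random_features(n_features, seed, n_train=40, n_test=500, d=10, noise=0.5):
    """Return (train_mse, test_mse) for one random-feature regression run."""
    rng = np.random.default_rng(seed)

    # Synthetic data (an assumption for illustration): linear target, noisy training labels.
    w_true = rng.normal(size=d)
    X_train = rng.normal(size=(n_train, d))
    X_test = rng.normal(size=(n_test, d))
    y_train = X_train @ w_true + noise * rng.normal(size=n_train)
    y_test = X_test @ w_true

    # Fixed random ReLU features; n_features plays the role of model complexity.
    W = rng.normal(size=(d, n_features)) / np.sqrt(d)
    phi_train = np.maximum(X_train @ W, 0.0)
    phi_test = np.maximum(X_test @ W, 0.0)

    # lstsq returns the minimum-norm solution when the system is underdetermined
    # (more features than training points), so the fit interpolates the data there.
    coef, *_ = np.linalg.lstsq(phi_train, y_train, rcond=None)

    train_mse = float(np.mean((phi_train @ coef - y_train) ** 2))
    test_mse = float(np.mean((phi_test @ coef - y_test) ** 2))
    return train_mse, test_mse


def sweep(widths, n_seeds=20):
    """Average train/test error over several seeds for each feature count."""
    results = {}
    for p in widths:
        runs = [fit_random_features(p, seed) for seed in range(n_seeds)]
        train = float(np.mean([r[0] for r in runs]))
        test = float(np.mean([r[1] for r in runs]))
        results[p] = (train, test)
    return results


if __name__ == "__main__":
    # With 40 training points, the interpolation threshold sits near 40 features.
    for p, (train, test) in sweep([5, 10, 20, 40, 80, 400]).items():
        print(f"features={p:4d}  train_mse={train:.3e}  test_mse={test:.3e}")
```

In a setup like this, training error typically drops to near zero once the feature count reaches the number of training points, while test error tends to spike around that point and fall again for much wider models. The exact numbers depend on the noise level, the seeds, and the feature map, so the output is best read as a qualitative illustration.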

Double descent challenges the conventional wisdom about model selection and complexity. It suggests that larger models can be more advantageous than previously thought, particularly in the overparameterized regime, where the number of parameters exceeds the number of training examples. Understanding double descent helps practitioners choose model sizes sensibly: a rise in validation error near the interpolation threshold does not by itself mean that a larger model will generalize worse.