Huber Delta
The Huber Delta, often referred to simply as Huber loss, is a popular loss function used in regression analysis within machine learning. It combines the best properties of two other loss functions: the mean squared error (MSE) and the mean absolute error (MAE). Its primary purpose is to provide robustness against outliers in data sets.
In more technical terms, the Huber loss function is defined as:
L(delta) = { 0.5 * delta^2,                                        if |delta| <= delta_threshold
           { delta_threshold * (|delta| - 0.5 * delta_threshold),  otherwise
Here, delta represents the difference between the predicted value and the actual value, while delta_threshold is a parameter that determines the point at which the loss function transitions from quadratic to linear. When the absolute error |delta| is at most the threshold, the function is quadratic like MSE, so it stays smooth around zero and penalizes errors in proportion to their square. When the absolute error exceeds the threshold, it grows linearly like MAE, which makes it less sensitive to outliers.
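The piecewise definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name huber_loss and the default threshold of 1.0 are assumptions made for the example, and the linear branch is scaled by delta_threshold, as in the standard form of the Huber loss.

```python
def huber_loss(delta, delta_threshold=1.0):
    """Huber loss for a single residual (delta = predicted - actual).

    The name and the default threshold are illustrative assumptions.
    """
    abs_delta = abs(delta)
    if abs_delta <= delta_threshold:
        # Quadratic region: same shape as the (half) squared error.
        return 0.5 * delta ** 2
    # Linear region: grows in proportion to |delta|, like the absolute error.
    return delta_threshold * (abs_delta - 0.5 * delta_threshold)


def mean_huber_loss(predictions, targets, delta_threshold=1.0):
    """Average Huber loss over paired predictions and targets."""
    residuals = [p - t for p, t in zip(predictions, targets)]
    return sum(huber_loss(r, delta_threshold) for r in residuals) / len(residuals)
```

For instance, with a threshold of 1.0 a residual of 0.5 falls in the quadratic region and a residual of 3.0 falls in the linear region.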
The advantage of using Huber loss is that it reduces the influence of outliers on the overall loss, because large errors are penalized linearly rather than quadratically. This allows models to perform better on data sets that may contain noisy measurements. Consequently, it is widely used in regression tasks, especially in scenarios where data integrity cannot be guaranteed.
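One way to see the reduced influence of outliers is to look at the derivative of the loss with respect to the error, which is what drives a gradient-based model update. The sketch below is illustrative only: the names huber_grad and half_squared_grad are hypothetical, the threshold of 1.0 is an assumption, and the squared error is taken as 0.5 * delta^2 so that its derivative is simply delta.

```python
def half_squared_grad(delta):
    """Derivative of 0.5 * delta^2: grows without bound as the error grows."""
    return delta


def huber_grad(delta, delta_threshold=1.0):
    """Derivative of the Huber loss: the error clipped to +/- delta_threshold."""
    return max(-delta_threshold, min(delta_threshold, delta))


# Three small residuals and one outlier (made-up values for illustration).
residuals = [0.5, -0.3, 0.2, 10.0]
for r in residuals:
    print(r, half_squared_grad(r), huber_grad(r))
```

Under these assumptions the outlier at 10.0 pulls on the squared-error objective with a gradient of 10.0, while under the Huber loss its pull is capped at the threshold, 1.0. The small residuals are treated identically by both.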
In practice, selecting an appropriate delta_threshold is crucial, as it controls the sensitivity of the loss function to outliers. A smaller threshold makes the loss function more robust to outliers, while a larger threshold makes it behave more like MSE.
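The effect of the threshold can be illustrated by evaluating the same large error under several values of delta_threshold. This is a rough sketch under stated assumptions: huber_loss is the hypothetical helper used for illustration here, and the residual and threshold values are arbitrary.

```python
def huber_loss(delta, delta_threshold=1.0):
    """Huber loss for a single residual (illustrative helper)."""
    abs_delta = abs(delta)
    if abs_delta <= delta_threshold:
        return 0.5 * delta ** 2
    return delta_threshold * (abs_delta - 0.5 * delta_threshold)


residual = 5.0                      # one large error, chosen arbitrarily
half_squared = 0.5 * residual ** 2  # what a purely quadratic loss would charge

for threshold in (0.5, 1.0, 10.0):
    loss = huber_loss(residual, threshold)
    print(f"threshold={threshold}: huber={loss}, half squared error={half_squared}")
```

With a threshold of 0.5 the error costs 2.375, with 1.0 it costs 4.5, and with 10.0 the error falls inside the quadratic region and costs 12.5, the same as the half squared error. A threshold larger than every error in the data therefore reduces the Huber loss to a purely quadratic loss.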