Huber Loss
Huber Loss is a popular loss function used in regression problems, particularly in machine learning and statistics. It combines the advantages of two other loss functions: mean squared error (MSE) and mean absolute error (MAE). Unlike MSE, which can be heavily influenced by outliers because the errors are squared, Huber Loss is designed to be robust against such anomalies.
Huber Loss is defined by a threshold parameter (often denoted δ), which determines the point at which the loss function transitions from quadratic to linear. For residuals (the differences between actual and predicted values) whose absolute value is at most δ, Huber Loss behaves like squared error, using the formula:
Huber Loss = 0.5 * (residual)^2
For residuals whose absolute value exceeds δ, the loss grows linearly with the absolute error rather than quadratically, which makes it less sensitive to large errors:
Huber Loss = δ * (|residual| - 0.5 * δ)
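The two cases can be written as a single function. The following is a minimal sketch in plain Python, not a reference implementation; the names huber_loss and mean_huber_loss and the default delta=1.0 are illustrative assumptions.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss for one residual (actual minus predicted).

    The function name and the default delta are illustrative choices.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    abs_r = abs(residual)
    if abs_r <= delta:
        # Quadratic region: half the squared residual.
        return 0.5 * residual ** 2
    # Linear region: grows in proportion to |residual|.
    return delta * (abs_r - 0.5 * delta)


def mean_huber_loss(actual, predicted, delta=1.0):
    """Average Huber loss over paired actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    losses = [huber_loss(a - p, delta) for a, p in zip(actual, predicted)]
    return sum(losses) / len(losses)
```

At |residual| = δ both formulas give 0.5 * δ^2, so the two pieces join without a jump.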
This combination gives Huber Loss a smooth gradient for optimization while limiting the influence of outliers. When selecting δ, it is important to consider the scale of the data and the specific characteristics of the dataset, because residuals are compared against δ directly and δ is therefore expressed in the same units as the target variable.
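The bounded influence of outliers is easiest to see in the derivative of the loss with respect to the residual: it equals the residual inside the quadratic region and is capped at ±δ outside it. A short sketch, where the name huber_gradient is a hypothetical choice:

```python
def huber_gradient(residual, delta=1.0):
    """Derivative of the Huber loss with respect to the residual.

    Hypothetical helper: returns the residual itself in the quadratic
    region and a value clipped to +/- delta in the linear region.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta
```

However large a residual becomes, its gradient magnitude never exceeds δ, whereas the gradient of squared error keeps growing with the residual.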
Huber Loss is particularly useful in scenarios where a dataset contains outliers that could skew the results if MSE were used exclusively. It strikes a balance between maintaining sensitivity to small errors and robustness against large deviations, making it a versatile choice for many regression applications.
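As a rough illustration of this effect, the sketch below fits the simplest possible model, a single constant, to a small made-up sample containing one outlier. Minimizing squared error yields the sample mean, while the Huber estimate is found here by plain gradient descent. The data, function names, learning rate, and step count are all assumptions chosen for the example.

```python
def clipped_residual(residual, delta):
    """Huber gradient with respect to the residual (clipped to +/- delta)."""
    return max(-delta, min(delta, residual))


def fit_constant_huber(values, delta=1.0, learning_rate=0.5, steps=200):
    """Estimate a constant c that minimizes the mean Huber loss of values - c.

    Illustrative gradient descent; the hyperparameters are arbitrary choices.
    """
    c = 0.0
    for _ in range(steps):
        # The gradient of the mean loss with respect to c is the negated
        # mean of the clipped residuals, so stepping downhill adds it to c.
        step = sum(clipped_residual(v - c, delta) for v in values) / len(values)
        c += learning_rate * step
    return c


# Made-up sample: four values near 1 and one large outlier.
data = [1.0, 1.2, 0.9, 1.1, 15.0]

mse_estimate = sum(data) / len(data)      # minimizer of squared error
huber_estimate = fit_constant_huber(data)  # minimizer of Huber loss

print(f"squared-error estimate: {mse_estimate:.2f}")
print(f"Huber estimate:         {huber_estimate:.2f}")
```

On this sample the squared-error estimate is pulled to about 3.84 by the single outlier, while the Huber estimate settles near 1.3, close to the bulk of the data.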