Histogram Loss
Histogram Loss is a metric used in machine learning, particularly in classification tasks, to evaluate model performance by comparing the predicted probability distribution over classes with the actual class distribution in the dataset. Unlike traditional loss functions, which focus on individual predictions, Histogram Loss takes a broader view by assessing the overall distribution of predictions.
In many classification problems, especially those with imbalanced datasets, it is crucial not only to classify individual instances correctly but also to ensure that the predicted probabilities reflect the true distribution of classes. For instance, if a model's predicted class distribution differs significantly from the actual distribution, this points to a potential failure in the model's understanding of the data.
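To make the mismatch concrete, here is a minimal sketch in Python with NumPy. The function name `class_distribution_gap` and the use of total variation distance as the gap measure are illustrative assumptions, not part of any standard definition of Histogram Loss.

```python
import numpy as np

def class_distribution_gap(probs, labels):
    """Compare the average predicted class distribution with the observed one.

    probs  : array of shape (n_samples, n_classes), each row sums to 1
    labels : integer class labels of shape (n_samples,)
    Returns (predicted, actual, gap), where gap is the total variation
    distance between the two distributions (an illustrative choice).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n_classes = probs.shape[1]

    # Mean predicted probability per class across the dataset.
    predicted = probs.mean(axis=0)
    # Empirical class frequencies from the true labels.
    actual = np.bincount(labels, minlength=n_classes) / labels.size
    # Total variation distance: half the L1 distance between the two.
    gap = 0.5 * np.abs(predicted - actual).sum()
    return predicted, actual, gap

# Imbalanced toy data: 90% class 0, 10% class 1,
# scored by a model that always predicts 50/50.
labels = np.array([0] * 9 + [1])
probs = np.full((10, 2), 0.5)
predicted, actual, gap = class_distribution_gap(probs, labels)
```

In this toy case the model's average prediction is evenly split while the data is heavily skewed toward class 0, so the gap is large even though no single prediction looks extreme.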
Computing Histogram Loss involves the following steps:
- Bin the predictions: The predicted probabilities are divided into discrete bins, creating a histogram that summarizes the predicted distribution.
- Compute the histogram for the actual data: Similarly, the actual class labels are converted into a histogram representing the true distribution.
- Compare the distributions: The Histogram Loss is computed by comparing the predicted histogram to the actual histogram, often using methods such as Kullback-Leibler divergence or Earth Mover's Distance.
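The three steps can be sketched as follows for the binary case, using NumPy only. This is one possible reading under stated assumptions, not a reference implementation: the helper names, the choice of 10 equal-width bins on [0, 1], the direction KL(actual || predicted), the epsilon smoothing, and the CDF-based Earth Mover's Distance (measured in units of bin widths) are all illustrative choices.

```python
import numpy as np

def normalized_histogram(values, n_bins=10):
    """Bin values in [0, 1] into n_bins equal-width bins and normalize to sum to 1."""
    counts, _ = np.histogram(values, bins=n_bins, range=(0.0, 1.0))
    return counts / counts.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two histograms; eps avoids log(0) on empty bins."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def emd_1d(p, q):
    """Earth Mover's Distance for 1-D histograms on the same bins.

    For equal-width bins this is the sum of absolute differences between
    the cumulative distributions, expressed in units of bin widths.
    """
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

def histogram_loss(pred_probs, labels, n_bins=10, method="kl"):
    """Compare the histogram of predicted positive-class probabilities
    with the histogram of the 0/1 labels (assumed binary setting)."""
    pred_hist = normalized_histogram(np.asarray(pred_probs, dtype=float), n_bins)
    true_hist = normalized_histogram(np.asarray(labels, dtype=float), n_bins)
    if method == "kl":
        return kl_divergence(true_hist, pred_hist)
    if method == "emd":
        return emd_1d(true_hist, pred_hist)
    raise ValueError(f"unknown method: {method!r}")

labels = np.array([0, 0, 1, 1])
confident = np.array([0.05, 0.05, 0.95, 0.95])  # mass near 0 and 1, like the labels
hedging = np.array([0.55, 0.55, 0.55, 0.55])    # all mass in one middle bin

loss_confident = histogram_loss(confident, labels, method="emd")
loss_hedging = histogram_loss(hedging, labels, method="emd")
```

Note that this comparison looks only at the shape of the two histograms, not at which instance received which score, so it is best read alongside a per-instance loss rather than as a replacement for one.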
By focusing on the overall distribution rather than individual predictions, Histogram Loss provides a more nuanced view of model performance, especially in scenarios where class distributions are skewed or where certain classes are underrepresented.
As a result, Histogram Loss is particularly valuable in applications such as multi-class classification, where understanding the distribution of predictions is critical for model evaluation and improvement.