AI Glossary: What Is an Evaluation Metric (EM)? Definition & Meaning

An evaluation metric is a standard used to assess the performance of an artificial intelligence (AI) model. These metrics provide a quantitative measure that helps researchers and developers determine how well a model performs on its intended task. Different tasks require different metrics, because the criteria for success vary with the application.

Common evaluation metrics include:

Accuracy: The proportion of correct predictions made by the model out of all predictions. This metric is widely used in classification tasks.
Precision: The ratio of true positive predictions to the total predicted positives, indicating how many of the instances identified as positive are actually correct.
Recall (Sensitivity): The ratio of true positive predictions to the total actual positives, highlighting the model’s ability to identify all relevant instances.
F1 Score: The harmonic mean of precision and recall, providing a balance between the two metrics, which is especially important in cases of class imbalance.
Mean Squared Error (MSE): A common metric for regression tasks, measuring the average of the squares of the errors, that is, the average squared difference between predicted and actual values.
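
The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes binary labels where 1 marks the positive class, the function names are chosen here for readability, and returning 0.0 when a denominator is zero is one convention among several.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Correct predictions divided by all predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision(y_true, y_pred):
    """True positives divided by all predicted positives."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall."""
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]):
    """Average squared difference between predicted and actual values."""
    if len(y_true) == 0:
        raise ValueError("mean_squared_error needs at least one value")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for illustration: 4 positives and 6 negatives.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

print("accuracy :", accuracy(labels, predictions))   # 7 of 10 correct
print("precision:", precision(labels, predictions))  # 2 of 3 predicted positives
print("recall   :", recall(labels, predictions))     # 2 of 4 actual positives
print("f1       :", f1_score(labels, predictions))
print("mse      :", mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))
```

In practice, a library such as scikit-learn provides tested versions of these metrics; the sketch is only meant to make the formulas concrete.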

Choosing the right evaluation metric is crucial, as it can significantly influence how a model is optimized and interpreted. For instance, high accuracy can be misleading in cases of class imbalance, where a model could achieve high accuracy by simply predicting the majority class. Therefore, understanding the context and the specific requirements of the task is essential when selecting appropriate evaluation metrics.
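
The class-imbalance pitfall is easy to reproduce. The following sketch uses a made-up dataset (95 negatives and 5 positives, an assumption chosen only for illustration) and a trivial "model" that always predicts the majority class. Accuracy looks strong, while recall shows that no positive instance was found.

```python
# Hypothetical imbalanced dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority class (0).
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the true label.
correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)

# Recall: share of actual positives that the model identified.
true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
actual_positives = sum(1 for t in y_true if t == 1)
recall = true_positives / actual_positives if actual_positives else 0.0

print(f"accuracy = {accuracy:.2f}")  # 95 of 100 predictions are correct
print(f"recall   = {recall:.2f}")    # 0 of 5 positives are found
```

Here accuracy is 0.95 even though the model never detects a positive case, which is why recall, precision, or the F1 score is usually reported alongside accuracy when classes are imbalanced.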