AI Glossary: What Is an Evaluation Metric (EM)? Definition & Meaning

An evaluation metric is a standard used to assess the performance of an artificial intelligence (AI) model. These metrics provide a quantitative measure that helps researchers and developers determine how well a model performs on its intended task. Different types of tasks require different metrics, because the criteria for success vary with the application.

Common evaluation metrics include:

Accuracy: The proportion of correct predictions made by the model out of all predictions. This metric is widely used in classification tasks.
Precision: The ratio of true positive predictions to the total predicted positives, indicating how many of the instances identified as positive are actually correct.
Recall (Sensitivity): The ratio of true positive predictions to the total actual positives, highlighting the model’s ability to identify all relevant instances.
F1 Score: The harmonic mean of precision and recall, providing a balance between the two metrics. It is especially important in cases of class imbalance.
Mean Squared Error (MSE): A common metric for regression tasks, measuring the average squared difference between predicted and actual values.
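The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the choice of `1` as the positive label, the convention of returning 0.0 when a denominator is zero, and the sample data are all assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Correct predictions divided by all predictions."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision(y_true, y_pred):
    """True positives divided by all predicted positives."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Assumed convention: 0.0 when the model predicts no positives.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    # Assumed convention: 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_squared_error(y_true, y_pred):
    """Average squared difference between predicted and actual values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up sample data for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(labels, preds))   # 0.75
print(precision(labels, preds))  # 0.75
print(recall(labels, preds))     # 0.75
print(f1_score(labels, preds))   # 0.75
print(mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))
```

In practice, a library such as scikit-learn provides tested versions of these metrics; the hand-written versions here are only meant to make each definition concrete.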

Choosing the right evaluation metric is crucial, as it can significantly influence how a model is optimized and interpreted. For instance, high accuracy can be misleading in cases of class imbalance, where a model could achieve high accuracy by simply predicting the majority class. Therefore, understanding the context and the specific requirements of the task is essential when selecting appropriate evaluation metrics.
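The class imbalance pitfall is easy to reproduce. The sketch below uses a hypothetical dataset with 5 positive and 95 negative examples and a trivial "model" that always predicts the negative class; both the data and the 5/95 split are assumptions chosen for illustration.

```python
# Hypothetical imbalanced dataset: 5 positives (1), 95 negatives (0).
y_true = [1] * 5 + [0] * 95

# A trivial baseline that always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the true label.
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Recall: share of actual positives that the model found.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.95
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```

The baseline scores 95% accuracy while finding none of the positive cases, which is why recall, precision, or the F1 score is usually reported alongside accuracy on imbalanced data.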