AI Glossary: What Is an Evaluation Metric (EM)? Definition & Meaning

An evaluation metric is a standard used to assess the performance of an artificial intelligence (AI) model. These metrics provide a quantitative measure that helps researchers and developers determine how well a model performs on its intended task. Different types of tasks require different metrics, because the criteria for success vary with the application.

The most common evaluation metrics include:

Accuracy: The proportion of correct predictions made by the model out of all predictions. This metric is widely used in classification tasks.
Precision: The ratio of true positive predictions to the total predicted positives, indicating how many of the instances identified as positive are actually positive.
Recall (Sensitivity): The ratio of true positive predictions to the total actual positives, highlighting the model’s ability to identify all relevant instances.
F1 Score: The harmonic mean of precision and recall, providing a balance between the two metrics, which is especially important in cases of class imbalance.
Mean Squared Error (MSE): A common metric for regression tasks, measuring the average of the squares of the errors, that is, the average squared difference between predicted and actual values.
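
The definitions in the list above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names, the choice of 1 as the positive label, and the convention of returning 0.0 when a denominator is zero are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a binary task; `positive` is an assumed label."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    # Correct predictions divided by all predictions.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    # True positives divided by all predicted positives.
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0  # 0.0 is an assumed convention


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    # True positives divided by all actual positives.
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0  # 0.0 is an assumed convention


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    # Average squared difference between predicted and actual values.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels invented for this example: 3 TP, 1 FP, 1 FN, 3 TN.
labels = [1, 0, 1, 1, 0, 1, 0, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(labels, preds), precision(labels, preds),
      recall(labels, preds), f1_score(labels, preds))
print(mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))
```

In practice, a library such as scikit-learn provides tested versions of these metrics; the hand-written functions here are only meant to make the formulas concrete.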

Choosing the right evaluation metric is crucial, as it can significantly influence how a model is optimized and interpreted. For instance, high accuracy can be misleading in cases of class imbalance, where a model could achieve high accuracy by simply predicting the majority class. Therefore, understanding the context and the specific requirements of the task is essential when selecting appropriate evaluation metrics.
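
The class-imbalance pitfall can be illustrated with a small sketch. The 95/5 class split and the "always predict the majority class" model below are invented for illustration, and the variable names are hypothetical.

```python
# Assumed toy dataset: 95 negative examples (0) and 5 positive examples (1).
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority class (0).
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall: share of actual positives that the model found.
true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
actual_positives = sum(1 for t in y_true if t == 1)
recall = true_positives / actual_positives

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.95
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```

Under these assumptions the model scores 95% accuracy while finding none of the positive instances, which is why recall or the F1 score is usually reported alongside accuracy on imbalanced data.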