A full reference metric is a quantitative measure used to assess the performance of AI models, particularly in tasks such as image processing, natural language processing, and audio analysis. Unlike metrics that rely on partial data or subjective assessments, full reference metrics compare a model's output against a complete, known-correct reference.
These metrics require a "ground truth" dataset, that is, an established set of correct outputs against which the AI model's predictions are evaluated. For instance, in image quality assessment, the original high-quality image serves as the reference, and the AI-generated image is compared to it. Common full reference metrics include:
- Peak Signal-to-Noise Ratio (PSNR): Measures the ratio between the maximum possible power of a signal and the power of the noise corrupting it, usually expressed in decibels.
- Structural Similarity Index (SSIM): Compares two images on three characteristics: luminance, contrast, and structure.
- Word Error Rate (WER): Used in speech recognition, this metric counts the word substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of words in the reference.
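As a rough illustration of how two of these metrics compare an output against its reference, the following minimal sketch computes PSNR and WER using only the Python standard library. The function names `psnr` and `wer`, the flat list-of-pixels input, and the default peak value of 255 (8-bit images) are assumptions made for this example, not part of any particular library.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """PSNR in decibels between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted signal.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    print(psnr([100, 100], [110, 90]))
    print(wer("the cat sat on the mat", "the cat sit on mat"))
```

Both functions take the reference as their first argument, which reflects the defining property of full reference metrics: neither score can be computed without the ground truth. Real image and speech pipelines typically add preprocessing (color space conversion, text normalization) that this sketch omits.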
Full reference metrics are particularly valuable because they provide a clear, objective standard for evaluating model performance. However, they apply only when a comprehensive ground truth is available, which is not always the case in real-world scenarios. Consequently, while these metrics are essential for benchmarking AI performance in controlled environments, researchers and practitioners must also consider other evaluation methods where ground truth data is limited or unavailable.