A full reference metric is a quantitative measure used to assess the performance of AI models, particularly in tasks such as image processing, natural language processing, and audio analysis. Unlike metrics that rely on partial reference data or subjective assessments, full reference metrics compare a model's output against a complete, known-correct reference.
These metrics require a "ground truth" dataset, an established set of correct outputs against which the AI model's predictions are evaluated. For instance, in image quality assessment, the original high-quality image serves as the reference, and the AI-generated image is compared to it. Common full reference metrics include:
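- Peak Signal-to-Noise Ratio (PSNR): Measures the ratio between the maximum possible power of a signal and the power of the noise corrupting it, usually expressed in decibels. Higher values indicate less distortion relative to the reference.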
- Structural Similarity Index (SSIM): Compares an image with its reference in terms of three characteristics: luminance, contrast, and structure.
- Word Error Rate (WER): Used in speech recognition, this metric counts the word substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of words in the reference.
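To make two of these definitions concrete, here is a minimal Python sketch of PSNR and WER. It is an illustration under stated assumptions, not a reference implementation: the function names `psnr` and `word_error_rate` are hypothetical, pixel values are assumed to be 8-bit (so the `max_value` parameter defaults to 255), images are passed as flat sequences of numbers, and words are split on whitespace without any text normalization. SSIM is left out because it needs local window statistics, which would not fit in a short example.

```python
import math


def psnr(reference, test, max_value=255.0):
    """PSNR in decibels between two equally sized sequences of pixel values.

    Assumption: max_value is the largest possible pixel value (255 for 8-bit).
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between the reference and the test signal.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: there is no noise to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words.

    Assumption: words are separated by whitespace; no case folding or
    punctuation stripping is applied.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or a match if cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Two pixels, each off by 10 grey levels from the reference.
    print(psnr([100, 100], [110, 90]))
    # One substitution ("sat" -> "sit") and one deletion ("the"):
    # 2 errors over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Both functions take the ground truth as their first argument, which mirrors the defining property of a full reference metric: no score can be computed without the complete reference. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.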
Full reference metrics are particularly valuable because they provide a clear, objective standard for evaluating model performance. However, they apply only when a comprehensive ground truth is available, which is not always the case in real-world scenarios. Consequently, while these metrics are essential for benchmarking AI performance in controlled environments, researchers and practitioners must also consider other evaluation methods where ground truth data is limited or unavailable.