A full reference metric is a quantitative measure used to assess the performance of AI models, particularly in tasks such as image processing, natural language processing, and audio analysis. Unlike metrics that rely on partial reference data or on subjective assessments, full reference metrics compare a model's output against a complete, known-correct reference.
These metrics require a "ground truth" dataset: an established set of correct outputs against which the AI model's predictions are evaluated. For instance, in image quality assessment, the original high-quality image serves as the reference, and the AI-generated image is compared to it. Common full reference metrics include:
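- Peak Signal-to-Noise Ratio (PSNR): Measures the ratio between the maximum possible power of a signal and the power of the noise corrupting it, usually expressed in decibels. A higher value means the output is closer to the reference.
- Structural Similarity Index (SSIM): Compares an image to its reference on three characteristics: luminance, contrast, and structure.
- Word Error Rate (WER): Used in speech recognition, this metric counts the word-level substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of words in the reference.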
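
To make two of the metrics above concrete, here is a minimal sketch in plain Python. It is illustrative only, not a reference implementation: the function names `psnr` and `wer`, the flat-list image representation, the default `max_value` of 255 (8-bit pixels), and whitespace tokenization are all assumptions made for this example.

```python
import math


def psnr(reference, test, max_value=255.0):
    """PSNR in decibels between two equal-length sequences of pixel values.

    Assumes images are flattened to lists of numbers and that max_value
    is the largest possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the test signal.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count.

    Assumes words are separated by whitespace; no text normalization
    (casing, punctuation) is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(psnr([100, 100], [110, 90]))
    print(wer("the cat sat on the mat", "the cat sit on mat"))
```

In the second call, the hypothesis has one substituted word and one missing word against a six-word reference, so the WER is 2/6. Both functions need the ground truth as their first argument, which is what makes them full reference metrics.
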
Full reference metrics are particularly valuable because they provide a clear, objective standard for evaluating model performance. However, they apply only when a comprehensive ground truth is available, which is not always the case in real-world scenarios. Consequently, while these metrics are essential for benchmarking AI performance in controlled environments, researchers and practitioners must also consider other evaluation methods when ground truth data is limited or unavailable.