Explore 7 AI terms in Evaluation Metrics
BLEU Score is a metric for evaluating machine-generated text, most commonly machine translation, by measuring its n-gram overlap with one or more reference translations.
CIDEr (Consensus-based Image Description Evaluation) is a metric for evaluating image captions by measuring how closely their TF-IDF-weighted n-grams match a set of human-written references.
GIFA Loss is a metric used to evaluate generative models based on their ability to generate realistic samples.
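Intersection over Union (IoU) measures the overlap between two bounding boxes in object detection as the area of their intersection divided by the area of their union, ranging from 0 (no overlap) to 1 (identical boxes).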
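Perplexity measures how well a language model predicts a sample of text: it is the exponential of the average negative log-likelihood per token, so lower values indicate better predictions.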
Precision is the fraction of an AI model's positive predictions that are actually correct: true positives divided by the sum of true positives and false positives.
A safety benchmark is a standard used to evaluate the safety performance of AI systems.
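Three of the terms above (IoU, perplexity, and precision) reduce to short formulas, so a minimal Python sketch may help make them concrete. The function names, the (x1, y1, x2, y2) box layout, and the choice to read precision in its classification sense (true positives over all positive predictions) are assumptions made for illustration, not part of any particular library.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) (assumed layout)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def perplexity(token_probs):
    """Exponential of the mean negative log-probability assigned to each token."""
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def precision(y_true, y_pred, positive=1):
    """True positives divided by all positive predictions (classification sense)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))       # overlap area 1, union area 7
    print(perplexity([0.25, 0.25, 0.25, 0.25]))  # uniform guess over 4 options
    print(precision([1, 0, 1, 1], [1, 1, 0, 1])) # 2 of 3 positive predictions correct
```

In this sketch a model that assigns probability 0.25 to every token has a perplexity of about 4, which matches the intuition that it is as uncertain as a uniform choice among four options.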