AI Glossary: Benchmark Datasets Terms & Definitions

Flores-200

FLoRes 200

Flores-200 is a benchmark dataset for evaluating machine translation models across about 200 languages, many of them low-resource.

HS

HellaSwag is a benchmark dataset used to evaluate AI's commonsense reasoning by asking a model to choose the most plausible continuation of a described situation.

KITTI

KITTI is a benchmark dataset for computer vision, particularly for autonomous driving research, covering tasks such as stereo, optical flow, visual odometry, and 3D object detection.

SIQA

SocialIQA is a benchmark dataset for evaluating AI's commonsense reasoning about social interactions, using multiple-choice questions about people's motivations and reactions in everyday situations.

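STS-B

STS-B (Semantic Textual Similarity Benchmark) is a benchmark dataset for evaluating sentence similarity in natural language processing tasks: models score pairs of sentences on how close they are in meaning.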