Inter-Annotator Agreement (IAA) is a statistical measure of how consistently two or more annotators label or tag the same items in a dataset. It is particularly important in fields such as natural language processing, image recognition, and other areas of artificial intelligence where data annotation depends on human judgment.
When multiple annotators evaluate the same data, IAA helps quantify how much their annotations converge or diverge. A high level of agreement suggests that the annotators interpret the data similarly, which indicates a reliable labeling process. Conversely, low agreement can point to ambiguities in the data or to inconsistencies in how the annotators understand the task.
The most common metrics used to calculate IAA include:
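As a minimal sketch of what "converge or diverge" means in practice, the snippet below computes raw percent agreement between two annotators and lists the items they disagree on, which are natural candidates for an ambiguity review. The function name `percent_agreement` and the sample labels are hypothetical, chosen only for illustration. Raw agreement does not correct for chance.

```python
def percent_agreement(labels_a, labels_b):
    """Return (fraction of matching labels, indices of disagreements).

    Assumes both annotators labeled the same items in the same order.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    # Collect the positions where the two annotators chose different labels.
    disagreements = [
        i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b
    ]
    agreement = 1 - len(disagreements) / len(labels_a)
    return agreement, disagreements


# Hypothetical labels from two annotators on four items.
annotator_a = ["spam", "ham", "spam", "ham"]
annotator_b = ["spam", "ham", "ham", "ham"]
agreement, to_review = percent_agreement(annotator_a, annotator_b)
print(agreement, to_review)  # 0.75 [2]
```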
- Cohen’s Kappa: Measures agreement between two annotators, accounting for the possibility of agreement occurring by chance.
- Fleiss’ Kappa: Measures agreement across more than two annotators. It is often described as an extension of Cohen’s Kappa, although strictly speaking it generalizes Scott’s Pi.
- Krippendorff’s Alpha: A versatile measure that can be used for any number of annotators and different types of data (nominal, ordinal, interval).
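The two kappa statistics above share the form (observed agreement − chance agreement) / (1 − chance agreement) and differ in how each term is estimated. The following is a plain-Python sketch under the assumption of nominal labels with no missing annotations; the function names and sample data are illustrative, not taken from any particular library.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` holds one list of labels per item.

    Assumes every item was labeled by the same number of raters (>= 2).
    """
    if not ratings:
        raise ValueError("need at least one item")
    n_items, n_raters = len(ratings), len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    counts = [Counter(item) for item in ratings]
    # Per-item agreement: share of rater pairs that chose the same label.
    p_items = [
        (sum(c * c for c in cnt.values()) - n_raters)
        / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement, from label proportions pooled over all ratings.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    if p_e == 1.0:
        return 1.0  # degenerate case: a single label used everywhere
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: two annotators, four items.
print(cohen_kappa(["pos", "pos", "neg", "neg"],
                  ["pos", "neg", "neg", "neg"]))  # 0.5
```

In the example, the annotators agree on 3 of 4 items (0.75), but their label distributions alone would produce 0.5 agreement by chance, so kappa is (0.75 − 0.5) / (1 − 0.5) = 0.5.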
In practice, achieving a high IAA is crucial for ensuring the quality and reliability of the data used to train machine learning models. Low IAA can bias model predictions, because the model may learn from inconsistent or poorly labeled data. For this reason, researchers and practitioners often assess IAA during the annotation process and use the results to refine guidelines, train annotators, and improve data quality.