AI Glossary: What Is L-Diversity? Definition & Meaning

L-Diversidade

L-Diversity é um privacy protection model used in the field of anonimização de dados, particularly in the context of relational databases. The primary objective of L-Diversity is to enhance the privacy of individuals whose data is included in a conjunto de dados ao garantir que atributos sensíveis sejam representados com diversidade suficiente.

In practice, L-Diversity operates by ensuring that for any group of records (known as an equivalence class), which share certain identifying characteristics (like age or gender), there are at least ‘L’ distinct sensitive values present. This means that if an attacker were to gain access to the data, they would find it difficult to pinpoint an individual’s sensitive information because there are multiple potential values that could apply.

Por exemplo, considere um database containing health records where the sensitive attribute is a medical condition. If a specific equivalence class contains three individuals, and all of them have the same condition (e.g., diabetes), this would violate the L-Diversity principle if L is set to 2, as there is not enough diversity to protect the privacy of those individuals. By ensuring that there are at least L different medical conditions represented in that group, L-Diversity helps mitigate the risk of re-identification of individuals.

While L-Diversity effectively combats certain types of attacks, such as homogeneity attacks, it is not foolproof. It can still be vulnerable to background knowledge attacks, where an adversary uses external information to make educated guesses about sensitive attributes. As such, researchers continue to develop and refine privacy preservation techniques to complement L-Diversity and enhance overall segurança de dados.