AI Glossary: AI Ethics Terms & Definitions

Actor Network

An Actor Network is a concept from sociology, central to actor-network theory, that describes the interconnected relationships between human and non-human entities.

AI Risk

AI risk refers to potential negative consequences arising from the development and deployment of artificial intelligence systems.

Algorithmic Bias

Algorithmic bias refers to systematic and unfair discrimination in algorithmic decision-making processes.
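One way to make this measurable is to compare how often each group receives a favorable decision, a gap often called the demographic parity difference. The following is a minimal sketch, not a complete fairness audit. The function name, the group labels, and the toy decisions are all hypothetical.

```python
from collections import defaultdict

def demographic_parity_difference(decisions, groups):
    """Return (gap, rates): the spread between the highest and lowest
    favorable-decision rate across groups, plus the per-group rates."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        favorable[group] += decision  # decision is 1 (favorable) or 0
    rates = {g: favorable[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates

# Hypothetical toy data: 1 = approved, 0 = rejected.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(decisions, groups)
print(rates, gap)  # group "a" is approved 75% of the time, group "b" 25%
```

A gap of zero means every group is approved at the same rate. A large gap is a signal worth investigating, though on its own it does not prove unfair treatment.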

Aligned AI

Aligned AI refers to artificial intelligence systems designed to align with human values and goals.

Alignment Tax

Alignment Tax refers to the additional costs, such as reduced performance or extra compute and development time, incurred to ensure AI systems align with human values and ethics.

Alignment Taxonomy

AT

A framework categorizing AI systems based on their alignment with human values and intentions.

Anchoring Bias (AI)

Anchoring Bias in AI refers to the tendency, in a model or in the people relying on it, to weight the first piece of information encountered too heavily.

Anthropic

Anthropic refers to concepts or principles related to human existence and their implications for AI safety and ethics.

Anthropic Uncertainty

Anthropic Uncertainty refers to the uncertainty about human preferences and values in AI system design.

Auditability

Auditability is the ability to verify and trace processes or data within a system for compliance and accountability.
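One possible building block for auditability is a tamper-evident log, in which each entry's hash also covers the hash of the entry before it. The sketch below is an illustration under stated assumptions, not a production design. The helper names `append_entry` and `verify` and the example events are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(event, prev_hash):
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append_entry(log, event):
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(event, prev_hash)})

def verify(log):
    """Recompute every hash in order; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"model": "demo", "input_id": 1, "decision": "approve"})
append_entry(log, {"model": "demo", "input_id": 2, "decision": "reject"})

ok_before = verify(log)
log[0]["event"]["decision"] = "reject"  # simulate someone editing history
ok_after = verify(log)
print(ok_before, ok_after)  # True False
```

Because every hash depends on the one before it, changing an old record invalidates the chain from that point on, which is what lets an auditor detect the edit.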

Base Rate Fallacy

The base rate fallacy occurs when the base rate (prior probability) is ignored in favor of specific information.
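A short worked example shows how much the base rate matters. The sketch below applies Bayes' rule to a screening test. All three numbers (1% base rate, 90% sensitivity, 5% false positive rate) are hypothetical and chosen only for illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive result) via Bayes' rule."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Hypothetical numbers: 1% base rate, 90% sensitivity, 5% false positives.
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"{p:.3f}")  # 0.154
```

Someone who ignores the 1% base rate might read a positive result as roughly 90% conclusive. With these numbers, only about 15% of positive results are true positives, because false positives from the much larger unaffected group dominate.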

Behavior Policy

BP

A Behavior Policy outlines the rules and expectations for acceptable conduct in AI systems.

Black Box Model

A Black Box Model is an AI system whose internal workings are not accessible or interpretable by users.

Claude 1

Claude 1 is an AI language model developed by Anthropic, focusing on safety and alignment in AI interactions.

Committee of Machines

CoM

The Committee of Machines is a theoretical framework for understanding AI decision-making processes and ethics.

Confirmation Bias in AI

CBAI

Confirmation Bias in AI refers to the tendency of algorithms to favor information that confirms existing beliefs or assumptions.

Constitutional AI

CAI

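Constitutional AI is a training approach, introduced by Anthropic, in which a model is guided by a written set of principles (a "constitution") and uses them to critique and revise its own outputs, with the aim of responsible decision-making.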

Constitutional Prompting

Constitutional Prompting is a method for ensuring AI behavior aligns with human values and ethical guidelines.

Context Poisoning

Context poisoning is an adversarial technique that manipulates the context provided to AI models to produce biased outputs.

Counterfactuals

Counterfactuals refer to hypothetical scenarios exploring 'what if' questions about events that did not occur.

Data Snooping

Data snooping refers to the misuse of data analysis methods to find patterns that do not generalize to unseen data.
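The effect can be reproduced with pure noise. In the sketch below, labels and features are independent coin flips, so there is nothing real to learn. Picking the best of many candidate features on a small sample still produces an apparent pattern. The sizes (200 features, 50 training rows, 1000 test rows) and the fixed seed are arbitrary assumptions.

```python
import random

def accuracy(feature_column, labels):
    """Fraction of rows where the feature value equals the label."""
    return sum(f == y for f, y in zip(feature_column, labels)) / len(labels)

def random_bits(rng, n):
    return [rng.randint(0, 1) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n_features, n_train, n_test = 200, 50, 1000

# Labels and features are independent coin flips: no feature is predictive.
train_y = random_bits(rng, n_train)
train_X = [random_bits(rng, n_train) for _ in range(n_features)]
test_y = random_bits(rng, n_test)
test_X = [random_bits(rng, n_test) for _ in range(n_features)]

# Snooping: keep whichever feature happens to match the training labels best.
best = max(range(n_features), key=lambda j: accuracy(train_X[j], train_y))
in_sample = accuracy(train_X[best], train_y)
out_of_sample = accuracy(test_X[best], test_y)
print(f"in-sample {in_sample:.2f}, out-of-sample {out_of_sample:.2f}")
```

The selected feature looks predictive on the data used to choose it and falls back toward chance on fresh data. This is why evaluation should use data that played no part in the search.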

Debiasing Word Embeddings

Debiasing word embeddings involves techniques to reduce social biases, such as gender stereotypes, that are encoded in the word vector representations used by AI language models.
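One family of techniques estimates a bias direction in the vector space and removes each word's component along it, a step sometimes called neutralizing. The sketch below shows only that projection step and is one approach among several. The three-dimensional vectors are invented for illustration and do not come from any real embedding model.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def neutralize(vector, bias_direction):
    """Remove the component of `vector` that lies along `bias_direction`."""
    d = normalize(bias_direction)
    scale = dot(vector, d)
    return [a - scale * b for a, b in zip(vector, d)]

# Hypothetical 3-dimensional embeddings, invented for illustration.
he = [0.8, 0.1, 0.3]
she = [-0.6, 0.1, 0.3]
engineer = [0.4, 0.7, 0.2]

# A crude bias direction: the difference between one gendered word pair.
gender_direction = [a - b for a, b in zip(he, she)]
debiased = neutralize(engineer, gender_direction)

print(dot(engineer, gender_direction))  # nonzero before neutralizing
print(dot(debiased, gender_direction))  # approximately zero afterwards
```

After neutralizing, the word vector has approximately no component along the estimated direction. Real systems usually estimate the direction from many word pairs, and removing one direction does not guarantee that all bias is gone.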

Deliberative Alignment

Deliberative Alignment ensures AI systems reflect human values through collaborative decision-making processes.

Dual-Use Risk

Dual-Use Risk refers to the potential for technologies to be used for both beneficial and harmful purposes.

Embedding Alignment

EA

Embedding alignment refers to the process of ensuring that AI-generated representations match human values and intentions.

Emergent Deception

Emergent Deception refers to AI systems generating misleading or false information during interactions without having been explicitly designed or trained to do so.

Epistemic Humility Score

EHS

The Epistemic Humility Score measures an AI's ability to recognize and express uncertainty in its knowledge.

Evaluating AI

Evaluating AI involves assessing AI systems to ensure effectiveness, accuracy, and alignment with intended goals.