Hierarchical Agglomerative Clustering

HAC

Hierarchical agglomerative clustering (HAC) is a cluster analysis method that seeks to build a hierarchy of clusters.

Hierarchical agglomerative clustering (HAC) is a popular cluster analysis method that groups data points into a hierarchy of nested clusters. It starts with each data point on its own and progressively merges them into larger clusters. Merging continues until all data points belong to a single cluster or until a specified number of clusters is reached.

The algorithm works as follows. Initially, each data point is treated as a separate cluster. The two closest clusters are identified using a distance metric (such as Euclidean distance) and merged into a new cluster. This step is repeated iteratively; after each merge, the algorithm recomputes the distances between the newly formed cluster and the remaining clusters, so the cluster structure is updated as merging proceeds.
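The merge loop described above can be sketched in a few lines of plain Python. This is a minimal, unoptimized illustration, not a reference implementation: the function names `cluster_distance` and `agglomerate` are hypothetical, the sample points are made up, and average linkage is assumed as the rule for measuring distance between clusters.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points


def cluster_distance(a, b):
    """Assumed rule: mean Euclidean distance over all cross-cluster pairs."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerate(points, n_clusters=1):
    """Merge the two closest clusters until n_clusters remain.

    Returns the final clusters and the merge history as
    (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[tuple(p)] for p in points]  # each point starts as its own cluster
    history = []
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        d = cluster_distance(clusters[i], clusters[j])
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the merged pair with its union; distances involving the
        # new cluster are recomputed on the next pass of the loop.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, history


# Made-up sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, history = agglomerate(points, n_clusters=3)
```

Recomputing every pairwise distance on each pass keeps the sketch short but is slow for large inputs; practical implementations typically cache and update a distance matrix instead.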

HAC can be visualized using a dendrogram, a tree-like diagram that shows the order in which clusters were merged and how they relate to one another. The height at which two branches join represents the distance or dissimilarity between the merged clusters. This visualization helps in choosing the number of clusters: cutting the dendrogram at a threshold distance yields the clusters that exist at that level.
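To make the cutting step concrete, the sketch below works only with the list of merge heights read off a dendrogram. It assumes the heights never decrease from one merge to the next, and the names `clusters_at_cut` and `largest_gap_threshold` are hypothetical. Placing the cut in the middle of the largest gap between successive heights is one common heuristic, not something the method itself prescribes.

```python
def clusters_at_cut(merge_heights, threshold):
    """Number of clusters remaining when the dendrogram is cut at `threshold`."""
    n_points = len(merge_heights) + 1  # a full dendrogram over n points has n - 1 merges
    merges_below = sum(1 for h in merge_heights if h <= threshold)
    return n_points - merges_below  # every merge below the cut removes one cluster


def largest_gap_threshold(merge_heights):
    """Heuristic: cut halfway across the widest gap between successive heights.

    Needs at least two merge heights.
    """
    hs = sorted(merge_heights)
    gap, k = max((hs[i + 1] - hs[i], i) for i in range(len(hs) - 1))
    return (hs[k] + hs[k + 1]) / 2


# Made-up merge heights for a dendrogram over five points.
heights = [1.0, 1.0, 7.0, 9.0]
threshold = largest_gap_threshold(heights)
n_clusters = clusters_at_cut(heights, threshold)
```

With these made-up heights the widest gap lies between 1.0 and 7.0, so the cut falls at 4.0 and leaves three clusters.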

HAC supports several linkage criteria, which define the distance between two clusters: single linkage (the minimum distance between any pair of points, one from each cluster), complete linkage (the maximum such distance), and average linkage (the mean of all such distances). Each criterion affects the shape and size of the resulting clusters. HAC is particularly useful for exploratory data analysis, as it does not require a predetermined number of clusters and can reveal the underlying structure of the data.
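The three linkage criteria differ only in how they reduce the set of cross-cluster point distances to one number. A small sketch, assuming Euclidean distance and using hypothetical function names and made-up points:

```python
from math import dist  # Euclidean distance between two points


def pairwise(a, b):
    """All distances between a point in cluster a and a point in cluster b."""
    return [dist(p, q) for p in a for q in b]


def single_linkage(a, b):
    return min(pairwise(a, b))  # closest cross-cluster pair


def complete_linkage(a, b):
    return max(pairwise(a, b))  # farthest cross-cluster pair


def average_linkage(a, b):
    d = pairwise(a, b)
    return sum(d) / len(d)  # mean over all cross-cluster pairs


# Two made-up clusters on a line; cross-cluster distances are 3, 5, 2 and 4.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (5.0, 0.0)]

single = single_linkage(a, b)      # 2.0
complete = complete_linkage(a, b)  # 5.0
average = average_linkage(a, b)    # 3.5
```

The same two clusters are 2.0, 5.0, or 3.5 apart depending on the criterion, which is why the choice of linkage changes which clusters get merged first.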
