AI Glossary: What Is K-Medoids (KM)? Definition & Meaning

K-メドイド

K-Medoidsは一種の clustering algorithm that is used to partition a dataset into groups, or clusters, based on similarity. Unlike K-means, which uses centroids (the mean of the points in a cluster) to represent clusters, K-Medoids selects actual data points as the centers of these clusters, known as medoids. This approach makes K-Medoids more robust to noise and outliers in the data.

このアルゴリズムは、いくつかの重要なステップで構成されています：

初期化： Choose ‘k’ initial medoids randomly from the dataset.
割り当て： 割り当て：各データポイントを最も近いメドイドに割り当て、クラスターを作成します。
更新： For each cluster, find the data point that minimizes the total dissimilarity (often measured using distance metrics like Manhattan or ユークリッド距離) すべての他の点に対して。この点が新しいメドイドになります。
繰り返し： Repeat the assignment and update steps until the medoids no longer change or a specified number of iterations is reached.

K-Medoids is particularly useful in scenarios where the dataset is small to medium-sized and when the presence of outliers could skew results. It is widely applied in various fields, including marketing for customer segmentation, biology for species classification, and 画像処理パターン認識のために。

Overall, K-Medoids provides a more stable clustering option compared to K-Means, especially in datasets where outliers are present, as it relies on actual data points rather than calculated averages.