AI Glossary: What Is Biclustering? Definition & Meaning

バイクラスタリング

バイクラスタリング、またはコークラスタリングは、データ分析手法です used primarily in the fields of statistics and machine learning. It aims to discover patterns in data by simultaneously clustering both rows and columns of a data matrix. This approach is particularly useful in scenarios where the data is organized in a two-dimensional format, such as gene expression data or customer-item matrices.

The primary goal of biclustering is to find coherent subsets of data that have similar characteristics across both dimensions. For example, in a gene expression dataset, one might want to identify groups of genes that exhibit similar expression patterns across specific conditions. Similarly, in a market basket analysis, biclustering can help identify groups of customers who purchase similar items under certain conditions.

Biclustering algorithms can be broadly categorized into two types: those that optimize the 全体構造 of the data matrix and those that focus on local coherence within specific subsets. Some popular biclustering methods include:

スペクトルバイクラスタリング： Utilizes spectral methods to find biclusters by decomposing the data matrix into its 固有値と固有ベクトルを見つける。
ランダムウォークバイクラスタリング： グラフ上のランダムウォークを利用して、重複するバイクラスタを見つける。
グラフベースの手法： Leverage グラフ理論行と列の接続に基づいてバイクラスタを検出します。

Applications of biclustering span various domains including bioinformatics, market research, and social network analysis. It provides a powerful tool for uncovering hidden patterns and relationships in complex datasets, making it a valuable technique in data mining and 探索的データ分析.