AI Glossary: What Is Fold Cross-Validation? Definition & Meaning

Fold-Kreuzvalidierung is a robust method im maschinellen Lernen and statistics to evaluate a model’s performance and ensure its generalizability. The technique involves dividing a dataset into several subsets, or ‘folds.’ Typically, the dataset is split into ‘k’ equal folds. The model is then trained on ‘k-1’ folds and tested on the remaining fold. This process is repeated ‘k’ times, with each fold being used as the test set exactly once.

Der Hauptzweck von Fold Cross-Validation ist es, zu beurteilen, wie gut die statistische Analyse performs on an independent dataset. By averaging the results from each of the ‘k’ iterations, practitioners can obtain a more reliable estimate of the model’s predictive performance. This method is particularly effective in preventing issues such as overfitting, where a model performs well on the training data but poorly on unseen data.

Eine der häufigsten Varianten ist k-fache Kreuzvalidierung, where ‘k’ is typically chosen as 5 or 10. This choice balances the trade-off between bias and variance; fewer folds can lead to higher bias, while more folds can increase the variance of the performance estimate. Another variant is stratifizierte k-fache Kreuzvalidierung, which ensures that each fold maintains the same proportion of different classes as the entire dataset, making it especially useful for unausgewogene Datensätze.

In summary, Fold Cross-Validation is a critical technique for evaluating machine learning models, providing insights into their performance and robustness gegen Overfitting.