AI Glossary: What Is Cross-Validation Fold (CV Fold)? Definition & Meaning

Cross-validation is a statistical method used to evaluate the performance of machine learning models. It involves partitioning a dataset into several subsets, known as ‘folds.’ A cross-validation fold is one of these subsets. The main goal of using folds is to ensure that the model is evaluated on different portions of the dataset, which helps in understanding how well it generalizes to unseen data.

Typically, the process of cross-validation involves the following steps: First, the complete dataset is divided into ‘k’ folds of equal (or nearly equal) size. For each iteration, one fold is reserved for testing, while the remaining ‘k-1’ folds are used for training the model. This process is repeated ‘k’ times, with each fold being used as the test set exactly once. At the end of the procedure, the performance metrics (such as accuracy, precision, and recall) from each iteration can be averaged to give an overall performance measure for the model.
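The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `k_fold_indices` and `cross_validate` are hypothetical, and the "model" is assumed to be a trivial majority-class baseline so that the example stays self-contained.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(labels, k):
    """Score a majority-class baseline (an assumed stand-in model) with k-fold CV."""
    scores = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        # "Training": remember the most frequent label in the k-1 training folds.
        majority = max(set(train_labels), key=train_labels.count)
        # "Testing": accuracy of always predicting that label on the held-out fold.
        correct = sum(1 for i in test_idx if labels[i] == majority)
        scores.append(correct / len(test_idx))
    # Per-fold scores plus their average, the overall performance measure.
    return scores, mean(scores)


labels = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
fold_scores, average = cross_validate(labels, k=5)
print(fold_scores, average)
```

In practice the data is usually shuffled before splitting, and a library routine such as scikit-learn's `KFold` would typically replace the hand-written splitter.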

Common types of cross-validation include k-fold cross-validation, where ‘k’ can be any integer from 2 up to the number of samples (commonly 5 or 10), and stratified k-fold cross-validation, which maintains the distribution of target classes in each fold, ensuring a more representative sample.
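To make the idea of stratification concrete, here is a rough sketch that deals the samples of each class across the folds in round-robin order, so every fold ends up with roughly the same class proportions as the full dataset. The helper name `stratified_k_fold_indices` is hypothetical, and this simple dealing scheme is only one possible way to stratify.

```python
from collections import Counter, defaultdict


def stratified_k_fold_indices(labels, k):
    """Assign sample indices to k folds while preserving class proportions."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples across the folds one at a time.
        for position, i in enumerate(indices):
            folds[position % k].append(i)
    return [sorted(fold) for fold in folds]


# Imbalanced toy data: eight samples of class 0, four of class 1.
labels = [0] * 8 + [1] * 4
for fold in stratified_k_fold_indices(labels, k=4):
    print(fold, Counter(labels[i] for i in fold))
```

With this toy data each fold holds two samples of class 0 and one of class 1, matching the 2:1 ratio of the whole dataset. When class counts are not divisible by k, this sketch can leave fold sizes slightly uneven; scikit-learn's `StratifiedKFold` is the usual library choice for real work.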

Using cross-validation folds helps detect issues like overfitting, as the model is validated on multiple subsets of data rather than just a single train-test split. This method provides a more reliable estimate of a model’s performance and is a standard practice in machine learning model evaluation.