Underfitting is a common problem in machine learning and statistical modeling in which a model is too simple to represent the data it is trained on. It occurs when the model lacks the complexity needed to capture the relationships and patterns in the dataset.
For example, if you fit a linear model to data that actually follows a complex, non-linear trend, the model cannot learn the true structure of the data. The result is poor performance on both the training dataset and new, unseen data. An underfitted model has high bias and low variance: its errors are systematic, so its predictions are consistently off rather than erratic.
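As a rough illustration, the sketch below fits a straight line to data generated from a quadratic trend. The toy dataset (noise-free `y = x^2`) and the helper names `fit_line` and `mse` are assumptions made for this example, not part of any particular library.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(xs, ys, predict):
    """Mean squared error of predict over paired samples."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed toy data: a noise-free quadratic trend, y = x^2.
train_x = [i / 2 for i in range(-6, 7)]        # -3.0, -2.5, ..., 3.0
test_x = [i / 2 + 0.25 for i in range(-6, 6)]  # unseen points in between
train_y = [x ** 2 for x in train_x]
test_y = [x ** 2 for x in test_x]

slope, intercept = fit_line(train_x, train_y)


def line(x):
    """Prediction of the fitted straight line."""
    return slope * x + intercept


train_mse = mse(train_x, train_y, line)
test_mse = mse(test_x, test_y, line)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"train MSE={train_mse:.3f} test MSE={test_mse:.3f}")
```

Because the data are symmetric around zero, the best straight line is essentially flat, and the error stays large on the training points as well as on the unseen ones. That pairing of poor training and poor test error is the signature of underfitting.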
Underfitting can occur for several reasons:
- Insufficient Model Complexity: Using a model that is not complex enough, such as a linear regression model for a non-linear problem.
- Inadequate Features: Omitting relevant features can prevent the model from capturing essential patterns.
- Excessive Regularization: Applying too much regularization can overly constrain the model, making it too simple.
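The effect of excessive regularization can be sketched with a one-weight ridge regression, where the L2 penalty has a simple closed form. The data (`y = 2x`), the penalty values, and the helper names `fit_ridge_1d` and `training_mse` are assumptions chosen for illustration.

```python
def fit_ridge_1d(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2 for a single weight w (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def training_mse(xs, ys, w):
    """Mean squared error of the model y = w * x on the training data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed toy data: an exact linear relationship with true weight 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x for x in xs]

results = {}
for lam in (0.0, 10.0, 1000.0):  # hypothetical penalty strengths
    w = fit_ridge_1d(xs, ys, lam)
    results[lam] = (w, training_mse(xs, ys, w))
    print(f"lambda={lam:>7.1f}  w={w:.3f}  train MSE={results[lam][1]:.3f}")
```

As the penalty grows, the fitted weight shrinks toward zero and the training error rises, even though the underlying relationship is perfectly linear. The model has been constrained into being too simple for the data.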
To address underfitting, increase the model’s complexity by selecting a more expressive algorithm, adding relevant features, or reducing regularization. Diagnosing underfitting typically involves evaluating the model’s performance metrics, such as accuracy, precision, or loss, on both the training and validation datasets. If the model performs poorly on both, underfitting is likely the cause.
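One way to put this diagnosis into practice is to compare training and validation error against an error budget, then check whether adding a relevant feature helps. In the sketch below, the `looks_underfit` heuristic, its `tolerance` value, and the toy quadratic data are all assumptions. A real threshold depends on the task and the metric.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(features, ys, slope, intercept):
    """Mean squared error of the line slope * feature + intercept."""
    return sum((slope * f + intercept - y) ** 2
               for f, y in zip(features, ys)) / len(ys)


def looks_underfit(train_error, val_error, tolerance):
    """Hypothetical heuristic: poor error on BOTH splits suggests underfitting."""
    return train_error > tolerance and val_error > tolerance


def evaluate(feature):
    """Fit on the training split with the given feature map, score both splits."""
    train_f = [feature(x) for x in train_x]
    val_f = [feature(x) for x in val_x]
    slope, intercept = fit_line(train_f, train_y)
    return (mse(train_f, train_y, slope, intercept),
            mse(val_f, val_y, slope, intercept))


# Assumed toy data: a noise-free quadratic trend, y = x^2.
train_x = [i / 2 for i in range(-6, 7)]
val_x = [i / 2 + 0.25 for i in range(-6, 6)]
train_y = [x ** 2 for x in train_x]
val_y = [x ** 2 for x in val_x]

TOLERANCE = 1.0  # hypothetical error budget for this toy problem

raw_train, raw_val = evaluate(lambda x: x)       # original feature only
sq_train, sq_val = evaluate(lambda x: x ** 2)    # added squared feature

raw_flag = looks_underfit(raw_train, raw_val, TOLERANCE)
sq_flag = looks_underfit(sq_train, sq_val, TOLERANCE)
print(f"raw feature:     train={raw_train:.3f} val={raw_val:.3f} underfit={raw_flag}")
print(f"squared feature: train={sq_train:.3f} val={sq_val:.3f} underfit={sq_flag}")
```

With only the raw feature, both errors exceed the budget and the check flags underfitting. Once the squared feature is supplied, the same linear fitting procedure matches the data on both splits, which illustrates how adding a relevant feature can resolve the problem without changing the algorithm.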