AI Glossary: What Is Underfitting? Definition & Meaning

Underfitting is a common issue in machine learning and statistical modeling where a model is too simplistic to accurately represent the data it is trying to learn from. This happens when the model has insufficient complexity, meaning it cannot capture the relationships and patterns inherent in the dataset.

For instance, if you attempt to fit a linear model to data that actually follows a complex, non-linear trend, the model will fail to learn the true structure of the data. This results in poor performance both on the training dataset and on new, unseen data. Essentially, an underfitted model has high bias and low variance, leading to generalized errors and inadequate predictions.

There are several reasons why underfitting might occur:

Insufficient Model Complexity: Using a model that is not complex enough, such as a linear regression model for a non-linear problem.
Inadequate Features: Not including enough relevant features in the model can prevent it from capturing essential patterns.
Excessive Regularization: Applying too much regularization can overly constrain the model, making it too simple.

To address underfitting, one can try to increase the complexity of the model by selecting a more sophisticated algorithm, adding relevant features, or reducing the regularization. Diagnosing underfitting typically involves evaluating the model’s performance metrics, such as accuracy, precision, or loss, on both the training and validation datasets. If the model performs poorly on both, underfitting is likely the cause.