Ridge Regression

RR

Ridge regression is a technique that improves on linear regression by adding a penalty on large coefficients.

What is Ridge Regression?

Ridge regression, also known as Tikhonov regularization, is a type of linear regression that includes a regularization term to prevent overfitting. The technique is particularly useful under multicollinearity, where independent variables are highly correlated and ordinary least squares coefficient estimates become unstable.

In standard linear regression, the goal is to minimize the sum of the squared differences between the observed and predicted values. However, when the model is too complex or has many predictors, this can lead to overfitting, where the model performs well on the training data but poorly on unseen data.

Ridge regression addresses this problem by adding a penalty term to the loss function that is proportional to the sum of the squared coefficients. The modified loss function can be expressed as:

Loss = Residual Sum of Squares + λ * (Sum of Squared Coefficients)
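The loss above can be written out directly. The following is a minimal sketch in plain Python, assuming a model with no intercept (in practice the intercept is usually left unpenalized); the function name `ridge_loss` and the toy data are made up for illustration.

```python
def ridge_loss(X, y, coefs, lam):
    """Residual sum of squares plus lam times the sum of squared coefficients."""
    rss = 0.0
    for row, target in zip(X, y):
        # Linear prediction without an intercept (an assumption of this sketch).
        prediction = sum(b * x for b, x in zip(coefs, row))
        rss += (target - prediction) ** 2
    penalty = lam * sum(b ** 2 for b in coefs)
    return rss + penalty


# Toy data, invented for illustration: three observations, two predictors.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
y = [5.0, 4.0, 9.0]
coefs = [1.0, 2.0]  # these coefficients fit the toy data exactly

print(ridge_loss(X, y, coefs, lam=0.0))  # 0.0: no residuals, no penalty
print(ridge_loss(X, y, coefs, lam=0.5))  # 2.5: only the penalty 0.5 * (1 + 4)
```

Even a perfect fit incurs a cost once λ is positive, which is what pushes the minimizer toward smaller coefficients.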

Here, λ (lambda) is a tuning parameter that controls the strength of the penalty. A larger value of λ increases the penalty on the coefficients, leading to smaller coefficient values; with λ = 0 the loss reduces to ordinary least squares. Shrinking the coefficients reduces the model's effective complexity, which helps it generalize.
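The effect of λ is easiest to see in the simplest case. For a single predictor and no intercept (an assumption made here to keep the algebra short), setting the derivative of the ridge loss to zero gives the coefficient sum(x*y) / (sum(x^2) + λ). The sketch below uses that formula on invented data; `ridge_coefficient` is a hypothetical helper name.

```python
def ridge_coefficient(x, y, lam):
    """Ridge solution for one predictor, no intercept: sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Invented data lying roughly on the line y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# The coefficient shrinks toward zero as lam grows, but never reaches it.
for lam in [0.0, 1.0, 10.0, 100.0]:
    print(lam, round(ridge_coefficient(x, y, lam), 4))
```

With λ = 0 this is the ordinary least squares estimate; every increase in λ enlarges the denominator and pulls the coefficient closer to zero.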

Ridge regression is particularly effective when there are many predictors and relatively few observations, where it often produces a model that performs better on test data than ordinary linear regression. Note that while ridge regression shrinks coefficients, it does not perform variable selection (i.e., it does not set any coefficients exactly to zero). This is where techniques like Lasso regression, which can perform variable selection, come in handy.
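The difference between the two penalties can be illustrated in the one-predictor, no-intercept case, where both have closed-form solutions. The sketch below assumes the Lasso loss is written as the residual sum of squares plus λ times the absolute value of the coefficient, so the Lasso solution soft-thresholds sum(x*y) by λ / 2. The helper names and data are invented for illustration.

```python
def ridge_coef(sxy, sxx, lam):
    """Minimizer of RSS + lam * b**2 for one predictor, no intercept."""
    return sxy / (sxx + lam)


def lasso_coef(sxy, sxx, lam):
    """Minimizer of RSS + lam * |b|: soft-threshold sxy by lam / 2, then scale."""
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * shrunk / sxx


# Invented data in which x carries almost no information about y.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.3, -0.2, 0.1, 0.0]
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

print("ridge:", ridge_coef(sxy, sxx, lam=1.0))  # small but nonzero
print("lasso:", lasso_coef(sxy, sxx, lam=1.0))  # exactly zero
```

Ridge divides by a larger denominator, so a nonzero coefficient stays nonzero; the Lasso subtracts a fixed amount and clips at zero, which is what removes weak predictors from the model.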

Overall, ridge regression is a powerful tool in the machine learning toolbox, helping to create robust predictive models by balancing the trade-off between fitting the training data and maintaining model simplicity.
