Types of Linear Regression
There are two main types of linear regression, simple and multiple, along with two regularized variants, Lasso and Ridge. Let's take a look at them.
Simple linear regression is used when there is only one independent variable, for example, predicting a person's weight based on their height.
This is the simplest form of linear regression, and it involves only one independent variable and one dependent variable. The equation for simple linear regression is:

$$y = \beta_0 + \beta_1 x$$

where:
$y$ is the dependent variable
$x$ is the independent variable
$\beta_0$ is the intercept
$\beta_1$ is the slope
Example:
Predicting house price based on house size.
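As a quick illustration of this equation, here is a minimal sketch using scikit-learn's LinearRegression. The house sizes and prices are made-up values chosen only to show the fit; $\beta_0$ and $\beta_1$ come back as the model's intercept_ and coef_.

```python
# A minimal sketch of simple linear regression with scikit-learn.
# The house sizes and prices below are made-up illustrative numbers.
import numpy as np
from sklearn.linear_model import LinearRegression

# One independent variable (house size in sq ft), one dependent variable (price).
X = np.array([[750], [900], [1100], [1400], [1800]])   # shape (n_samples, 1)
y = np.array([150_000, 175_000, 210_000, 260_000, 330_000])

model = LinearRegression()
model.fit(X, y)

print("Intercept (beta_0):", model.intercept_)
print("Slope (beta_1):", model.coef_[0])
print("Predicted price for 1200 sq ft:", model.predict([[1200]])[0])
```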
Multiple linear regression involves two or more independent variables, such as predicting house prices based on factors like area, number of rooms, and location.
This involves more than one independent variable and one dependent variable. The equation for multiple linear regression is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$

where:
$y$ is the dependent variable
$x_1, x_2, \dots, x_n$ are the independent variables
$\beta_0$ is the intercept
$\beta_1, \beta_2, \dots, \beta_n$ are the slopes
Example:
Predicting a car's MPG (miles per gallon) based on horsepower and weight.
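A similar sketch works for the multiple case. The horsepower, weight, and MPG values below are illustrative rather than real measurements; the model simply learns one slope per feature plus an intercept.

```python
# A minimal sketch of multiple linear regression with two features.
import numpy as np
from sklearn.linear_model import LinearRegression

# Two independent variables per row: [horsepower, weight in lbs].
X = np.array([
    [130, 3504],
    [165, 3693],
    [150, 3436],
    [95,  2372],
    [97,  2130],
])
y = np.array([18.0, 15.0, 16.0, 24.0, 29.0])   # MPG

model = LinearRegression()
model.fit(X, y)

print("Intercept (beta_0):", model.intercept_)
print("Slopes (beta_1, beta_2):", model.coef_)
print("Predicted MPG for 120 hp, 2800 lbs:", model.predict([[120, 2800]])[0])
```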
Techniques like Lasso (Least Absolute Shrinkage and Selection Operator) and Ridge regression add a penalty for large coefficients, reducing overfitting and improving model generalization.
Regularization introduces a penalty term to the linear regression cost function, helping prevent overfitting by constraining large coefficients.
In Lasso regression, the penalty added is proportional to the absolute value of each coefficient. The cost function to minimize is:

$$J(\beta) = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{n} |\beta_j|$$

where:
• $J(\beta)$ is the cost function,
• $y_i$ is the actual value,
• $\hat{y}_i$ is the predicted value,
• $\lambda$ is the regularization parameter controlling the penalty’s strength,
• $\beta_j$ represents the coefficients of each feature.
The L1-norm penalty often results in some coefficients being reduced to zero, effectively performing feature selection.
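The sketch below shows this effect on synthetic data where only two of five features actually matter; alpha is scikit-learn's name for the regularization strength $\lambda$, and 0.1 is an arbitrary choice for illustration.

```python
# A minimal sketch of Lasso regression showing how the L1 penalty can
# drive the coefficients of uninformative features to zero. Data is synthetic.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # 5 features
# Only the first two features contribute to this synthetic target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1)   # alpha plays the role of lambda in the cost function
lasso.fit(X, y)

print("Coefficients:", lasso.coef_)      # uninformative features shrink to zero
```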
Ridge regression uses an L2-norm penalty, where the sum of the squared coefficients is added to the cost function:

$$J(\beta) = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{n} \beta_j^2$$

Here:
• $\lambda \sum_{j=1}^{n} \beta_j^2$ is the L2-norm penalty, which penalizes large coefficients more smoothly than Lasso.
This penalty tends to reduce the impact of all coefficients but doesn’t force any to zero, making Ridge regression useful when all features are considered potentially valuable.
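For comparison, here is the same kind of synthetic data fit with Ridge; alpha = 1.0 is again an arbitrary penalty strength, and in contrast to Lasso all five coefficients typically stay nonzero, just shrunk toward zero.

```python
# A minimal sketch of Ridge regression on synthetic data.
# The L2 penalty shrinks all coefficients but does not force any to zero.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0)   # alpha controls the strength of the L2 penalty
ridge.fit(X, y)

print("Coefficients:", ridge.coef_)      # all shrunk, but typically none exactly zero
```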
These formulas are essential for understanding how regularized linear regression adjusts the model’s complexity and enhances its ability to generalize across diverse datasets.