Mathematical Explanation of How Linear Regression Works
Suppose we are given the following dataset:

Experience (X): 2, 6, 5, 7
Salary (y) (in lakhs): 3, 10, 4, 3
Given is a Salary vs Experience dataset of a company, and the task is to predict the salary of an employee based on his or her work experience.
This article aims to explain what actually happens mathematically when Linear Regression is trained, even when we simply call a pre-defined function to perform the prediction task.
Let us explore, step by step, what happens when the Linear Regression algorithm is trained.
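To make the "pre-defined function" concrete, here is a minimal sketch of fitting this dataset with a library implementation; scikit-learn is assumed here, and the variable names are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Dataset from the table above (Experience in years, Salary in lakhs)
X = np.array([[2], [6], [5], [7]])
y = np.array([3, 10, 4, 3])

# The pre-defined function fits the intercept (theta0) and slope (theta1) for us
model = LinearRegression().fit(X, y)
print("theta0 (intercept):", model.intercept_)
print("theta1 (slope):", model.coef_[0])

# Predict the salary for a hypothetical employee with 4 years of experience
print("prediction for x = 4:", model.predict([[4]])[0])
```

Note that scikit-learn's LinearRegression solves for θ in closed form (ordinary least squares); the rest of this article instead walks through the gradient-descent view of how θ0 and θ1 can be learned iteratively.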
Iteration 1 – At the start, the values of θ0 and θ1 are chosen randomly. Let us suppose θ0 = 0 and θ1 = 0.
Predicted values after iteration 1 with the Linear Regression hypothesis (x_0 = 1 is the bias term):

h_\theta = \begin{bmatrix} x_0 & x_1 \\ x_0 & x_2 \\ x_0 & x_3 \\ x_0 & x_4 \end{bmatrix} \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 1 & 6 \\ 1 & 5 \\ 1 & 7 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
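The iteration-1 predictions can be checked with a short NumPy sketch (a minimal illustration; the array names are not from the article):

```python
import numpy as np

# Design matrix with the bias column x0 = 1 prepended to each experience value
X = np.array([[1, 2],
              [1, 6],
              [1, 5],
              [1, 7]], dtype=float)
theta = np.array([0.0, 0.0])   # theta0 = 0, theta1 = 0 at iteration 1

h = X @ theta                   # h_theta(x) = theta0 * x0 + theta1 * x1
print(h)                        # [0. 0. 0. 0.]
```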
Cost Function – Error
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left[ h_\theta(x_i) - y_i \right]^2

= \frac{1}{2 \times 4} \left[ (0-3)^2 + (0-10)^2 + (0-4)^2 + (0-3)^2 \right]

= \frac{1}{8} \left[ 9 + 100 + 16 + 9 \right]

= 16.75
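The same cost value, computed with NumPy under the assumptions above:

```python
import numpy as np

X = np.array([[1, 2], [1, 6], [1, 5], [1, 7]], dtype=float)
y = np.array([3, 10, 4, 3], dtype=float)
theta = np.array([0.0, 0.0])
m = len(y)

# J(theta) = 1/(2m) * sum((h_theta(x_i) - y_i)^2)
J = np.sum((X @ theta - y) ** 2) / (2 * m)
print(J)   # 16.75
```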
Gradient Descent – Updating θ0 value
Here, j = 0 and x_i^{(0)} = 1 for every sample, with learning rate \alpha = 0.001:

\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left[ \left( h_\theta(x_i) - y_i \right) x_i^{(j)} \right]

\theta_0 := 0 - \frac{0.001}{4} \left[ (0-3) + (0-10) + (0-4) + (0-3) \right]

= -\frac{0.001}{4} \left[ -3 + (-10) + (-4) + (-3) \right]

= -\frac{0.001}{4} \left[ -20 \right]

= 0.005
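Since x_i^{(0)} = 1, the θ0 update reduces to the mean prediction error scaled by α; a short check (variable names assumed):

```python
import numpy as np

y = np.array([3, 10, 4, 3], dtype=float)
h = np.zeros(4)            # all iteration-1 predictions are 0
alpha, m = 0.001, 4

# theta0 := theta0 - (alpha / m) * sum(h - y), because x_i^(0) = 1
theta0 = 0 - alpha * np.sum(h - y) / m
print(theta0)              # 0.005
```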
Gradient Descent – Updating θ1 value
Here, j = 1 and x_i^{(1)} = x_i:

\theta_1 := 0 - \frac{0.001}{4} \left[ (0-3) \cdot 2 + (0-10) \cdot 6 + (0-4) \cdot 5 + (0-3) \cdot 7 \right]

= -\frac{0.001}{4} \left[ -6 + (-60) + (-20) + (-21) \right]

= -\frac{0.001}{4} \left[ -107 \right]

= 0.02675
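Both updates can be written as one vectorized gradient step; a sketch that reproduces the two numbers above:

```python
import numpy as np

X = np.array([[1, 2], [1, 6], [1, 5], [1, 7]], dtype=float)
y = np.array([3, 10, 4, 3], dtype=float)
theta = np.array([0.0, 0.0])
alpha, m = 0.001, len(y)

# theta_j := theta_j - (alpha / m) * sum((h - y) * x^(j)), done for both j at once
h = X @ theta
theta = theta - alpha * (X.T @ (h - y)) / m
print(theta)    # [0.005   0.02675]
```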
Iteration 2 – θ0 = 0.005 and θ1 = 0.02675 (rounded to 0.026 below).
Predicted values after iteration 2 with the Linear Regression hypothesis:

h_\theta = \begin{bmatrix} 1 & 2 \\ 1 & 6 \\ 1 & 5 \\ 1 & 7 \end{bmatrix} \begin{bmatrix} 0.005 \\ 0.026 \end{bmatrix} = \begin{bmatrix} 0.057 \\ 0.161 \\ 0.135 \\ 0.187 \end{bmatrix}
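Repeating the same update step drives the cost down iteration by iteration; a compact sketch of the full loop under the same assumptions (dataset above, α = 0.001):

```python
import numpy as np

X = np.array([[1, 2], [1, 6], [1, 5], [1, 7]], dtype=float)
y = np.array([3, 10, 4, 3], dtype=float)
theta = np.array([0.0, 0.0])
alpha, m = 0.001, len(y)

def cost(theta):
    """J(theta) = 1/(2m) * sum((X @ theta - y)^2)"""
    return np.sum((X @ theta - y) ** 2) / (2 * m)

print("initial cost:", cost(theta))                   # 16.75
for i in range(1, 4):                                  # run a few iterations, as in the article
    h = X @ theta                                      # predictions with current theta
    theta = theta - alpha * (X.T @ (h - y)) / m        # simultaneous update of theta0, theta1
    print(f"iteration {i}: theta = {theta}, J(theta) = {cost(theta):.4f}")
```

After iteration 1 this prints θ ≈ [0.005, 0.02675], and the iteration-2 predictions X @ θ match the values shown above (with θ1 rounded to 0.026).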