Machine Learning: Linear Regression
Linear regression attempts to model the relationship between two
variables by fitting a linear equation to observed data.
One variable is
considered to be an explanatory variable, and the other is considered
to be a dependent variable.
Dependent variable – the variable whose values we want to explain or
forecast.
Independent or explanatory variable – the variable that explains the other
variable. Its values do not depend on the other variable.
The dependent variable can be denoted as y, so imagine a child always
asking "y" (why) he is dependent on his parents.
And then you can imagine the x as your ex-boyfriend/girlfriend, who is
independent because they don't need or depend on you. A good way to
remember it.
Anyways
Linear regression is used for 2 applications:
To establish whether there is a relationship between 2 variables, or to see
whether there is a statistically significant relationship between them:
• To see how an increase in sin tax affects how many cigarette packs
are consumed
• Sleep hours vs test scores
• Experience vs Salary
• Pokemon vs Urban Density
• House floor area vs House price
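As a rough illustration of checking whether two variables move together, the sketch below computes the Pearson correlation coefficient for the sleep hours vs test scores example. The numbers are made up purely for illustration, and correlation is only one simple way to gauge a linear relationship; it is not a full significance test.

```python
import math

# Hypothetical data, invented purely for illustration.
sleep_hours = [4, 5, 6, 7, 8, 9]
test_scores = [52, 58, 65, 70, 74, 81]


def pearson_r(xs, ys):
    """Pearson correlation: +1 is a perfect upward line,
    -1 a perfect downward line, 0 means no linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y vary together, and how much each varies alone.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(sleep_hours, test_scores)
print(f"correlation between sleep hours and test score: {r:.3f}")
```

A value close to +1 or -1 suggests a strong linear relationship worth modelling with a regression line, while a value near 0 suggests a straight line will not explain much.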
To forecast new observations – we can use what we already know to forecast
unobserved values.
Here are some other examples of ways that linear regression can be
applied.
• The sales or ROI of fidget spinners over time
• Stock price over time
• Predict price of Bitcoin over time.
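Forecasting with a fitted line can be sketched like this. The monthly sales figures are hypothetical, the helper name fit_line is made up for this example, and the fit uses the standard least-squares formulas for a straight line. The forecast also assumes the trend stays linear beyond the observed data.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical monthly sales, invented for illustration.
months = [1, 2, 3, 4, 5]
sales = [102, 118, 141, 159, 180]

b0, b1 = fit_line(months, sales)

# Forecast an unobserved value: month 6.
forecast_month_6 = b0 + b1 * 6
print(f"intercept={b0:.2f}, slope={b1:.2f}, forecast={forecast_month_6:.1f}")
```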
The line fitted by linear regression is also known as the line of best fit.
The line of best fit can be represented by the linear equation y = a +
bx, or y = mx + b, or y = b0 + b1x. These are the same equation written with
different letters.
You most likely learnt this in school.
Using the form y = mx + b: b is the intercept. If you change it, the line
moves up or down along the y axis.
m is your slope or gradient. If you change it, the line rotates
about the intercept.
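A small sketch of how the intercept and slope change the line, using the y = mx + b form. The function name line and the sample values are just for illustration.

```python
def line(x, m, b):
    """Return y for the straight line y = m*x + b."""
    return m * x + b


xs = [0, 1, 2, 3]

base = [line(x, m=2, b=1) for x in xs]
shifted = [line(x, m=2, b=4) for x in xs]   # same slope, larger intercept
steeper = [line(x, m=3, b=1) for x in xs]   # same intercept, larger slope

print("base:   ", base)
print("shifted:", shifted)
print("steeper:", steeper)
```

Raising b moves every point up by the same amount, while raising m leaves the point at x = 0 where it was and tilts the rest of the line upwards.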
The data is actually a series of x and y observations, as shown on this
scatter plot. The points do not fall exactly on a straight line, but they do
follow a linear pattern, hence the term linear regression.
Assuming we already have the best fit line, we can calculate the error
term epsilon, also known as the residual, for each point: the difference
between the observed y and the y predicted by the line. This is the term that
we would like to minimize across all the points in the data series.
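To make the residual concrete, here is a minimal sketch that computes the residuals for two candidate lines and compares their totals. The data points are hypothetical, and summing the squared residuals is an assumption on my part: it is the usual least-squares way of measuring total error, but other measures exist.

```python
# Hypothetical observations, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def residuals(xs, ys, b0, b1):
    """Observed y minus the y predicted by the line b0 + b1 * x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sum_squared_residuals(xs, ys, b0, b1):
    """Total error of a candidate line: squaring stops positive and
    negative residuals from cancelling each other out."""
    return sum(e ** 2 for e in residuals(xs, ys, b0, b1))


good = sum_squared_residuals(xs, ys, b0=0.0, b1=2.0)   # close to the data
poor = sum_squared_residuals(xs, ys, b0=1.0, b1=1.0)   # too flat

print(f"good line error: {good:.2f}")
print(f"poor line error: {poor:.2f}")
```

The line with the smaller total is the better fit, and the line of best fit is the one where this total is as small as it can be.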
In statistical notation, the residual fits into our linear equation as
y = b0 + b1x + e, where e is the residual.
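The sketch below shows that each observation really is the fitted value plus its residual. It uses hypothetical data and the standard least-squares formulas for b0 and b1, which the post does not spell out.

```python
# Hypothetical observations, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares estimates of the slope (b1) and intercept (b0).
b1 = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
b0 = mean_y - b1 * mean_x

# e is whatever the line fails to explain at each point.
e = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Putting the pieces back together recovers every observed y.
rebuilt = [b0 + b1 * x + err for x, err in zip(xs, e)]

print(f"b0={b0:.2f}, b1={b1:.2f}")
print("residuals:", [round(err, 2) for err in e])
```

With a least-squares fit that includes an intercept, the residuals add up to zero (up to rounding), so the line sits in the middle of the data rather than above or below it.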