Linear regression models the scores of one variable as a function of the scores of a second variable. The variable being predicted is denoted Y; this criterion variable is also referred to as the response, dependent, or outcome variable. The variable on which the prediction is based is denoted X; this predictor variable is also referred to as the explanatory or independent variable. When the criterion variable is predicted from only one predictor variable, the model is called simple linear regression.
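In symbols, the simple linear regression prediction can be written with an intercept and a slope (the labels a and b below are the conventional choices, not terms taken from the passage above):

```latex
\hat{Y} = a + bX
```

where \hat{Y} is the predicted criterion score, a is the intercept, and b is the slope of the regression line.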
Regression line: Linear regression involves finding the best-fitting straight line through the points when Y is plotted as a function of X; this best-fitting line is called the regression line. The vertical distances from the points to the best-fitting line are the errors of prediction. Hence, the closer a point lies to the regression line, the smaller its error of prediction.
The error of prediction for a point is calculated by taking the actual value of the point and subtracting the predicted value (the value that falls on the line). There is no single way to define the best-fitting line, but the most widely used criterion is to choose the straight line that minimizes the sum of the squared errors of prediction (the least-squares line).
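As a minimal sketch of this idea, the snippet below fits a least-squares line with NumPy and computes the errors of prediction; the data values are made up purely for illustration:

```python
# A minimal sketch of simple linear regression by least squares.
# The data values below are hypothetical, used only to illustrate the calculation.
import numpy as np

# Predictor X and criterion Y (illustrative values)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Slope (b) and intercept (a) that minimize the sum of squared errors
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()

# Predicted values: the points that fall on the regression line
Y_pred = a + b * X

# Errors of prediction: actual value minus predicted value
errors = Y - Y_pred

# The least-squares line is the one that makes this sum as small as possible
sse = np.sum(errors ** 2)

print(f"slope={b:.3f}, intercept={a:.3f}, sum of squared errors={sse:.3f}")
```

Points lying close to the fitted line produce small entries in `errors`, which is the sense in which a closer point means a smaller error of prediction.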
Tip to remember: Linear regression is concerned with statistical relationships, in which the relation between the two variables is not perfect, rather than deterministic relationships, in which one variable determines the other exactly.
Some examples of statistical relationships are:
Height and weight: with an increase in height one can generally expect an increase in weight. However, the amount of weight gained cannot be predicted exactly from the increase in height.
Poor diet and health: over the years, as a diet grows poorer, one can expect health to decline. However, the size of the effect cannot be predicted exactly.