The simple linear regression model
We consider modelling the relationship between a dependent variable and a single independent variable. When
there is only one independent variable in the linear regression model, the model is generally termed a simple
linear regression model. When there is more than one independent variable in the model, the linear model
is termed a multiple linear regression model.
The linear model
Consider a simple linear regression model
$$y = \beta_0 + \beta_1 X + \varepsilon$$
where $y$ is termed the dependent or study variable and $X$ is termed the independent or explanatory
variable. The terms $\beta_0$ and $\beta_1$ are the parameters of the model. The parameter $\beta_0$ is termed
the intercept term, and the parameter $\beta_1$ is termed the slope parameter. These parameters are usually
called regression coefficients. The unobservable error component $\varepsilon$ accounts for the failure of the
data to lie on the straight line and represents the difference between the true and observed realization of $y$.
There can be several reasons for such a difference, e.g., the effect of all the variables omitted from the model,
variables that are qualitative, inherent randomness in the observations, etc. We assume that $\varepsilon$ is an
independent and identically distributed random variable with mean zero and constant variance $\sigma^2$. Later,
we will additionally assume that $\varepsilon$ is normally distributed.
The independent variable is viewed as controlled by the experimenter, so it is considered non-stochastic,
whereas $y$ is viewed as a random variable with
$$E(y) = \beta_0 + \beta_1 X$$
and
$$\mathrm{Var}(y) = \sigma^2.$$
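The two moments above can be checked numerically. The following is a minimal sketch, assuming hypothetical parameter values (`beta0`, `beta1`, `sigma`) and a single fixed value of X chosen only for illustration. The errors are drawn from a normal distribution purely for convenience; the model at this stage only requires mean zero and constant variance.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = np.random.default_rng(0)
x = 4.0            # one fixed (non-stochastic) value of X
n_draws = 200_000  # number of simulated realizations of y at this x

# Errors with mean 0 and variance sigma^2 (normal only for convenience).
eps = rng.normal(loc=0.0, scale=sigma, size=n_draws)
y = beta0 + beta1 * x + eps

mean_y = y.mean()  # should be close to beta0 + beta1 * x
var_y = y.var()    # should be close to sigma^2
print(mean_y, var_y)
```

With these assumed values, `mean_y` should be near 4.0 and `var_y` near 1.0, up to simulation noise.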
Sometimes $X$ can also be a random variable. In such a case, instead of the sample mean and sample
variance of $y$, we consider the conditional mean of $y$ given $X = x$ as
$$E(y \mid x) = \beta_0 + \beta_1 x.$$
When the values of $\beta_0$, $\beta_1$ and $\sigma^2$ are known, the model is completely described. The
parameters $\beta_0$, $\beta_1$ and $\sigma^2$ are generally unknown in practice, and $\varepsilon$ is unobserved.
The determination of the statistical model $y = \beta_0 + \beta_1 X + \varepsilon$ depends on the determination
(i.e., estimation) of $\beta_0$, $\beta_1$ and $\sigma^2$. In order to know the values of these parameters, $n$ pairs
of observations $(x_i, y_i)$ $(i = 1, \ldots, n)$ on $(X, y)$ are observed/collected and are used to determine
these unknown parameters.
Various methods of estimation can be used to determine the estimates of the parameters. Among them, the
methods of least squares and maximum likelihood are popular.
Least squares estimation
Suppose a sample of $n$ sets of paired observations $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$ is available. These
observations are assumed to satisfy the simple linear regression model, and so we can write
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (i = 1, 2, \ldots, n).$$
The principle of least squares estimates the parameters $\beta_0$ and $\beta_1$ by minimizing the sum of squares
of the differences between the observations and the line in the scatter diagram. This idea can be viewed from
different perspectives. When the vertical difference between the observations and the line in the scatter
diagram is considered, and its sum of squares is minimized to obtain the estimates of $\beta_0$ and $\beta_1$, the
method is known as direct regression.
[Figure: Direct regression. Scatter diagram with the line $Y = \beta_0 + \beta_1 X$, an observed point $(x_i, y_i)$ and a point $(X_i, Y_i)$ on the line.]
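As an illustration of direct regression, the sketch below minimizes the sum of squared vertical differences using the standard closed-form least-squares solution, $b_1 = s_{xy}/s_{xx}$ and $b_0 = \bar{y} - b_1 \bar{x}$. These expressions are not derived in this section and are stated here as a well-known result. The function name `direct_regression` and the data values are hypothetical.

```python
import numpy as np

def direct_regression(x, y):
    """Estimate (b0, b1) by minimizing the sum of squared vertical differences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    sxx = np.sum((x - x_bar) ** 2)           # corrected sum of squares of x
    sxy = np.sum((x - x_bar) * (y - y_bar))  # corrected sum of cross products
    b1 = sxy / sxx                           # slope estimate
    b0 = y_bar - b1 * x_bar                  # intercept estimate
    return b0, b1

# Made-up data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b0, b1 = direct_regression(x, y)
print(b0, b1)
```

The same estimates can be cross-checked against `np.polyfit(x, y, 1)`, which fits a degree-one polynomial by least squares.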
Econometrics | Chapter 2 | Simple Linear Regression Analysis | Shalabh, IIT Kanpur
Alternatively, the sum of squares of the differences between the observations and the line in the horizontal
direction in the scatter diagram can be minimized to obtain the estimates of $\beta_0$ and $\beta_1$. This is
known as the reverse (or inverse) regression method.
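[Figure: Reverse regression method. Scatter diagram with the line $Y = \beta_0 + \beta_1 X$, an observed point $(x_i, y_i)$ and a point $(X_i, Y_i)$ on the line.]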
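One way to sketch reverse regression is to fit $x$ on $y$ by least squares, which minimizes the horizontal differences, and then re-express the fitted line in the form $y = b_0 + b_1 x$. Under this approach the slope becomes $b_1 = s_{yy}/s_{xy}$, assuming $s_{xy} \neq 0$. The function name `reverse_regression` and the data are hypothetical.

```python
import numpy as np

def reverse_regression(x, y):
    """Estimate (b0, b1) of y = b0 + b1*x by minimizing squared horizontal differences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    syy = np.sum((y - y_bar) ** 2)
    sxy = np.sum((x - x_bar) * (y - y_bar))
    # Least-squares slope of x on y is sxy / syy; inverting it gives the
    # slope of the same line written as y = b0 + b1*x. Requires sxy != 0.
    b1 = syy / sxy
    b0 = y_bar - b1 * x_bar  # the line still passes through (x_bar, y_bar)
    return b0, b1

# Made-up data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b0, b1 = reverse_regression(x, y)
print(b0, b1)
```

When the points do not lie exactly on a line, this slope is larger in absolute value than the one obtained from vertical differences on the same data.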
Instead of horizontal or vertical errors, if the sum of squares of the perpendicular distances between the
observations and the line in the scatter diagram is minimized to obtain the estimates of $\beta_0$ and $\beta_1$,
the method is known as the orthogonal regression or major axis regression method.
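[Figure: Major axis regression method. Scatter diagram with the line $Y = \beta_0 + \beta_1 X$, an observed point $(x_i, y_i)$ and a point $(X_i, Y_i)$ on the line.]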
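A minimal sketch of orthogonal regression follows. It relies on a standard fact not stated in this section: the line minimizing the sum of squared perpendicular distances passes through $(\bar{x}, \bar{y})$ and points along the eigenvector of the centred scatter matrix with the largest eigenvalue. The function name `orthogonal_regression` and the data are hypothetical, and the sketch assumes the fitted line is not vertical.

```python
import numpy as np

def orthogonal_regression(x, y):
    """Estimate (b0, b1) by minimizing the sum of squared perpendicular distances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    centred = np.column_stack((x - x_bar, y - y_bar))
    scatter = centred.T @ centred             # 2x2 matrix [[sxx, sxy], [sxy, syy]]
    eigvals, eigvecs = np.linalg.eigh(scatter)  # eigenvalues in ascending order
    direction = eigvecs[:, -1]                # major axis of the point cloud
    b1 = direction[1] / direction[0]          # undefined for a vertical line
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b0, b1 = orthogonal_regression(x, y)
print(b0, b1)
```

The eigenvector's sign is arbitrary, but the slope is a ratio of its components, so the result does not depend on it.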
Instead of minimizing the distance, the area can also be minimized. The reduced major axis regression
method minimizes the sum of the areas of rectangles defined between the observed data points and the
nearest point on the line in the scatter diagram to obtain the estimates of the regression coefficients. This is
shown in the following figure:
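[Figure: Reduced major axis method. Scatter diagram with the line $Y = \beta_0 + \beta_1 X$, an observed point $(x_i, y_i)$ and a point $(X_i, Y_i)$ on the line.]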
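The sketch below illustrates the reduced major axis idea under one common reading of the area criterion: each rectangle has the vertical gap and the horizontal gap between a data point and the line as its sides. Under that assumption the minimizing slope is $b_1 = \operatorname{sign}(s_{xy}) \sqrt{s_{yy}/s_{xx}}$, a standard result not derived in this section. The names `reduced_major_axis` and `total_area` and the data are hypothetical.

```python
import numpy as np

def reduced_major_axis(x, y):
    """Estimate (b0, b1) with slope sign(sxy) * sqrt(syy / sxx). Assumes sxy != 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    sxx = np.sum((x - x_bar) ** 2)
    syy = np.sum((y - y_bar) ** 2)
    sxy = np.sum((x - x_bar) * (y - y_bar))
    b1 = np.sign(sxy) * np.sqrt(syy / sxx)
    b0 = y_bar - b1 * x_bar
    return b0, b1

def total_area(b0, b1, x, y):
    """Sum of |vertical gap| * |horizontal gap| for the line y = b0 + b1*x."""
    vertical = y - (b0 + b1 * x)
    horizontal = vertical / b1  # horizontal gap to the same line
    return np.sum(np.abs(vertical * horizontal))

# Made-up data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b0, b1 = reduced_major_axis(x, y)
print(b0, b1, total_area(b0, b1, x, y))
```

The resulting slope is the geometric mean of the slopes obtained by minimizing vertical and horizontal differences on the same data, so it lies between the two.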
The method of least absolute deviation regression minimizes the sum of the absolute deviations of the
observations from the line in the vertical direction in the scatter diagram, as in the case of direct regression,
to obtain the estimates of $\beta_0$ and $\beta_1$.
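Least absolute deviation regression has no closed-form solution in general. The sketch below is one simple way to compute it for small data sets and is not necessarily how it is done in practice: it uses the known property that some optimal line passes through at least two data points, and searches all such lines by brute force. The function name `lad_regression` and the data, which include one deliberate outlier, are hypothetical.

```python
import numpy as np
from itertools import combinations

def lad_regression(x, y):
    """Estimate (b0, b1) by minimizing the sum of absolute vertical deviations.

    Brute-force search over lines through pairs of data points.
    Assumes at least two distinct x values; cost grows as O(n^3).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, cannot be written as y = b0 + b1*x
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        loss = np.sum(np.abs(y - b0 - b1 * x))
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    _, b0, b1 = best
    return b0, b1

# Made-up data on the line y = 1 + 2x, with one outlier at x = 3.
x = np.arange(7, dtype=float)
y = 1.0 + 2.0 * x
y[3] = 50.0
b0, b1 = lad_regression(x, y)
print(b0, b1)
```

In this example the fitted line ignores the outlier and recovers the intercept 1 and slope 2 of the remaining points, which illustrates why absolute deviations are less sensitive to extreme observations than squared deviations.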
No assumption is required about the form of the probability distribution of $\varepsilon_i$ in deriving the
least-squares estimates. For the purpose of deriving statistical inferences only, we assume that the
$\varepsilon_i$'s are random variables with $E(\varepsilon_i) = 0$, $\mathrm{Var}(\varepsilon_i) = \sigma^2$ and
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j$ $(i, j = 1, 2, \ldots, n)$. This assumption is
needed to find the mean, variance and other properties of the least-squares estimates. The assumption that
the $\varepsilon_i$'s are normally distributed is utilized while constructing the tests of hypotheses and
confidence intervals of the parameters.
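The role of these assumptions can be illustrated with a small simulation. The sketch below draws uncorrelated errors with mean zero and constant variance from a uniform (non-normal) distribution and checks that the direct regression estimates still average out to the true parameters. It also compares the variance of the slope estimates with $\sigma^2 / s_{xx}$, a standard result that is not derived in this section. All parameter values and the fixed design `x` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

beta0, beta1 = 1.0, 2.0           # hypothetical true parameters
x = np.linspace(0.0, 10.0, 30)    # fixed (non-stochastic) design
sxx = np.sum((x - x.mean()) ** 2)

# Uniform(-sqrt(3), sqrt(3)) errors: mean 0, variance 1, not normal.
half_width = np.sqrt(3.0)
sigma2 = half_width ** 2 / 3.0

n_reps = 5000
eps = rng.uniform(-half_width, half_width, size=(n_reps, x.size))
y = beta0 + beta1 * x + eps       # each row is one simulated sample

# Direct regression estimates for every sample at once.
y_bar = y.mean(axis=1, keepdims=True)
b1 = (y - y_bar) @ (x - x.mean()) / sxx
b0 = y.mean(axis=1) - b1 * x.mean()

print(b0.mean(), b1.mean())       # should be close to beta0 and beta1
print(b1.var(), sigma2 / sxx)     # should be close to each other
```

Normality is not used anywhere in this check; it only becomes relevant once tests of hypotheses or confidence intervals are constructed.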
Based on these approaches, different estimates of $\beta_0$ and $\beta_1$ are obtained, which have different
statistical properties. Among them, the direct regression approach is the most popular. Generally, the direct
regression estimates are referred to as the least-squares estimates or ordinary least squares estimates.