Chapter 7
Multicollinearity
A basic assumption in the multiple linear regression model is that the rank of the matrix of observations on
the explanatory variables equals the number of explanatory variables. In other words, this matrix is
of full column rank. This, in turn, implies that the explanatory variables are linearly independent, i.e., there is
no exact linear relationship among them. In the special case where the explanatory variables are mutually
uncorrelated, they are said to be orthogonal.


In many situations in practice, the explanatory variables may not remain independent due to various
reasons. The situation where the explanatory variables are highly intercorrelated is referred to as
multicollinearity.


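Consider the multiple regression model

$$y = \underset{n \times k}{X}\,\underset{k \times 1}{\beta} + \underset{n \times 1}{\varepsilon}, \qquad \varepsilon \sim N(0, \sigma^2 I),$$

with $k$ explanatory variables $X_1, X_2, \ldots, X_k$ and the usual assumptions, including $\operatorname{rank}(X) = k$.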


Assume the observations on all $X_i$'s and $y_i$'s are centered and scaled to unit length. Then

- $X'X$ becomes a $k \times k$ matrix of correlation coefficients between the explanatory variables, and
- $X'y$ becomes a $k \times 1$ vector of correlation coefficients between the explanatory and study variables.
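
The statement above can be checked numerically. The following is a minimal sketch, assuming NumPy and synthetic data; the helper name `unit_length_scale` and the simulated values are illustrative assumptions, not part of the notes.

```python
import numpy as np

def unit_length_scale(A):
    """Center each column and scale it to unit Euclidean length."""
    A = np.asarray(A, dtype=float)
    centered = A - A.mean(axis=0)
    return centered / np.linalg.norm(centered, axis=0)

rng = np.random.default_rng(0)            # synthetic data, for illustration only
X_raw = rng.normal(size=(50, 3))
X_raw[:, 2] += 0.8 * X_raw[:, 0]          # induce some correlation between columns
y_raw = X_raw @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

X = unit_length_scale(X_raw)
y = unit_length_scale(y_raw.reshape(-1, 1)).ravel()

XtX = X.T @ X                             # k x k matrix of correlations among regressors
Xty = X.T @ y                             # k correlations between each regressor and y

# Compare with the sample correlations computed directly from the raw data.
corr_X = np.corrcoef(X_raw, rowvar=False)
corr_Xy = np.array([np.corrcoef(X_raw[:, j], y_raw)[0, 1] for j in range(3)])
print(np.allclose(XtX, corr_X), np.allclose(Xty, corr_Xy))
```

After this scaling, the diagonal of `XtX` is 1 and every off-diagonal entry lies between -1 and 1, which is what makes the correlation form convenient for studying multicollinearity.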


Let $X = (X_1, X_2, \ldots, X_k)$, where $X_j$ is the $j$th column of $X$ denoting the $n$ observations on $X_j$. The
column vectors $X_1, X_2, \ldots, X_k$ are linearly dependent if there exists a set of constants $\ell_1, \ell_2, \ldots, \ell_k$, not all
zero, such that

$$\sum_{j=1}^{k} \ell_j X_j = 0.$$

If this holds exactly for a subset of the $X_1, X_2, \ldots, X_k$, then $\operatorname{rank}(X'X) < k$. Consequently $(X'X)^{-1}$ does
not exist. If the condition $\sum_{j=1}^{k} \ell_j X_j = 0$ is approximately true for some subset of $X_1, X_2, \ldots, X_k$, then there
will be a near-linear dependency in $X'X$. In such a case, the multicollinearity problem exists. It is also
said that $X'X$ becomes ill-conditioned.
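
The difference between an exact and a near-linear dependency can be illustrated with a small numerical sketch. It assumes NumPy and synthetic data; the particular relation $2X_1 - X_2 - X_3 = 0$ and the noise level are arbitrary choices made for illustration. The rank is computed on $X$ itself, which is numerically safer and equals the rank of $X'X$.

```python
import numpy as np

rng = np.random.default_rng(1)            # synthetic data, for illustration only
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Exact dependency: x3 = 2*x1 - x2, i.e. 2*x1 - x2 - x3 = 0.
X_exact = np.column_stack([x1, x2, 2 * x1 - x2])
rank_exact = np.linalg.matrix_rank(X_exact)      # rank(X) = rank(X'X)

# Near dependency: the same relation holds only up to small noise.
X_near = np.column_stack([x1, x2, 2 * x1 - x2 + 1e-4 * rng.normal(size=n)])
rank_near = np.linalg.matrix_rank(X_near)
cond_near = np.linalg.cond(X_near.T @ X_near)    # condition number of X'X

# Reference case: three unrelated columns.
X_indep = np.column_stack([x1, x2, rng.normal(size=n)])
cond_indep = np.linalg.cond(X_indep.T @ X_indep)

print("rank with exact dependency:", rank_exact)
print("rank with near dependency: ", rank_near)
print("cond(X'X), near dependency:", cond_near)
print("cond(X'X), unrelated cols: ", cond_indep)
```

With the exact dependency the rank drops below $k = 3$, so $X'X$ cannot be inverted. With the near dependency the rank is still 3, but the condition number of $X'X$ is very large compared with the unrelated case, which is the ill-conditioning referred to above.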


Econometrics | Chapter 7 | Multicollinearity | Shalabh, IIT Kanpur

Sources of multicollinearity:
1. Method of data collection:
It is expected that the data is collected over the whole cross-section of variables. It may happen that the
data is collected over a subspace of the explanatory variables where the variables are linearly dependent.
For example, sampling is done only over a limited range of explanatory variables in the population.


2. Model and population constraints
There may exist some constraints on the model or on the population from which the sample is drawn. The
sample may be generated from that part of the population in which the explanatory variables satisfy linear
(or near-linear) relationships.


3. Existence of identities or definitional relationships:
There may exist some relationships among the variables which may be due to the definition of variables or
any identity relation among them. For example, if data is collected on the variables like income, saving
and expenditure, then income = saving + expenditure. Such a relationship will not change even when the
sample size increases.


4. Imprecise formulation of model
The formulation of the model may be unnecessarily complicated. For example, quadratic (or
polynomial) terms or cross-product terms may appear as explanatory variables. Suppose there are 3
variables $X_1$, $X_2$ and $X_3$, so $k = 3$. If their cross-product terms $X_1 X_2$, $X_2 X_3$ and $X_1 X_3$ are also
added, then $k$ rises to 6.


5. An over-determined model
Sometimes, due to over-enthusiasm, a large number of variables are included in the model to make it more
realistic. Consequently, the number of observations ($n$) becomes smaller than the number of explanatory
variables ($k$). Such a situation can arise in medical research, where the number of patients may be small
but information is collected on a large number of variables. In another example, if there is time-series
data for 50 years on consumption patterns, the consumption pattern is not expected to remain
the same over 50 years. So a better option is to choose a smaller number of variables, so that
$n > k$.
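
Two of the sources listed above lead to an exact rank deficiency that can be seen directly in a small example: a definitional identity (source 3) and fewer observations than variables (source 5). The following is a minimal sketch, assuming NumPy; all data are simulated and the sizes `n = 8`, `k = 12` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)            # synthetic data, for illustration only

# Source 3: a definitional identity, income = saving + expenditure.
saving = rng.uniform(1.0, 5.0, size=40)
expenditure = rng.uniform(10.0, 30.0, size=40)
income = saving + expenditure
X_identity = np.column_stack([income, saving, expenditure])
rank_identity = np.linalg.matrix_rank(X_identity)   # one column is redundant

# Source 5: fewer observations than explanatory variables (n < k).
n, k = 8, 12
X_wide = rng.normal(size=(n, k))
rank_wide = np.linalg.matrix_rank(X_wide)            # can never exceed n

print("identity case: rank =", rank_identity, "with 3 columns")
print("n < k case:    rank =", rank_wide, "with", k, "columns")
```

In the identity case, collecting more observations does not help, because the relation holds for every observation. In the $n < k$ case, the rank of $X$ is bounded by $n$, so $X'X$ is singular regardless of how the variables are related.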




Consequences of multicollinearity
To illustrate the consequences of the presence of multicollinearity, consider a model

$$y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon, \qquad E(\varepsilon) = 0, \quad V(\varepsilon) = \sigma^2 I,$$

where $x_1$, $x_2$ and $y$ are scaled to unit length.

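The normal equation $(X'X)b = X'y$ in this model becomes

$$\begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} r_{1y} \\ r_{2y} \end{pmatrix},$$

where $r$ is the correlation coefficient between $x_1$ and $x_2$; $r_{jy}$ is the correlation coefficient between $x_j$ and
$y$ $(j = 1, 2)$; and $b = (b_1, b_2)'$ is the OLSE of $\beta$.

Since

$$(X'X)^{-1} = \frac{1}{1 - r^2} \begin{pmatrix} 1 & -r \\ -r & 1 \end{pmatrix},$$

the solution is

$$b_1 = \frac{r_{1y} - r\,r_{2y}}{1 - r^2}, \qquad b_2 = \frac{r_{2y} - r\,r_{1y}}{1 - r^2}.$$

So the covariance matrix is $V(b) = \sigma^2 (X'X)^{-1}$, which gives

$$\operatorname{Var}(b_1) = \operatorname{Var}(b_2) = \frac{\sigma^2}{1 - r^2}, \qquad \operatorname{Cov}(b_1, b_2) = -\frac{r\,\sigma^2}{1 - r^2}.$$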
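
The closed-form expressions for $b_1$ and $b_2$ can be compared with a direct numerical solution of the normal equations. This is a minimal sketch, assuming NumPy; the function name `ols_two_regressors` and the correlation values are illustrative assumptions, not taken from any data set.

```python
import numpy as np

def ols_two_regressors(r, r1y, r2y):
    """Closed-form solution of the 2x2 normal equations in correlation form."""
    denom = 1.0 - r**2
    b1 = (r1y - r * r2y) / denom
    b2 = (r2y - r * r1y) / denom
    return np.array([b1, b2])

# Illustrative correlation values (assumed for this sketch).
r, r1y, r2y = 0.6, 0.5, 0.4

XtX = np.array([[1.0, r], [r, 1.0]])
Xty = np.array([r1y, r2y])

b_closed = ols_two_regressors(r, r1y, r2y)
b_solved = np.linalg.solve(XtX, Xty)      # direct numerical solution of (X'X) b = X'y

print(b_closed, b_solved)
```

Both routes give the same estimates. The common denominator $1 - r^2$ is what drives the behaviour of the estimates as $r$ moves towards $\pm 1$.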
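If $x_1$ and $x_2$ are uncorrelated, then $r = 0$ and

$$X'X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \operatorname{rank}(X'X) = 2.$$

If $x_1$ and $x_2$ are perfectly correlated, then $r = \pm 1$ and $\operatorname{rank}(X'X) = 1$.


If $r \to \pm 1$, then $\operatorname{Var}(b_1) = \operatorname{Var}(b_2) \to \infty$.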


So as the variables approach perfect collinearity, the variances of the OLSEs become arbitrarily large. This
indicates highly unreliable estimates, and this is an inadmissible situation.
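
The growth of the variances can be tabulated for a few values of $r$. The sketch below assumes NumPy and works with $V(b)/\sigma^2 = (X'X)^{-1}$, so no value of $\sigma^2$ is needed; the function name `ols_variance_factor` and the grid of $r$ values are illustrative choices.

```python
import numpy as np

def ols_variance_factor(r):
    """Var(b_j) / sigma^2 = 1 / (1 - r^2) for two unit-length regressors."""
    return 1.0 / (1.0 - r**2)

for r in [0.0, 0.5, 0.9, 0.99, 0.999]:
    XtX = np.array([[1.0, r], [r, 1.0]])
    inv = np.linalg.inv(XtX)              # equals V(b) / sigma^2
    print(f"r = {r:<6} Var(b_j)/sigma^2 = {inv[0, 0]:10.2f}   "
          f"Cov(b1, b2)/sigma^2 = {inv[0, 1]:10.2f}")
```

At $r = 0$ the factor is 1, at $r = 0.9$ it is about 5.26, and at $r = 0.99$ it is about 50.25, so the variances increase sharply as $r$ approaches 1 even though $X'X$ is still invertible.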


