Multicollinearity

What multicollinearity is

Let H = the set of all the X (independent) variables. Let $G_k$ = the set of all the X variables except $X_k$. The formula for the standard error of the coefficient $b_k$ is then
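\[
\begin{aligned}
s_{b_k} &= \sqrt{\frac{1 - R^2_{YH}}{(1 - R^2_{X_k G_k})\,(N - K - 1)}} \cdot \frac{s_y}{s_{X_k}} \\
        &= \sqrt{\frac{1 - R^2_{YH}}{\mathrm{Tol}_k\,(N - K - 1)}} \cdot \frac{s_y}{s_{X_k}} \\
        &= \sqrt{\mathrm{VIF}_k} \cdot \sqrt{\frac{1 - R^2_{YH}}{N - K - 1}} \cdot \frac{s_y}{s_{X_k}}
\end{aligned}
\]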
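The standard-error formula can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes simulated data, NumPy, N cases and K predictors, and the helper names (`r_squared`, `tolerance_and_vif`, `se_from_formula`, `se_from_ols`) are hypothetical. It computes the tolerance and VIF of each predictor from an auxiliary regression on the other predictors, then compares the formula's standard error with the one taken from the usual OLS covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
N, K = 200, 3                   # assumed: N cases, K predictors

# Simulated (hypothetical) data: x2 is built to be highly correlated with x1.
x1 = rng.normal(size=N)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=N)
x3 = rng.normal(size=N)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + rng.normal(size=N)


def r_squared(target, predictors):
    """R^2 from an OLS regression of target on predictors plus an intercept."""
    A = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    centered = target - target.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


def tolerance_and_vif(X, k):
    """Tolerance and VIF of column k, regressing it on all other columns."""
    others = np.delete(X, k, axis=1)
    tol = 1.0 - r_squared(X[:, k], others)
    return tol, 1.0 / tol


def se_from_formula(X, y, k):
    """Standard error of b_k via sqrt(VIF_k) * sqrt((1 - R2_YH)/(N-K-1)) * s_y/s_Xk."""
    n, n_pred = X.shape
    r2_yh = r_squared(y, X)
    _, vif = tolerance_and_vif(X, k)
    s_y = y.std(ddof=1)
    s_xk = X[:, k].std(ddof=1)
    return np.sqrt(vif) * np.sqrt((1.0 - r2_yh) / (n - n_pred - 1)) * s_y / s_xk


def se_from_ols(X, y):
    """Slope standard errors from the usual OLS covariance matrix."""
    n, n_pred = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = (resid @ resid) / (n - n_pred - 1)
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return np.sqrt(np.diag(cov))[1:]  # drop the intercept's entry


ols_se = se_from_ols(X, y)
for k in range(K):
    tol, vif = tolerance_and_vif(X, k)
    print(f"X{k + 1}: tolerance={tol:.3f}  VIF={vif:.2f}  "
          f"SE(formula)={se_from_formula(X, y, k):.4f}  SE(OLS)={ols_se[k]:.4f}")
```

In this setup the two correlated predictors should show a low tolerance and a high VIF, while the independent one should have a VIF near 1; the two standard-error columns should agree up to floating-point error.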

The bigger $R^2_{X_k G_k}$ is (i.e. the more highly correlated $X_k$ is with the other IVs in the model), the bigger the standard error will be. Indeed, if $X_k$ is perfectly correlated with the other IVs, the standard error will be infinite. This is referred to as the problem of multicollinearity. The problem is that, as the Xs become more highly correlated, it becomes more and more difficult to determine which X is actually producing the effect on Y.

Also, recall that $1 - R^2_{X_k G_k}$ is referred to as the Tolerance of $X_k$. A tolerance close to 1 means there is little multicollinearity, whereas a value close to 0 suggests that multicollinearity may be a threat. The reciprocal of the tolerance is known as the Variance Inflation Factor (VIF). The VIF shows how much the variance of the coefficient estimate is being inflated by multicollinearity. The square root of the VIF tells you how much larger the standard error is, compared with what it would be if that variable were uncorrelated with the other X variables in the equation. For example, if the VIF for a variable were 9, its standard error would be three times as large as it would be if its VIF were 1. In such a case, the coefficient would have to be 3 times as large to be statistically significant.

Causes of multicollinearity

• Improper use of dummy variables (e.g. failure to exclude one category)
• Including a variable that is computed from other variables in the equation (e.g. family income = husband’s income + wife’s income, and the regression includes all 3 income measures)
• In effect, including the same or almost the same variable twice (height in feet and height in
