
7.3 Some Measures of Multicollinearity

It is important to be familiar with two measures that are often suggested in the discussion of multicollinearity: the variance inflation factor (VIF) and the condition number. The VIF is defined as

$$\mathrm{VIF}(\hat{\beta}_i) = \frac{1}{1 - R_i^2}$$

where $R_i^2$ is the squared multiple correlation coefficient between $x_i$ and the other explanatory variables. Looking at formula (7.4), we can interpret $\mathrm{VIF}(\hat{\beta}_i)$ as the ratio of the actual variance of $\hat{\beta}_i$ to what the variance of $\hat{\beta}_i$ would have been if $x_i$ were uncorrelated with the remaining $x$'s. Implicitly, an ideal situation is considered to be one where the $x$'s are all uncorrelated with each other, and the VIF compares the actual situation with this ideal one. The comparison is not very useful and gives us no guidance as to what to do about the problem. It is more a complaint that things are not ideal. Also, looking at formula (7.4), as we have discussed earlier, $1/(1 - R_i^2)$ is not the only factor determining whether multicollinearity presents a problem in making inferences.
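To make the definition concrete, here is a short Python sketch (not from the text; the data are made up for illustration) that computes each VIF by regressing one explanatory variable on the remaining ones:

```python
import numpy as np

def vif(X):
    """VIF(beta_i) = 1 / (1 - R_i^2), where R_i^2 is the squared multiple
    correlation from regressing column i of X on the other columns."""
    n, k = X.shape
    vifs = []
    for i in range(k):
        y = X[:, i]
        others = np.delete(X, i, axis=1)
        Z = np.column_stack([np.ones(n), others])   # add an intercept
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        vifs.append(1.0 / (1.0 - r2))
    return vifs

# Illustrative data: two highly correlated regressors (made up, not from the text).
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + 0.1 * rng.normal(size=100)    # nearly collinear with x1
print(vif(np.column_stack([x1, x2])))   # both VIFs come out large
```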

Whereas the VIF is something we compute for each explanatory variable separately, the condition number discussed by Raduchel and by Belsley, Kuh, and Welsch is an overall measure. The condition number is supposed to measure the sensitivity of the regression estimates to small changes in the data. It is defined as the square root of the ratio of the largest to the smallest eigenvalue of the matrix $X'X$ of the explanatory variables. Eigenvalues are explained in the appendix to this chapter. For the two-variable case in Section 7.2 it is easily computed. We solve the equation

$$(S_{11} - \lambda)(S_{22} - \lambda) - S_{12}^2 = 0$$

$$(200 - \lambda)(113 - \lambda) - (150)^2 = 0$$

$$\lambda^2 - 313\lambda + 100 = 0$$

which gives $\lambda_1 = 312.68$ and $\lambda_2 = 0.32$ as the required eigenvalues. The condition number is $\sqrt{\lambda_1/\lambda_2} = 31.26$. The closer the condition number is to 1, the better conditioned the data are.
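These numbers are easy to verify with a linear algebra routine. The fragment below is a minimal Python sketch, assuming only the values $S_{11} = 200$, $S_{22} = 113$, $S_{12} = 150$ given above:

```python
import numpy as np

# X'X for the two-variable example: S11 = 200, S22 = 113, S12 = 150.
S = np.array([[200.0, 150.0],
              [150.0, 113.0]])

eig = np.linalg.eigvalsh(S)           # eigenvalues in ascending order
print(eig)                            # [0.32, 312.68] (rounded)
print(np.sqrt(eig[-1] / eig[0]))      # condition number, about 31.26
```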

Again, there are three problems with this:

1. It looks at only the correlations among the explanatory variables, and formula (7.4) shows that this is not the only relevant factor.

W. J. Raduchel, "Multicollinearity Once Again," Paper 205, Harvard Institute of Economic Research, Cambridge, Mass., 1971.

D. Belsley, E. Kuh, and R. Welsch, Regression Diagnostics (New York: Wiley, 1980).



2. The condition number can change with a reparametrization of the variables. For instance, if we define $z_1 = x_1 + x_2$ and $z_2 = x_1 - x_2$, the condition number will change. In fact, it can be made equal to 1 with suitable transformations of the variables (a sketch of one such transformation follows this list).

3. Even if such transformations of variables are not always meaningful (what does 2 apples + 3 oranges mean?), the condition number is merely a complaint that things are not ideal.
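As a sketch of point 2, the Python fragment below constructs one transformation that drives the condition number to exactly 1: it rotates to the eigenvectors of $X'X$ and rescales each eigenvector by the reciprocal square root of its eigenvalue. This particular transformation is our illustration, not one given in the text:

```python
import numpy as np

S = np.array([[200.0, 150.0],        # X'X from the example in the text
              [150.0, 113.0]])

def cond_number(m):
    """Square root of the ratio of the largest to the smallest eigenvalue."""
    eig = np.linalg.eigvalsh(m)
    return np.sqrt(eig[-1] / eig[0])

print(cond_number(S))                # about 31.26

# Transforming the data matrix to Z = XA changes X'X into A'SA.  Taking A to
# be the eigenvectors of S, each divided by the square root of its
# eigenvalue, gives A'SA = I, whose condition number is exactly 1.
eigvals, eigvecs = np.linalg.eigh(S)
A = eigvecs / np.sqrt(eigvals)
print(cond_number(A.T @ S @ A))      # 1.0 (up to rounding error)
```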

In Section 7.4 we consider an example where transformations of variables are meaningful. However, even when they are not, the VIF and the condition number are only measures of how bad things are relative to some ideal situation; the standard errors and t-ratios will tell a better story of how bad things actually are. The condition number (CN) is really a "complaint number."

The VIFs and the condition number are useful for dropping some variables and imposing parameter constraints only in some very extreme cases, where $R_i^2$ is close to 1.0 or the smallest eigenvalue is very close to zero. In such cases we estimate the model subject to some constraints on the parameters. This point is illustrated in Section 7.5 with an example.

The major limitation of the VIF and the condition number is that they look at only the intercorrelations among the explanatory variables. A measure that also considers the correlations of the explanatory variables with the explained variable is Theil's measure, which is defined as

$$m = R^2 - \sum_{i=1}^{k} (R^2 - R_{-i}^2)$$

where

$R^2$ = squared multiple correlation from a regression of $y$ on $x_1, x_2, \ldots, x_k$

$R_{-i}^2$ = squared multiple correlation from a regression of $y$ on $x_1, x_2, \ldots, x_k$ with $x_i$ omitted

The quantity $(R^2 - R_{-i}^2)$ is termed the "incremental contribution" to the squared multiple correlation by Theil. If $x_1, x_2, \ldots, x_k$ are mutually uncorrelated, then $m$ will be 0 because the incremental contributions all add up to $R^2$. In other cases $m$ can be negative as well as highly positive, and this makes it difficult to use it for any guidance.

To see what this measure means and how it is related to the t-ratios, let us consider the case of two explanatory variables. Following the notation in Section 4.6, we will write $R^2$ as $R_{y.12}^2$. The $R_{-i}^2$ are now just the squared simple correlations $r_{y1}^2$ and $r_{y2}^2$. Thus

$$m = R_{y.12}^2 - (R_{y.12}^2 - r_{y1}^2) - (R_{y.12}^2 - r_{y2}^2)$$

"E. E. Leamer, "Model Choice and Specification Analysis," in Z. Griliches and M. D. Intrilligator (eds.). Handbook of Econometrics, Vol. 1 (Amsterdam: North-Holland, 1983), pp. 286-330.

H. Theil, Principles of Econometrics (New York: Wiley, 1971), p. 179.



The t-ratios are related to the partial $r^2$'s, $r_{y1.2}^2$ and $r_{y2.1}^2$. We also derived in Section 4.6 the relation

$$(1 - R_{y.12}^2) = (1 - r_{y1}^2)(1 - r_{y2.1}^2) = 1 - r_{y1}^2 - (1 - r_{y1}^2)\,r_{y2.1}^2$$

Hence

$$R_{y.12}^2 - r_{y1}^2 = (1 - r_{y1}^2)\,r_{y2.1}^2$$

and, by symmetry, $R_{y.12}^2 - r_{y2}^2 = (1 - r_{y2}^2)\,r_{y1.2}^2$. Thus

$$m = R_{y.12}^2 - (1 - r_{y2}^2)\,r_{y1.2}^2 - (1 - r_{y1}^2)\,r_{y2.1}^2$$

that is,

m = (squared multiple correlation coefficient) - (weighted sum of the partial $r^2$'s)

This weighted sum is $w_1 r_{y1.2}^2 + w_2 r_{y2.1}^2$, where $w_1 = 1 - r_{y2}^2$ and $w_2 = 1 - r_{y1}^2$. If the partial $r^2$'s are all very low, $m$ will be very close to the multiple $R^2$. In the earlier example that we gave to illustrate Klein's measure of multicollinearity, we had $R_{y.12}^2 = 0.916$, $r_{y1}^2 = r_{y2}^2 = 0.9025$, and $r_{y1.2}^2 = r_{y2.1}^2 = 0.14$. Thus Theil's measure of multicollinearity is

$$m = 0.916 - 2(1 - 0.9025)(0.14) = 0.889$$
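The arithmetic can be checked directly. This small Python sketch evaluates $m$ both from the definition (incremental contributions) and from the weighted-partial-$r^2$ form derived above, using the numbers from the example:

```python
# Numbers from the Klein's-measure example in the text.
R2   = 0.916     # squared multiple correlation R^2_{y.12}
r2_1 = 0.9025    # squared simple correlation r^2_{y1}
r2_2 = 0.9025    # squared simple correlation r^2_{y2}
p2_1 = 0.14      # squared partial correlation r^2_{y1.2}
p2_2 = 0.14      # squared partial correlation r^2_{y2.1}

# Definition: m = R^2 minus the sum of incremental contributions (R^2 - R^2_{-i}).
m_def = R2 - (R2 - r2_1) - (R2 - r2_2)

# Equivalent form: m = R^2 minus the weighted sum of the partial r^2's.
m_partial = R2 - (1 - r2_2) * p2_1 - (1 - r2_1) * p2_2

print(m_def, m_partial)   # both about 0.889 (they agree up to input rounding)
```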

Of course, m is not zero. But is multicollinearity serious or not? One can never tell. If the number of observations is greater than 60, we will get significant t-ratios. Thus Theil's measure is even less useful than the VIF and the condition number.

We have discussed several measures of multicollinearity, and they are all of limited use from a practical point of view. As Leamer puts it, they are all merely complaints that things are not ideal. The standard errors and t-ratios give us more information about how serious the problem is. It is relevant to keep formula (7.4) in mind when assessing any measure of multicollinearity.

Leamer suggests some measures of multicollinearity based on the sensitivity of inferences to different forms of prior information. Since a discussion of these measures requires a knowledge of Bayesian statistical inference and multivariate statistics, we omit them here.

7.4 Problems with Measuring Multicollinearity

In Section 7.3 we talked of measuring multicollinearity in terms of the intercorrelations among the explanatory variables. However, there is one problem with this. The intercorrelations can change with a redefinition of the explanatory variables. Some examples will illustrate this point.



