
What all this discussion shows is that even in the use of the COV or WTD estimators, prior information on the parameters is very important. This brings us back to the same story as our discussion of ridge regression and principal components regression, namely, the importance of prior information. The prior information regarding the omission of nuisance variables pertains to the true t-values for the coefficients of these variables.

Leamer* suggests studying the sensitivity of estimates of the coefficients to different specifications of prior information on the coefficients. Although his approach is Bayesian and beyond the scope of this book, one can do a simple sensitivity analysis in each problem to assess the impact on the estimates of the coefficients of interest of changes in the assumptions about the coefficients of the nuisance variables. Such a sensitivity analysis would be more useful than using one solution like ridge regression, principal components regression, omitting variables, and so on, each of which implies some particular prior information in a concealed way. Very often, this may not be the prior information you would want to consider.

*E. E. Leamer, "Multicollinearity: A Bayesian Interpretation," Review of Economics and Statistics, Vol. 55, 1973, pp. 371-380, and "Regression Selection Strategies and Revealed Priors," Journal of the American Statistical Association, Vol. 73, 1978, pp. 580-587.

7.8 Miscellaneous Other Solutions

There have been several other solutions to the multicollinearity problem that one finds in the literature. All of these, however, should be used only if there are other reasons to use them, not for solving the collinearity problem as such. We will discuss them briefly.

Using Ratios or First Differences

We discussed the method of using ratios in our discussion of heteroskedasticity (Chapter 5) and the method of first differences in our discussion of autocorrelation (Chapter 6). Although these procedures might reduce the intercorrelations among the explanatory variables, they should be used on the basis of the considerations discussed in those chapters, not as a solution to the collinearity problem.

Using Extraneous Estimates

This method was followed in early demand studies. It was found that in time-series data income and price were both highly correlated, so that neither the price elasticity nor the income elasticity could be estimated with precision. What was done was to get an estimate of the income elasticity from budget studies (where prices do not vary much), use this estimate to "correct" the quantity series for income variation, and then estimate the price elasticity. For example, if the equation to be estimated is

log Q = α + β₁ log p + β₂ log y + u

we first get β̂₂ from budget studies and then regress (log Q − β̂₂ log y) on log p to get estimates of α and β₁. Here β̂₂ is known as the "extraneous estimate." There are two main problems with this procedure. First, the fact that β₂ is estimated should be taken into account in computing the variances of α̂ and β̂₁. This is not usually done, but it can be. Second, and this is the more important problem, the cross-section estimate of β₂ may be measuring something entirely different from what the time-series estimate is supposed to measure. As Meyer and Kuh argue, the "extraneous" estimate can be really extraneous.

An example of this is J. Tobin, "A Statistical Demand Function for Food in the U.S.A.," Journal of the Royal Statistical Society, Series A, 1950, pp. 113-141.

John Meyer and Edwin Kuh, "How Extraneous Are Extraneous Estimates?" Review of Economics and Statistics, November 1957.

G. S. Maddala, "The Likelihood Approach to Pooling Cross-Section and Time-Series Data," Econometrica, Vol. 39, November 1971, pp. 939-953.
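As a concrete illustration of this two-step procedure, the following sketch simulates a demand equation in which log price and log income are nearly collinear, takes the income elasticity as given from an outside "budget study," and then estimates the price elasticity from the corrected quantity series. All variable names, parameter values, and data here are invented for illustration, not taken from the studies cited in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated time series: log income and log price move closely together,
# so the coefficients of  log Q = alpha + beta1*log p + beta2*log y + u
# are hard to estimate jointly with any precision.
log_y = rng.normal(0.0, 1.0, n)
log_p = 0.95 * log_y + rng.normal(0.0, 0.1, n)      # nearly collinear with income
alpha, beta1, beta2 = 1.0, -0.8, 0.5
log_q = alpha + beta1 * log_p + beta2 * log_y + rng.normal(0.0, 0.05, n)

# Step 1: take the income elasticity from a budget study (cross-section
# data where prices barely vary).  Here it is simply assumed known.
beta2_ext = 0.5

# Step 2: "correct" quantity for income variation, then regress the
# corrected series on log price to recover alpha and beta1.
q_corrected = log_q - beta2_ext * log_y
X = np.column_stack([np.ones(n), log_p])
alpha_hat, beta1_hat = np.linalg.lstsq(X, q_corrected, rcond=None)[0]
```

Note that the standard errors a routine OLS printout would report for alpha_hat and beta1_hat ignore the sampling error in beta2_ext; that is the first of the two problems noted above.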

Suppose that we want to use an estimate for a parameter from another data set. What is the best procedure for doing this? Consider the equation

y = β₁x₁ + β₂x₂ + u    (7.18)

Suppose that because of the high correlation between x₁ and x₂, we cannot get good estimates of β₁ and β₂. We try to get an estimate of β₁ from another data set and another equation

y₂ = β₁x₁ + γz + v    (7.19)

In this equation x₁ and z are not highly correlated, and we get a good estimate of β₁, say β̂₁. Now we substitute this in (7.18) and regress (y − β̂₁x₁) on x₂ to get an estimate β̂₂ of β₂. This is the procedure we mentioned earlier. The estimate β̂₂ is a conditional estimate, conditional on β₁ = β̂₁. Also, we have to make corrections for the estimated variance of β̂₂ because the error in the equation now is

y − β̂₁x₁ = β₂x₂ + w

where w = u + (β₁ − β̂₁)x₁ is not the same as u. This procedure is advisable only when the data behind the estimation of (7.19) are not available to us (e.g., when that study was done by somebody else).
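The need for this variance correction can be seen in a small Monte Carlo sketch (all parameter values and distributions below are illustrative assumptions). The realized variance of the conditional estimate of β₂ turns out to be roughly double what the naive OLS formula would suggest, because the naive formula pretends that β̂₁ is the known constant β₁ and so ignores the extra term (β₁ − β̂₁)x₁ in the error.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 500
b1_true, b2_true = 1.0, 2.0

b2_cond, naive_var = [], []
for _ in range(reps):
    # Equation (7.18): x1 and x2 highly correlated, so direct OLS is imprecise.
    x1 = rng.normal(0.0, 1.0, n)
    x2 = 0.97 * x1 + rng.normal(0.0, 0.15, n)
    y = b1_true * x1 + b2_true * x2 + rng.normal(0.0, 1.0, n)

    # A noisy estimate of beta1 from a second data set, equation (7.19);
    # its sampling error (sd 0.1) stands in for actually estimating (7.19).
    b1_hat = b1_true + rng.normal(0.0, 0.1)

    # Conditional estimate: regress (y - b1_hat*x1) on x2.
    b2_cond.append((x2 @ (y - b1_hat * x1)) / (x2 @ x2))

    # Naive variance formula sigma^2 / sum(x2^2), which treats b1_hat as b1.
    naive_var.append(1.0 / (x2 @ x2))

true_var = np.var(b2_cond)    # reflects the full error w = u + (b1 - b1_hat)*x1
avg_naive = np.mean(naive_var)
```

Because the realized variance is well above the naive figure, reported t-ratios for the conditional estimate of β₂ would be overstated unless the correction for the term (β₁ − β̂₁)x₁ is made.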

On the other hand, if both sets of data are available to us, there is no reason to use this conditional estimation procedure. A better procedure would be to estimate equations (7.18) and (7.19) jointly. This is what was done by Maddala for the data used by Tobin in his study of the demand for food. It is also possible to test, by using the joint estimation of equations (7.18) and (7.19) and separate estimation of the equations, whether the coefficient of x₁ is the same in the two equations.²²

²²In the case of Tobin's demand-for-food example, this test, done by Maddala, showed that there were significant differences between the two parameters.

In summary, as a solution to the multicollinearity problem, it is not advisable to substitute extraneous parameter estimates in the equation. One can, of course, pool the different data sets to get more efficient estimates of the parameters, but one should also perform some tests to see whether the parameters in the different equations are indeed the same.

Getting More Data

One solution to the multicollinearity problem that is often suggested is to "go and get more data." Actually, the extraneous-estimators case we have discussed also falls in this category (we look for another model with common parameters and the associated data set). Sometimes using quarterly or monthly data instead of annual data helps us get better estimates. However, we might be adding more sources of variation, like seasonality. In any case, since weak data and inadequate information are the sources of our problem, getting more data will help matters.

Summary

1. In multiple regression analysis it is usually difficult to interpret the estimates of the individual coefficients if the variables are highly intercorrelated. This problem is often referred to as the multicollinearity problem.

2. However, high intercorrelations among the explanatory variables need not by themselves cause any problems in inference. Whether or not they do depends on the magnitude of the error variance and the variances of the explanatory variables. If there is enough variation in the explanatory variables and the variance of the error term is sufficiently small, high intercorrelations among the explanatory variables need not cause a problem. This is illustrated using formulas (7.1)-(7.4) in Section 7.2.

3. Measures of multicollinearity based solely on high intercorrelations among the explanatory variables are useless. These are discussed in Section 7.3. Also, as shown in Section 7.4, these correlations can change with simple transformations of the explanatory variables. This does not mean that the problem has been solved.

4. There have been several solutions to the multicollinearity problem. These are:

(a) Ridge regression, on which an enormous amount of literature exists.

(b) Principal components regression. This amounts to transforming the explanatory variables to an uncorrelated set, but the mere transformation does not solve the problem. (It appears through the back door.)


