… the parameters, we can still use the equation for purposes of prediction if the evolution of $x_t$ during the prediction period is the same as in the estimation period.

(i) Consider the model

$$y_t = \alpha + \beta x_t + u_t$$

$$u_t = \rho u_{t-1} + e_t, \qquad 0 < \rho < 1$$

where the $e_t$ are IN(0, $\sigma^2$). By regressing $\Delta y_t$ on $\Delta x_t$ it is possible to get more efficient estimates of $\beta$ than by regressing $y_t$ on $x_t$ (a simulation sketch follows this list of statements).

(j) The Durbin-Watson test is a useless test because it is inapplicable in almost every situation that we encounter in practice.
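The claim in statement (i) can be checked by simulation. The sketch below is illustrative only: the data-generating process, the parameter values, and the choice of a slowly evolving regressor are assumptions, not part of the exercise. With $\rho$ close to 1, the slope from the differenced regression typically shows the smaller sampling variation.

```python
import numpy as np

# Monte Carlo sketch: with AR(1) errors and rho near 1, OLS on first
# differences tends to estimate beta with smaller sampling variance than
# OLS on levels.  All values below are illustrative assumptions.
rng = np.random.default_rng(0)
n, reps = 100, 2000
alpha, beta, rho = 1.0, 2.0, 0.9

b_levels, b_diffs = [], []
for _ in range(reps):
    x = np.cumsum(rng.normal(size=n))      # slowly evolving regressor
    e = rng.normal(size=n)
    u = np.zeros(n)
    for t in range(1, n):                  # u_t = rho * u_{t-1} + e_t
        u[t] = rho * u[t - 1] + e[t]
    y = alpha + beta * x + u

    # OLS slope from the levels regression of y_t on x_t
    b_levels.append(np.polyfit(x, y, 1)[0])
    # OLS slope from the differenced regression of dy_t on dx_t
    b_diffs.append(np.polyfit(np.diff(x), np.diff(y), 1)[0])

print("sd of beta-hat (levels)     :", np.std(b_levels))
print("sd of beta-hat (differences):", np.std(b_diffs))
```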

5. The phrase "since the model contains a lagged dependent variable, the DW statistic is unreliable" is frequently seen in empirical work.

(a) What does this phrase mean?

(b) Is there some way to get around this problem?

6. Apply the LM test to test for first-order and second-order serial correlation in the errors for some of the multiple regression models estimated with the data sets presented in Chapter 4. In each case compare the results with those obtained by using the DW test, and Durbin's h-test if there are lagged dependent variables among the explanatory variables.
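The LM test referred to here is the Breusch-Godfrey test. A minimal sketch with statsmodels follows; the data are synthetic placeholders standing in for the Chapter 4 data sets.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey
from statsmodels.stats.stattools import durbin_watson

# Placeholder data standing in for one of the Chapter 4 data sets.
rng = np.random.default_rng(1)
n = 80
X = sm.add_constant(rng.normal(size=(n, 2)))
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

res = sm.OLS(y, X).fit()

# Durbin-Watson statistic (aimed at first-order serial correlation only).
print("DW:", durbin_watson(res.resid))

# Breusch-Godfrey LM test for first- and second-order serial correlation.
for p in (1, 2):
    lm, lm_pval, _, _ = acorr_breusch_godfrey(res, nlags=p)
    print(f"LM(order {p}): stat={lm:.2f}, p-value={lm_pval:.3f}")
```

When lagged dependent variables appear among the regressors, Durbin's h-statistic can be computed from the DW statistic $d$ as $h = (1 - d/2)\sqrt{n/(1 - n\hat{v})}$, where $\hat{v}$ is the estimated variance of the coefficient on the lagged dependent variable; the LM test above remains valid in that case as well.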

7. Apply Sargan's common factor test to check that the significant serial correlation is not due to misspecified dynamics.
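As a sketch of how the common factor test operates: the model with AR(1) errors, $y_t = \alpha + \beta x_t + u_t$, $u_t = \rho u_{t-1} + e_t$, is a restricted version of the general dynamic model $y_t = a + b_0 x_t + b_1 x_{t-1} + c y_{t-1} + e_t$, the restriction being $b_1 = -c b_0$. The likelihood-ratio comparison below estimates the restricted model by a Hildreth-Lu style grid search over $\rho$; the function name and the implementation details are assumptions for illustration, not Sargan's original procedure.

```python
import numpy as np
from scipy import stats

def comfac_test(y, x):
    """Sketch of a common-factor (COMFAC) test.

    Unrestricted model: y_t = a + b0*x_t + b1*x_{t-1} + c*y_{t-1} + e_t
    The AR(1)-error model imposes the restriction b1 = -c*b0.
    """
    yt, y1 = y[1:], y[:-1]
    xt, x1 = x[1:], x[:-1]
    n = len(yt)

    # Unrestricted ADL(1, 1) model by OLS.
    Xu = np.column_stack([np.ones(n), xt, x1, y1])
    beta_u, *_ = np.linalg.lstsq(Xu, yt, rcond=None)
    rss_u = np.sum((yt - Xu @ beta_u) ** 2)

    # Restricted model: quasi-difference the data and run OLS for each
    # rho on a grid, keeping the smallest residual sum of squares.
    rss_r = np.inf
    for rho in np.linspace(-0.99, 0.99, 199):
        ys = yt - rho * y1
        Xs = np.column_stack([np.ones(n), xt - rho * x1])
        beta_r, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
        rss_r = min(rss_r, np.sum((ys - Xs @ beta_r) ** 2))

    # One restriction, so the LR statistic is chi-square with 1 df.
    lr = n * np.log(rss_r / rss_u)
    return lr, stats.chi2.sf(lr, df=1)
```

Usage would be `lr, pval = comfac_test(y, x)`; a small p-value is evidence that the apparent serial correlation reflects misspecified dynamics rather than a genuine AR(1) error.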

8. For the housing starts data in Table 4.10, illustrate testing for fourth-order autocorrelation using the DW test and the LM test.
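For fourth-order autocorrelation the usual DW statistic is replaced by its fourth-order analogue (Wallis's $d_4$, designed for quarterly data), which uses four-period differences of the residuals; the LM test simply takes four lags. A sketch, assuming the housing-starts regression has been fitted into an OLS results object `res`:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

def d4(resid):
    """Fourth-order analogue of the DW statistic: four-period
    differences of the OLS residuals in place of first differences."""
    e = np.asarray(resid)
    return np.sum((e[4:] - e[:-4]) ** 2) / np.sum(e ** 2)

# With the fitted housing-starts regression in `res`:
#   print("d4:", d4(res.resid))
#   lm, pval, _, _ = acorr_breusch_godfrey(res, nlags=4)  # LM, 4 lags
```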



Multicollinearity

7.1 Introduction

7.2 Some Illustrative Examples

7.3 Some Measures of Multicollinearity

7.4 Problems with Measuring Multicollinearity

7.5 Solutions to the Multicollinearity Problem: Ridge Regression

7.6 Principal Component Regression

7.7 Dropping Variables

7.8 Miscellaneous Other Solutions

Summary

Exercises

Appendix to Chapter 7

7.1 Introduction

Very often the data we use in multiple regression analysis cannot give decisive answers to the questions we pose. This is because the standard errors are very high or the $t$-ratios are very low. The confidence intervals for the parameters of interest are thus very wide. This sort of situation occurs when the explanatory variables display little variation and/or high intercorrelations. The situation where the explanatory variables are highly intercorrelated is referred to as multicollinearity.




When the explanatory variables are highly intercorrelated, it becomes difficult to disentangle the separate effects of each of the explanatory variables on the explained variable. The practical questions we need to ask are how high these intercorrelations have to be to cause problems in our inference about the individual parameters and what we can do about this problem. We argue in the subsequent sections that high intercorrelations among the explanatory variables need not necessarily create a problem, and that some solutions often suggested for the multicollinearity problem can actually lead us on a wrong track. The suggested cures are sometimes worse than the disease.
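The loss of precision can be made concrete. With two standardized regressors whose correlation is $r$, the sampling variance of each slope estimate is inflated by the factor $1/(1 - r^2)$. The sketch below (sample size, error variance, and the correlation values are illustrative assumptions) shows the standard error of one slope rising as $r$ approaches 1.

```python
import numpy as np

# Sketch: the standard error of each slope grows roughly like
# 1/sqrt(1 - r^2) as the correlation r between two regressors rises
# (the variance inflation factor).  n and sigma are illustrative.
rng = np.random.default_rng(2)
n, sigma = 50, 1.0

for r in (0.0, 0.9, 0.99):
    # Build two regressors with population correlation r.
    z1, z2 = rng.normal(size=n), rng.normal(size=n)
    x1 = z1
    x2 = r * z1 + np.sqrt(1 - r**2) * z2
    X = np.column_stack([np.ones(n), x1, x2])

    # OLS standard error of the coefficient on x1 (sigma known).
    XtX_inv = np.linalg.inv(X.T @ X)
    print(f"r = {r:4.2f}: se(b1) ~ {sigma * np.sqrt(XtX_inv[1, 1]):.3f}")
```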

The term "multicollinearity" was first introduced in 1934 by Ragnar Frisch in his book on confluence analysis and referred to a situation where the variables dealt with are subject to two or more relations. In his analysis there was no dichotomy of explained and explanatory variables. It was assumed that all variables were subject to error, and given the sample variances and covariances, the problem was to estimate the different linear relationships among the true variables. The problem was thus one of errors in variables. We will, however, be discussing the multicollinearity problem as it is commonly discussed in multiple regression analysis, namely, the problem of high intercorrelations among the explanatory variables.

Multicollinearity, or high intercorrelations among the explanatory variables, need not necessarily be a problem. Whether or not it is a problem depends on other factors, as we will see presently. Thus the multicollinearity problem cannot be discussed entirely in terms of the intercorrelations among the variables. Further, different parametrizations of the variables will give different magnitudes of these intercorrelations. This point is explained in the next section with some examples. Most discussions of the multicollinearity problem and its solutions rest on criteria based on the intercorrelations among the explanatory variables. However, this is an incorrect approach, as will be clear from the examples given in the next section.

7.2 Some Illustrative Examples

We first discuss some examples where the intercorrelations between the explanatory variables are high and study the consequences. Consider the model

$$y = \beta_1 x_1 + \beta_2 x_2 + u$$

If $x_2 = 2x_1$, we have

$$y = \beta_1 x_1 + \beta_2 (2x_1) + u = (\beta_1 + 2\beta_2) x_1 + u$$

Thus only $(\beta_1 + 2\beta_2)$ would be estimable. We cannot get estimates of $\beta_1$ and $\beta_2$ separately. In this case we say that there is "perfect multicollinearity," because $x_1$ and $x_2$ are perfectly correlated (with $r_{12}^2 = 1$). In actual practice we encounter cases where $r_{12}^2$ is not exactly 1 but close to 1.

Ragnar Frisch, Statistical Confluence Analysis by Means of Complete Regression Systems, Publication 5 (Oslo: University Institute of Economics, 1934).
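A small numerical illustration of the perfect-collinearity case described above (the data are arbitrary): with $x_2 = 2x_1$, the cross-product matrix $X'X$ is singular, so $\beta_1$ and $\beta_2$ cannot be estimated separately, while the combination $\beta_1 + 2\beta_2$ still can.

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=30)
x2 = 2 * x1                       # exact linear dependence
u = rng.normal(size=30)
y = 1.0 * x1 + 0.5 * x2 + u       # true beta1 = 1.0, beta2 = 0.5

X = np.column_stack([x1, x2])
print(np.linalg.matrix_rank(X.T @ X))   # 1, not 2: X'X is singular

# Only beta1 + 2*beta2 is estimable: regress y on x1 alone.
b = np.sum(x1 * y) / np.sum(x1 ** 2)
print(b)                          # close to beta1 + 2*beta2 = 2.0
```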


