
Table 7.3  Imports, Production, Stock Formation, and Consumption in France (Millions of New Francs at 1959 Prices)

Year    Imports (y)    Gross Domestic Production (x1)    Stock Formation (x2)    Consumption (x3)
1949    15.9           149.3                             4.2                     108.1
1950    16.4           161.2                             4.1                     114.8
1951    19.0           171.5                             3.1                     123.2
1952    19.1           175.5                             3.1                     126.9
1953    18.8           180.8                             1.1                     132.1
1954    20.4           190.7                             2.2                     137.7
1955    22.7           202.1                             2.1                     146.0
1956    26.5           212.4                             5.6                     154.1
1957    28.1           226.1                             5.0                     162.3
1958    27.6           231.9                             5.1                     164.3
1959    26.3           239.0                             0.7                     167.6
1960    31.1           258.0                             5.6                     176.8
1961    33.3           269.8                             3.9                     186.6
1962    37.0           288.4                             3.1                     199.7
1963    43.3           304.5                             4.6                     213.9
1964    49.0           323.4                             7.0                     223.8
1965    50.3           336.8                             1.2                     232.0
1966    56.6           353.9                             4.5                     242.9

Source: E. Malinvaud, Statistical Methods of Econometrics, 2nd ed. (Amsterdam: North-Holland, 1970), p. 19.

The R² is very high and the F-ratio is highly significant, but the individual t-ratios are all insignificant. This is evidence of the multicollinearity problem. Chatterjee and Price argue that before any further analysis is made, we should look at the residuals from this equation. They find (we are omitting the residual plot here) a distinctive pattern: the residuals decline until 1960 and then rise. Chatterjee and Price argue that the difficulty with the model is that the European Common Market began operations in 1960, causing changes in import-export relationships. Hence they drop the years after 1959 and consider only the 11 years 1949-1959. The regression results are now as follows:

Variable     Coefficient     SE        t
x1           -0.051          0.070     -0.731
x2            0.587          0.095      6.203
x3            0.287          0.102      2.807
Constant    -10.13           1.212     -8.355

n = 11    R² = 0.992
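To make the computation concrete, here is a minimal numpy sketch of this 1949-1959 regression (an illustration, not Chatterjee and Price's own program); the estimates should land close to the table above.

```python
import numpy as np

# 1949-1959 rows of Table 7.3
y  = np.array([15.9, 16.4, 19.0, 19.1, 18.8, 20.4, 22.7, 26.5, 28.1, 27.6, 26.3])
x1 = np.array([149.3, 161.2, 171.5, 175.5, 180.8, 190.7, 202.1, 212.4, 226.1, 231.9, 239.0])
x2 = np.array([4.2, 4.1, 3.1, 3.1, 1.1, 2.2, 2.1, 5.6, 5.0, 5.1, 0.7])
x3 = np.array([108.1, 114.8, 123.2, 126.9, 132.1, 137.7, 146.0, 154.1, 162.3, 164.3, 167.6])

X = np.column_stack([np.ones_like(y), x1, x2, x3])  # intercept column first
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimates

resid = y - X @ beta
sigma2 = resid @ resid / (len(y) - X.shape[1])      # residual variance s^2
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))

print(beta)       # expect roughly [-10.13, -0.051, 0.587, 0.287]
print(beta / se)  # t-ratios; only the x1 ratio should be insignificant
print(1 - resid @ resid / np.sum((y - y.mean())**2))  # R^2, close to 0.992
```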



The residual plot (not shown here) is now satisfactory (there are no systematic patterns), so we can proceed. Even though the R² is very high, the coefficient of x1 is not significant. There is thus a multicollinearity problem.

To see what should be done about it, we first look at the simple correlations among the explanatory variables. These are r12 = 0.026, r13 = 0.99, and r23 = 0.036. We suspect that the high correlation between x1 and x3 could be the source of the trouble.
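These correlations are easy to reproduce; a small sketch using the same 1949-1959 data:

```python
import numpy as np

x1 = np.array([149.3, 161.2, 171.5, 175.5, 180.8, 190.7, 202.1, 212.4, 226.1, 231.9, 239.0])
x2 = np.array([4.2, 4.1, 3.1, 3.1, 1.1, 2.2, 2.1, 5.6, 5.0, 5.1, 0.7])
x3 = np.array([108.1, 114.8, 123.2, 126.9, 132.1, 137.7, 146.0, 154.1, 162.3, 164.3, 167.6])

R = np.corrcoef(np.vstack([x1, x2, x3]))  # 3 x 3 correlation matrix
print(R[0, 1], R[0, 2], R[1, 2])          # r12, r13, r23: near 0.026, 0.99, 0.036
```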

Does principal component analysis help us? First, the principal components (obtained from a principal components program)* are:

z1 = 0.7063X1 + 0.0435X2 + 0.7065X3
z2 = -0.0357X1 + 0.9990X2 - 0.0258X3
z3 = -0.7070X1 - 0.0070X2 + 0.7072X3

X1, X2, X3 are the normalized values of x1, x2, x3. That is, X1 = (x1 - m1)/σ1, X2 = (x2 - m2)/σ2, and X3 = (x3 - m3)/σ3, where m1, m2, m3 are the means and σ1, σ2, σ3 are the standard deviations of x1, x2, x3, respectively. Hence

var(X1) = var(X2) = var(X3) = 1

The variances of the principal components are

var(z1) = 1.999    var(z2) = 0.998    var(z3) = 0.003

Note that Σ var(zi) = Σ var(Xi) = 3. The fact that var(z3) ≈ 0 identifies that linear function as the source of multicollinearity. In this example there is only one such linear function; in some examples there could be more. Since E(X1) = E(X2) = E(X3) = 0 because of the normalization, the z's have mean zero. Thus z3 has mean zero and its variance is also close to zero, so we can say that z3 ≈ 0. Looking at the coefficients of the Xi, we can say that (ignoring the coefficients that are very small)

z1 ≈ 0.706(X1 + X3)
z2 ≈ X2
z3 ≈ 0.707(X3 - X1)

z3 ≈ 0 gives us X1 ≈ X3
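The components can be reproduced (up to the sign of each eigenvector) by an eigendecomposition of the correlation matrix of the standardized regressors; a sketch using the 1949-1959 data:

```python
import numpy as np

x1 = np.array([149.3, 161.2, 171.5, 175.5, 180.8, 190.7, 202.1, 212.4, 226.1, 231.9, 239.0])
x2 = np.array([4.2, 4.1, 3.1, 3.1, 1.1, 2.2, 2.1, 5.6, 5.0, 5.1, 0.7])
x3 = np.array([108.1, 114.8, 123.2, 126.9, 132.1, 137.7, 146.0, 154.1, 162.3, 164.3, 167.6])

X = np.vstack([x1, x2, x3]).T
Z = (X - X.mean(axis=0)) / X.std(axis=0)   # normalize: mean 0, variance 1

eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]          # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals)   # var(z1), var(z2), var(z3): about 1.999, 0.998, 0.003
print(eigvecs)   # loadings (up to sign); third column near (-0.707, -0.007, 0.707)
z = Z @ eigvecs  # the principal components themselves
```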

Actually, we would have gotten the same result from a regression of X3 on X1. The regression coefficient is r13 = 0.9984. (Note that X3 and X1 are in standardized form; hence the regression coefficient is r13.)

In terms of the original (nonnormalized) variables, the regression of x3 on x1 is

x3 = 6.258 + 0.686x1    R² = 0.998
           (0.0077)

(The figure in parentheses is the SE.)

*These are from Chatterjee and Price, Regression Analysis by Example, p. 161. The details of how the principal components are computed need not concern us here.



In a way we have got no more information from the principal component analysis than from a study of the simple correlations in this example. Anyway, what is the solution now? Given that there is an almost exact relationship between x1 and x3, we cannot hope to estimate the coefficients of x1 and x3 separately. If the original equation is

y = β0 + β1x1 + β2x2 + β3x3 + u

then substituting for x3 in terms of x1, we get

y = (β0 + 6.258β3) + (β1 + 0.686β3)x1 + β2x2 + u

This gives the linear functions of the βs that are estimable. They are β0 + 6.258β3, β1 + 0.686β3, and β2. The regression of y on x1 and x2 gave the following results:

Variable     Coefficient     SE       t
x1            0.145          0.007    20.67
x2            0.622          0.128     4.87
Constant     -8.440          1.435    -5.88

R² = 0.983
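As an arithmetic check, these short-regression coefficients can be recovered (approximately) from the full 1949-1959 estimates quoted earlier:

```python
# Estimable linear functions, using the full-regression estimates above.
b0, b1, b3 = -10.13, -0.051, 0.287
print(b1 + 0.686 * b3)   # about 0.146, matching the x1 coefficient 0.145
print(b0 + 6.258 * b3)   # about -8.33, near the constant -8.440
```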

Of course, we can also estimate a regression of x1 on x3. The regression coefficient is 1.451. We now substitute for x1 and estimate a regression of y on x2 and x3. The results we get are slightly better (we get a higher R²):

Variable     Coefficient     SE       t
x2            0.596          0.091     6.55
x3            0.212          0.007    29.18
Constant     -9.743          1.059    -9.20

R² = 0.991

The coefficient of x3 is now (β3 + 1.451β1).
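The same arithmetic check applies here:

```python
# The x3 coefficient in the regression of y on x2 and x3 estimates
# beta3 + 1.451*beta1.
b1, b3 = -0.051, 0.287
print(b3 + 1.451 * b1)   # about 0.213, matching the reported 0.212
```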

We can get separate estimates of β1 and β3 only if we have some prior information. As this example, as well as the example in Section 7.3, indicates, what multicollinearity implies is that we cannot estimate individual coefficients with good precision but can estimate some linear functions of the parameters with good precision. If we want to estimate the individual parameters, we need some prior information. We will show that the use of principal components implies the use of some prior information about the restrictions on the parameters.

Suppose that we consider regressing y on the principal components z1 and z2 (z3 is omitted because it is almost zero). We saw that z1 ≈ 0.706(X1 + X3) and z2 ≈ X2.
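A sketch of what such a principal components regression looks like, continuing the earlier computation (again an illustration rather than the text's own program):

```python
import numpy as np

y  = np.array([15.9, 16.4, 19.0, 19.1, 18.8, 20.4, 22.7, 26.5, 28.1, 27.6, 26.3])
x1 = np.array([149.3, 161.2, 171.5, 175.5, 180.8, 190.7, 202.1, 212.4, 226.1, 231.9, 239.0])
x2 = np.array([4.2, 4.1, 3.1, 3.1, 1.1, 2.2, 2.1, 5.6, 5.0, 5.1, 0.7])
x3 = np.array([108.1, 114.8, 123.2, 126.9, 132.1, 137.7, 146.0, 154.1, 162.3, 164.3, 167.6])

X = np.vstack([x1, x2, x3]).T
Z = (X - X.mean(axis=0)) / X.std(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
pcs = Z @ eigvecs[:, order]                          # z1, z2, z3 as columns

W = np.column_stack([np.ones(len(y)), pcs[:, :2]])   # keep only z1 and z2
gamma, _, _, _ = np.linalg.lstsq(W, y, rcond=None)
print(gamma)  # intercept and coefficients of z1, z2

# Dropping z3 amounts to imposing z3 = 0.707(X3 - X1) = 0, i.e., prior
# information tying the coefficients of x1 and x3 together.
```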



