





( ) Dropping variables.

All of these so-called solutions are really ad hoc procedures. Each implies the use of some prior information, and it is better to examine that prior information before undertaking a mechanical solution suggested by others.

5. The basic problem is lack of enough information to answer the questions posed. The only solutions are:

(a) To get more data.

(b) To ask what questions are answerable with the data at hand.

(c) To examine what prior information will be most helpful [in fact, this should precede solution (a)].

Exercises

1. Define the term "multicollinearity." Explain how you would detect its presence in a multiple regression equation you have estimated. What are the consequences of multicollinearity, and what are the solutions?

2. Explain the following methods.

(a) Ridge regression.

(b) Omitted-variable regression.

(c) Principal component regression.

What are the problems these methods are supposed to solve?
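As a point of reference for this exercise, here is a minimal sketch in Python (using only numpy) of what the two regression methods compute. The data, the ridge constant k, and the number of retained components p are made up for illustration; they do not come from the text, and in practice the regressors are usually standardized before either method is applied.

import numpy as np

def ridge(X, y, k):
    # Ridge estimator: (X'X + kI)^{-1} X'y.
    XtX = X.T @ X
    return np.linalg.solve(XtX + k * np.eye(X.shape[1]), X.T @ y)

def pc_regression(X, y, p):
    # Principal component regression: regress y on the first p principal
    # components of X, then map the component coefficients back to
    # coefficients on the original regressors.
    _, eigvecs = np.linalg.eigh(X.T @ X)         # eigenvalues in ascending order
    V = eigvecs[:, ::-1][:, :p]                  # directions of the p largest
    Z = X @ V                                    # component scores
    gamma = np.linalg.solve(Z.T @ Z, Z.T @ y)    # OLS on the components
    return V @ gamma                             # back to original coefficients

# Artificial, nearly collinear data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + 0.01 * rng.normal(size=100)            # x2 is almost equal to x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=100)
print(ridge(X, y, k=0.1))
print(pc_regression(X, y, p=1))

Both methods trade a little bias for a large reduction in variance when the regressors are nearly collinear, which is the problem exercise 2 asks about.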

3. Examine whether the following statements are true or false. Give an explanation.

(a) In multiple regression, a high correlation in the sample among the regressors (multicollinearity) implies that the least squares estimators of the coefficients are biased.

(b) Whether or not multicollinearity is a problem cannot be decided by just looking at the intercorrelations between the explanatory variables.

(c) If the coefficient estimates in an equation have high standard errors, this is evidence of high multicollinearity.

(d) The relevant question to ask if there is high multicollinearity is not what variables to drop but what other information will help.

4. In a study analyzing the determinants of faculty salaries, the following results were obtained.* The dependent variable is the 1969-1970 academic year salary. We have omitted eight other explanatory variables.

(a) Do any of the coefficients have unexpected signs?

(b) Is there a multicollinearity problem? What variables do you expect to be highly correlated?

(c) Teacher rating, T, is the only nonsignificant variable among the variables presented. Can you explain why? Note that T is a dummy variable indicating whether or not the professor ranked in the top 50% of all instructors by a vote of the students.

(d) Would dropping the variable T from the equation change the signs of any of the other variables?

"David A. Katz, "Faculty Salaries, Promotions and Productivity at a Large University," The American Economic Review. June 1973, pp. 469-477.



Explanatory Variable        Coefficient    Standard Error
Books, B                          7.21
Articles, A                       5.37
Excellent articles, E            13.43
Dissertations, F                 66.85
Public Service, P                 5.65
Committees                       10.02
Experience, Y                   126.92
Teacher rating, T                 0.01
English professors            -2293               18.75
Female, X                     -2410               20.80
Ph.D. degree, R                1919               10.01

Constant = 11,155; R² = 0.68; n = 596
Standard error of regression = 2946
Mean of the dependent variable = 15,679
SD of the dependent variable = 5093

(e) The variable F is the number of dissertations supervised since 1964 and B is the number of books published. How do you explain the high coefficient for F relative to that of B?

(f) Would you conclude from the coefficient of X that there is sex discrimination?

(g) Compute the partial $r^2$ for experience.
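One standard relation that may help with part (g): the partial $r^2$ of the dependent variable with a given regressor (holding the other regressors fixed) can be obtained from that regressor's t-ratio and the residual degrees of freedom,

$r^2_{\text{partial}} = \dfrac{t^2}{t^2 + \text{d.f.}}$

where $t$ is the coefficient divided by its standard error and d.f. is $n$ minus the number of estimated parameters.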

5. Estimate demand for food functions on the basis of the data in Table 4.9. Discuss whether there is a multicollinearity problem and what you are going to do about it.

6. Estimate demand for gasoline on the basis of the data in Table 4.8. Are the wrong signs for $P_g$ a consequence of multicollinearity?
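For exercises 5 and 6, a quick way to look for multicollinearity is to inspect the correlations among the regressors and their variance inflation factors. The sketch below is in Python with numpy and assumes the explanatory variables from Table 4.8 or 4.9 have already been loaded into an n-by-k array X (the loading step is not shown).

import numpy as np

def vif(X):
    # Variance inflation factor for each column of X: 1 / (1 - R_j^2), where
    # R_j^2 comes from regressing column j on the remaining columns plus a constant.
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        tss = (y - y.mean()) @ (y - y.mean())
        out[j] = tss / (resid @ resid)           # equals 1 / (1 - R_j^2)
    return out

# print(np.corrcoef(X, rowvar=False))            # pairwise correlations of the regressors
# print(vif(X))                                  # values above about 10 are a warning sign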

Appendix to Chapter 7

Linearly Dependent Explanatory Variables

In Chapter 4 we assumed that the explanatory variables were linearly independent and hence that $(X'X)^{-1}$ exists. What happens if the explanatory variables are linearly dependent? This is the case of perfect multicollinearity. In this case $(X'X)$ will be a singular matrix (its rank will be less than $k$). Hence we do not have a unique solution to the normal equations. However, consider two different solutions, $\hat{\beta}_1$ and $\hat{\beta}_2$, to the normal equations. We then have

$(X'X)\hat{\beta}_1 = X'y$

$(X'X)\hat{\beta}_2 = X'y$

Premultiply the first equation by $\hat{\beta}_2'$ and the second equation by $\hat{\beta}_1'$ and subtract. Since $\hat{\beta}_2'(X'X)\hat{\beta}_1 = \hat{\beta}_1'(X'X)\hat{\beta}_2$ (the transpose of a scalar is the same scalar), we get the result that $\hat{\beta}_1'X'y = \hat{\beta}_2'X'y$; that is, the regression sum of squares is the same. Hence the residual sum of squares will be the same whichever solution we take.
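A small numerical check of this argument, in Python with numpy and with artificial data: the third regressor is constructed as the sum of the first two, so $X'X$ is singular, yet two different solutions of the normal equations give identical regression and residual sums of squares.

import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
X = np.column_stack([x1, x2, x1 + x2])        # x3 = x1 + x2, so X'X is singular
y = x1 + 2 * x2 + rng.normal(size=50)

# One solution from the pseudoinverse; another obtained by adding a vector in
# the null space of X'X (here proportional to (1, 1, -1)').
b1 = np.linalg.pinv(X) @ y
b2 = b1 + 3.0 * np.array([1.0, 1.0, -1.0])

for b in (b1, b2):
    resid = y - X @ b
    print("regression SS:", b @ X.T @ y, " residual SS:", resid @ resid)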

If $(X'X)$ is singular, it means that not all the regression parameters $\beta_i$ are estimable; only certain linear functions of the $\beta_i$ are estimable. The question is: What linear functions are estimable?

Let $a$ be a $k \times 1$ vector that is a linear combination of the columns of $(X'X)$, say $a = (X'X)\lambda$. Then the linear function $a'\beta$ is uniquely estimable. To see this, consider any solution $\hat{\beta}$ of the normal equations. Then

$a'\hat{\beta} = \lambda'(X'X)\hat{\beta} = \lambda'X'y$

Thus $a'\hat{\beta}$ is a unique linear function of $y$. Since $E(\lambda'X'y) = \lambda'(X'X)\beta = a'\beta$, we get the result that $a'\hat{\beta}$ is a unique linear unbiased estimator of $a'\beta$. It can also be shown that it has minimum variance among all linear unbiased estimators (the proof is similar to the one in the Appendix to Chapter 4). Thus it is BLUE.

In case $(X'X)$ is nonsingular, every $k \times 1$ vector $a$ can be expressed as a linear combination of the columns of $(X'X)$. Thus all linear functions $a'\beta$ are uniquely estimable. Hence all the $\beta_i$ are uniquely estimable. As an illustration of perfect multicollinearity, consider the example in Section 7.2, where $x_3 = x_1 + x_2$. In this case we have

$y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u = (\beta_1 + \beta_3)x_1 + (\beta_2 + \beta_3)x_2 + u$

Thus we can see that only $\beta_1 + \beta_3$ and $\beta_2 + \beta_3$ are estimable. In this case we have

$(X'X) = \begin{bmatrix} 5 & 0 & 5 \\ 0 & 5 & 5 \\ 5 & 5 & 10 \end{bmatrix}$

Take $a$ equal to $\tfrac{1}{5}$ times the first column. Then $a' = (1, 0, 1)$ and $a'\beta = \beta_1 + \beta_3$; hence $\beta_1 + \beta_3$ is estimable. If we take $a$ equal to $\tfrac{1}{5}$ times the second column, then $a' = (0, 1, 1)$ and $a'\beta = \beta_2 + \beta_3$; hence $\beta_2 + \beta_3$ is estimable. Can we estimate $\beta_1 + \beta_2 + \beta_3$? No, because we cannot get $a' = (1, 1, 1)$ by taking any linear combination of the columns of $(X'X)$.
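A numerical check of the column-space argument for this matrix, in Python with numpy: the vectors $(1, 0, 1)'$ and $(0, 1, 1)'$ lie in the column space of $X'X$, while $(1, 1, 1)'$ does not.

import numpy as np

XtX = np.array([[5.0, 0.0, 5.0],
                [0.0, 5.0, 5.0],
                [5.0, 5.0, 10.0]])

def in_column_space(A, a):
    # True if a is a linear combination of the columns of A (rank is unchanged
    # when a is appended as an extra column).
    return np.linalg.matrix_rank(np.column_stack([A, a])) == np.linalg.matrix_rank(A)

for a in ([1, 0, 1], [0, 1, 1], [1, 1, 1]):
    print(a, in_column_space(XtX, np.array(a, dtype=float)))
# prints True, True, False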


