




gested as a criterion for model choice. Note that at each stage $(n - 1)$ observations are within sample (i.e., used for estimation) and the remaining observation is out of sample (i.e., used for prediction). PRESS is the sum of squares of the out-of-sample prediction errors.

Instead of using PRESS (the sum of squares of predicted residuals) as a criterion for model choice, we can consider the sum of squares of studentized residuals as a criterion of model choice.* As discussed in Section 12.4, the studentized residual is just the predicted residual divided by its standard error. The sum of squares of studentized residuals can be denoted by SSSR and the sum of squares of predicted residuals by SSPR. This is a better terminology, in keeping with the use of the term "residual" for an "estimated error." From the derivations in Section 12.3 we note that both SSPR and SSSR are weighted sums of squares of the least squares residuals $\hat{u}_i$. The $\bar{R}^2$ criterion also involves minimizing a weighted sum of squares of least squares residuals (we minimize $\sum \hat{u}_i^2/\text{d.f.}$). To be a valid criterion of model choice, we should be able to show that the expected value of the quantity minimized is less for the true model than for the alternative models. If this is not the case, the criterion does not consistently select the true model. Leamer** shows that the $\bar{R}^2$ criterion and the SSSR criterion, which minimizes the sum of squares of studentized residuals (this is Schmidt's SSPE criterion), satisfy this test and are valid criteria, but that the PRESS criterion and the other criteria suggested in cross-validation are not valid by this test. Thus if one is interested in using predicted residuals for model choice, the best procedure appears to be not to split the sample into two parts but to derive the studentized residuals (the SAS regression program gives them) and to minimize the sum of squares of studentized residuals (SSSR) as a criterion of model choice (i.e., to use Schmidt's SSPE criterion).

*This is the criterion proposed in Peter Schmidt, "Choosing Among Alternative Linear Regression Models," Atlantic Economic Journal, 1974, pp. 7-13.

**E. E. Leamer, "Model Choice and Specification Analysis," in Z. Griliches and M. D. Intriligator (eds.), Handbook of Econometrics, Vol. I (Amsterdam: North-Holland, 1983), Chap. 5, pp. 285-330.
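Both SSPR (= PRESS) and SSSR can be computed from a single OLS fit via the diagonal of the hat matrix, without refitting the model $n$ times. The Python sketch below is our own illustration (the function name and the simulated data are hypothetical, not from the text); it uses the full-sample estimate of $\sigma^2$ in the studentized residuals:

```python
import numpy as np

def sspr_and_sssr(X, y):
    """Return (SSPR, SSSR) for the OLS regression of y on the columns of X.

    Predicted residual:   u_i / (1 - h_i)                  (leave-one-out error)
    Studentized residual: u_i / sqrt(sigma2 * (1 - h_i))   (Section 12.4)
    where u_i are the least squares residuals and h_i the hat-matrix diagonals.
    """
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta                                # least squares residuals
    h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))  # leverages h_i
    sigma2 = (u @ u) / (n - k)                      # full-sample sigma^2 estimate
    sspr = np.sum((u / (1 - h)) ** 2)               # PRESS
    sssr = np.sum(u ** 2 / (sigma2 * (1 - h)))      # Schmidt's SSPE-type criterion
    return sspr, sssr

# Example: smaller SSSR is preferred for model choice.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)
X1 = np.column_stack([np.ones(50), x])          # true specification
X2 = np.column_stack([np.ones(50), x, x**2])    # overparametrized alternative
print(sspr_and_sssr(X1, y), sspr_and_sssr(X2, y))
```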



12.10 Hausman's Specification Error Test

Hausman's specification error test*** is a general and widely used test for testing the hypothesis of no misspecification in the model.

***J. A. Hausman, "Specification Tests in Econometrics," Econometrica, Vol. 46, No. 6, November 1978, pp. 1251-1271.

Let $H_0$ denote the null hypothesis that there is no misspecification and let $H_1$ denote the alternative hypothesis that there is a misspecification (of a particular type). For instance, if we consider the regression model

$$y = \beta x + u \qquad (12.20)$$

in order to use the OLS procedure, we specify that $x$ is independent of $u$. Thus the null and alternative hypotheses are:

$H_0$: $x$ and $u$ are independent
$H_1$: $x$ and $u$ are not independent

To implement Hausman's test, we have to construct two estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ which have the following properties:

$\hat{\beta}_0$ is consistent and efficient under $H_0$ but is not consistent under $H_1$.
$\hat{\beta}_1$ is consistent under both $H_0$ and $H_1$ but is not efficient under $H_0$.

Then we consider the difference $\hat{q} = \hat{\beta}_1 - \hat{\beta}_0$. Hausman first shows that

$$\operatorname{var}(\hat{q}) = V_1 - V_0$$

where $V_1 = \operatorname{var}(\hat{\beta}_1)$ and $V_0 = \operatorname{var}(\hat{\beta}_0)$, both variances being computed under $H_0$. Let $\hat{V}(\hat{q})$ be a consistent estimate of $\operatorname{var}(\hat{q})$. Then we use

$$m = \frac{\hat{q}^2}{\hat{V}(\hat{q})}$$

as a $\chi^2$ with 1 d.f. to test $H_0$ against $H_1$. This is an asymptotic test.

We have considered only a single parameter $\beta$. In the general case where $\beta$ is a vector of $k$ parameters, $V_1$ and $V_0$ will be matrices, $\hat{\beta}_1$, $\hat{\beta}_0$, and $\hat{q}$ will all be vectors, and the Hausman test statistic is

$$m = \hat{q}'(V_1 - V_0)^{-1}\hat{q}$$

which has (asymptotically) a $\chi^2$-distribution with $k$ degrees of freedom.

Since a consideration of the $k$-parameter case involves vectors and matrices, we discuss the single-parameter case. The derivations in the $k$-parameter case are all similar.
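Computationally, the statistic above needs only the two estimates and their covariance matrices. Here is a minimal Python sketch of the general $k$-parameter form (our own illustration; the function and variable names are assumptions, and a pseudoinverse is used because $V_1 - V_0$ may be singular in practice):

```python
import numpy as np
from scipy import stats

def hausman_statistic(b1, V1, b0, V0):
    """m = q'(V1 - V0)^(-1) q, asymptotically chi-square with k d.f. under H0.

    b0, V0: estimator efficient under H0 (e.g., OLS) and its covariance matrix.
    b1, V1: estimator consistent under both hypotheses (e.g., IV) and its
            covariance matrix, both computed under H0.
    """
    q = np.asarray(b1, float) - np.asarray(b0, float)   # q-hat = b1-hat - b0-hat
    dV = np.asarray(V1, float) - np.asarray(V0, float)  # var(q-hat) = V1 - V0
    m = q @ np.linalg.pinv(dV) @ q     # pseudoinverse guards against singular dV
    return m, stats.chi2.sf(m, df=q.size)
```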

To prove the result $\operatorname{var}(\hat{q}) = V_1 - V_0$, we first have to prove the result that

$$\operatorname{cov}(\hat{\beta}_0, \hat{q}) = 0$$

The proof proceeds as follows. Under $H_0$, both $\hat{\beta}_0$ and $\hat{\beta}_1$ are consistent estimators of $\beta$. Hence we get

$$\operatorname{plim} \hat{q} = \operatorname{plim} \hat{\beta}_1 - \operatorname{plim} \hat{\beta}_0 = \beta - \beta = 0$$

Consider a new estimator of $\beta$ defined by

$$\hat{\alpha} = \hat{\beta}_0 + \lambda\hat{q}$$

where $\lambda$ is any constant. Then $\operatorname{plim} \hat{\alpha} = \beta$; thus $\hat{\alpha}$ is a consistent estimator of $\beta$ for all values of $\lambda$. Its variance is

$$\operatorname{var}(\hat{\alpha}) = V_0 + \lambda^2\operatorname{var}(\hat{q}) + 2\lambda\operatorname{cov}(\hat{\beta}_0, \hat{q}) \geq V_0$$

since $\hat{\beta}_0$ is efficient under $H_0$. Thus

$$\lambda^2\operatorname{var}(\hat{q}) + 2\lambda\operatorname{cov}(\hat{\beta}_0, \hat{q}) \geq 0 \qquad (12.21)$$



for all values of $\lambda$. We will show that the relationship (12.21) can be satisfied for all values of $\lambda$ only if $\operatorname{cov}(\hat{\beta}_0, \hat{q}) = 0$.

Suppose that $\operatorname{cov}(\hat{\beta}_0, \hat{q}) > 0$. Then by choosing $\lambda$ negative and equal to $-\operatorname{cov}(\hat{\beta}_0, \hat{q})/\operatorname{var}(\hat{q})$, we can show that the relationship (12.21) is violated: substituting this value of $\lambda$ makes the left-hand side equal to $-[\operatorname{cov}(\hat{\beta}_0, \hat{q})]^2/\operatorname{var}(\hat{q}) < 0$. Thus $\operatorname{cov}(\hat{\beta}_0, \hat{q})$ cannot be greater than zero.

Similarly, suppose that $\operatorname{cov}(\hat{\beta}_0, \hat{q}) < 0$. Then by choosing $\lambda$ positive and equal to $-\operatorname{cov}(\hat{\beta}_0, \hat{q})/\operatorname{var}(\hat{q})$, we can show that the relationship (12.21) is again violated. Thus $\operatorname{cov}(\hat{\beta}_0, \hat{q})$ can be neither greater than nor less than zero. Hence we get $\operatorname{cov}(\hat{\beta}_0, \hat{q}) = 0$.

Now since $\hat{\beta}_1 = \hat{\beta}_0 + \hat{q}$ and $\operatorname{cov}(\hat{\beta}_0, \hat{q}) = 0$, we get

$$\operatorname{var}(\hat{\beta}_1) = \operatorname{var}(\hat{\beta}_0) + \operatorname{var}(\hat{q})$$

Hence

$$\operatorname{var}(\hat{q}) = \operatorname{var}(\hat{\beta}_1) - \operatorname{var}(\hat{\beta}_0) = V_1 - V_0$$

which is the result on which Hausman's test is based.
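The two results just proved, $\operatorname{cov}(\hat{\beta}_0, \hat{q}) = 0$ and $\operatorname{var}(\hat{q}) = V_1 - V_0$, can also be checked by simulation. The sketch below is our own illustration (the sample size and parameter values are arbitrary): it draws repeated samples under $H_0$, computes OLS and IV estimates of $\beta$, and reports the empirical covariance and variances:

```python
import numpy as np

rng = np.random.default_rng(42)
beta, n, reps = 2.0, 200, 5000
b0s, b1s = np.empty(reps), np.empty(reps)
for r in range(reps):
    z = rng.normal(size=n)           # instrument
    x = z + rng.normal(size=n)       # under H0: x is independent of u
    u = rng.normal(size=n)
    y = beta * x + u
    b0s[r] = (x @ y) / (x @ x)       # OLS: efficient under H0
    b1s[r] = (z @ y) / (z @ x)       # IV: consistent under H0 and H1
qs = b1s - b0s                       # q-hat for each sample
print("cov(b0, q):", np.cov(b0s, qs)[0, 1])   # approximately 0
print("var(q):    ", np.var(qs))              # approximately V1 - V0
print("V1 - V0:   ", np.var(b1s) - np.var(b0s))
```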

An Application: Testing for Errors in Variables or Exogeneity

Consider now the model given by equation (12.20). The model can be regarded as an errors-in-variables model (Chapter 11), where $x$ is correlated with the error term $u$ because it is an error-ridden explanatory variable. In this case our interest is in testing whether there is an error in this variable or not.

Alternatively, equation (12.20) can be regarded as one equation in a simultaneous equations model (Chapter 9), where $x$ is correlated with $u$ because it is an endogenous variable. We are then interested in testing whether $x$ is exogenous or endogenous. If $x$ is not correlated with $u$, we are justified in estimating the equation by OLS.

Under $H_0$ the OLS estimator $\hat{\beta}_0 = \sum xy/\sum x^2$ is consistent and efficient, with $V_0 = \operatorname{var}(\hat{\beta}_0) = \sigma^2/\sum x^2$. Under $H_1$, $\hat{\beta}_0$ is not consistent. To get a consistent estimator of $\beta$ we have to use the instrumental variable (IV) method. Let us denote the instrumental variable by $z$. Then the IV estimator is

$$\hat{\beta}_1 = \frac{\sum zy}{\sum zx} = \beta + \frac{\sum zu}{\sum zx}$$

$\hat{\beta}_1$ is consistent under both $H_0$ and $H_1$. It is, however, less efficient than $\hat{\beta}_0$ under $H_0$, with

$$V_1 = \operatorname{var}(\hat{\beta}_1) = \frac{\sigma^2\sum z^2}{(\sum zx)^2}$$

Defining $\hat{q} = \hat{\beta}_1 - \hat{\beta}_0$, we have

$$\operatorname{var}(\hat{q}) = V_1 - V_0 = \sigma^2\left[\frac{\sum z^2}{(\sum zx)^2} - \frac{1}{\sum x^2}\right]$$
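Putting the pieces together, here is a sketch of the full test for this application (again our own illustration with arbitrary simulated values; $\sigma^2$ is estimated from the IV residuals, which remain consistent under $H_1$). Note that by the Cauchy-Schwarz inequality $V_1 \geq V_0$, so the denominator of $m$ is nonnegative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, beta = 500, 2.0
z = rng.normal(size=n)                 # instrument: correlated with x, not with u
u = rng.normal(size=n)
x = z + 0.7 * u + rng.normal(size=n)   # x endogenous under H1 (drop 0.7*u for H0)
y = beta * x + u

b0 = (x @ y) / (x @ x)                 # OLS estimator beta0-hat
b1 = (z @ y) / (z @ x)                 # IV estimator beta1-hat
sigma2 = np.mean((y - x * b1) ** 2)    # sigma^2 from IV residuals
V0 = sigma2 / (x @ x)                  # var(beta0-hat) under H0
V1 = sigma2 * (z @ z) / (z @ x) ** 2   # var(beta1-hat) under H0
m = (b1 - b0) ** 2 / (V1 - V0)         # Hausman statistic, chi-square 1 d.f.
print(f"m = {m:.2f}, p-value = {stats.chi2.sf(m, df=1):.4f}")
```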


