
11.7 SOME OTHER PROBLEMS

The Case of Multiple Equations

The case of multiple equations can be illustrated as follows. Suppose that

$$y_1 = \beta_1 x + e_1 \qquad\qquad y_2 = \beta_2 x + e_2$$

$x$ is not observed. Instead, we observe $X = x + u$. Suppose that $e_1$, $e_2$, and $u$ are mutually uncorrelated and also uncorrelated with $x$. Also, let $\operatorname{var}(e_1) = \sigma_1^2$, $\operatorname{var}(e_2) = \sigma_2^2$, $\operatorname{var}(u) = \sigma_u^2$, and $\operatorname{var}(x) = \sigma_x^2$. Then we have

$$\begin{aligned}
\operatorname{var}(y_1) &= \beta_1^2\sigma_x^2 + \sigma_1^2 &\qquad \operatorname{cov}(y_1, X) &= \beta_1\sigma_x^2 \\
\operatorname{var}(y_2) &= \beta_2^2\sigma_x^2 + \sigma_2^2 &\qquad \operatorname{cov}(y_2, X) &= \beta_2\sigma_x^2 \\
\operatorname{cov}(y_1, y_2) &= \beta_1\beta_2\sigma_x^2 &\qquad \operatorname{var}(X) &= \sigma_x^2 + \sigma_u^2
\end{aligned}$$

These six equations can be solved to get estimates of $\beta_1$, $\beta_2$, $\sigma_1^2$, $\sigma_2^2$, $\sigma_u^2$, and $\sigma_x^2$. Specifically, we have

$$\hat{\beta}_1 = \frac{\operatorname{cov}(y_1, y_2)}{\operatorname{cov}(y_2, X)} \qquad\qquad \hat{\beta}_2 = \frac{\operatorname{cov}(y_1, y_2)}{\operatorname{cov}(y_1, X)}$$

where the covariances are now sample covariances. Substituting the population moments above shows why these are consistent: the first ratio converges to $\beta_1\beta_2\sigma_x^2 / \beta_2\sigma_x^2 = \beta_1$ and the second to $\beta_1\beta_2\sigma_x^2 / \beta_1\sigma_x^2 = \beta_2$.

In effect, this is like using $y_2$ as an instrumental variable in the equation $y_1 = \beta_1 X + w_1$, and $y_1$ as an instrumental variable in the equation $y_2 = \beta_2 X + w_2$. Further elaboration of this approach of solving the errors-in-variables (and unobservable-variables) problem by increasing the number of equations can be found in Goldberger.
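These moment-based estimators are easy to check by simulation. The following is a minimal sketch (NumPy assumed; the parameter values and variable names are hypothetical, chosen only for illustration):

```python
import numpy as np

# Minimal simulation of the two-equation errors-in-variables model.
# All parameter values are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
n = 100_000
beta1, beta2 = 2.0, 0.5

x = rng.normal(0.0, 1.0, n)            # true (unobserved) regressor
X = x + rng.normal(0.0, 0.8, n)        # observed, error-ridden version
y1 = beta1 * x + rng.normal(0.0, 1.0, n)
y2 = beta2 * x + rng.normal(0.0, 1.0, n)

def cov(a, b):
    """Sample covariance of two series."""
    return np.cov(a, b)[0, 1]

beta1_ols = cov(y1, X) / X.var()       # attenuated toward zero
beta1_hat = cov(y1, y2) / cov(y2, X)   # y2 serves as the instrument
beta2_hat = cov(y1, y2) / cov(y1, X)   # y1 serves as the instrument

print(beta1_ols)   # about 1.2: biased downward
print(beta1_hat)   # about 2.0: consistent
print(beta2_hat)   # about 0.5: consistent
```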

As an illustration, suppose that

$y_1$ = expenditures on automobiles
$y_2$ = expenditures on other durables
$x$ = permanent income
$X$ = measured income

If we are given only $y_1$ and $X$ (or only $y_2$ and $X$), we are in the single-equation errors-in-variables problem and cannot get consistent estimators for $\beta_1$ (or $\beta_2$). But if we are given $y_1$, $y_2$, and $X$, we can get consistent estimators for both $\beta_1$ and $\beta_2$.

A. S. Goldberger, "Structural Equation Methods in the Social Sciences," Econometrica, 1972, pp. 979-1002.



Correlated Errors

Until now we have assumed that the errors of observation are mutually uncorrelated and also uncorrelated with the systematic parts. If we drop these assumptions, things get more complicated. For example, consider the model $y = \beta x + e$. The observed values are $X = x + u$ and $Y = y + v$, where $u$ and $v$ are the measurement errors. Let $\sigma_{xy}$ denote the covariance between $x$ and $y$, with a similar notation for all the other covariances. If the least squares estimator of $\beta$ from a regression of $Y$ on $X$ is $\hat{\beta}$, then

$$\operatorname{plim}\hat{\beta} = \frac{\operatorname{cov}(Y, X)}{\operatorname{var}(X)} = \frac{\operatorname{cov}(y + v,\ x + u)}{\operatorname{var}(x + u)} = \frac{\sigma_{xy} + \sigma_{xv} + \sigma_{yu} + \sigma_{uv}}{\sigma_{xx} + 2\sigma_{xu} + \sigma_{uu}}$$

Since $\sigma_{xy} = \beta\sigma_{xx}$ and $\sigma_{xv} = \operatorname{cov}[(y - e)/\beta,\ v] = \sigma_{yv}/\beta$ (assuming the equation error $e$ is uncorrelated with $v$), we have

$$\operatorname{plim}\hat{\beta} = \frac{\beta\sigma_{xx} + \sigma_{yu} + \sigma_{yv}/\beta + \sigma_{uv}}{\sigma_{xx} + 2\sigma_{xu} + \sigma_{uu}}$$

Now even if there is no error in $x$ (i.e., $\sigma_{uu} = 0$), we find that $\operatorname{plim}\hat{\beta} \neq \beta$, since $\sigma_{yv} \neq 0$. Thus it is not just errors in $x$ that create a problem, as in the earlier case.

One can calculate the nature of the bias in $\hat{\beta}$ by making different assumptions about the different covariances. We need not pursue this further here. What is important to note is that one can get either underestimation or overestimation of $\beta$. With economic data, where such correlations are more the rule than the exception, it is important not to believe that slope coefficients are always underestimated in the presence of errors of observation, as the classical analysis of errors-in-variables models suggests.
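To see the possibility of overestimation concretely, here is a small Monte Carlo sketch (hypothetical numbers, NumPy assumed) in which $x$ is observed without any error but $v$ is correlated with the systematic part $\beta x$; OLS then overshoots $\beta$:

```python
import numpy as np

# Monte Carlo sketch: no error in x at all, but the error in Y is
# correlated with the systematic part of y, so sigma_yv != 0.
# Numbers are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
n = 200_000
beta = 1.0

x = rng.normal(0.0, 1.0, n)
e = rng.normal(0.0, 0.5, n)
y = beta * x + e                              # true relationship

v = 0.5 * beta * x + rng.normal(0.0, 0.5, n)  # correlated with beta * x
X = x                                         # sigma_uu = 0: x observed exactly
Y = y + v                                     # observed with correlated error

beta_hat = np.cov(Y, X)[0, 1] / X.var()
print(beta_hat)   # about 1.5, not 1.0: OLS overestimates beta
```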

We have all along omitted the intercept term. If there is an intercept term $\alpha$, i.e., our true relationship is $y = \alpha + \beta x + e$, and instead we estimate $Y = \alpha + \beta X + w$, then the least squares estimator $\hat{\beta}$ underestimates $\beta$ and consequently the least squares estimator $\hat{\alpha}$ will overestimate $\alpha$. If, however, the errors do not have a zero mean [i.e., $X = x + u$ and $E(u) \neq 0$], these conclusions need not hold.
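The direction of the intercept bias follows in one step. Since $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$, writing $\mu_x$ for the mean of $x$ and assuming, as in the classical setup, that $E(u) = 0$, that $\operatorname{plim}\hat{\beta} = \beta(1 - \lambda)$, and that $\beta\mu_x > 0$ (the typical case in, say, income-expenditure applications), we have

$$\operatorname{plim}\hat{\alpha} = (\alpha + \beta\mu_x) - \beta(1 - \lambda)\mu_x = \alpha + \beta\lambda\mu_x > \alpha$$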

Summary

1. In the single-equation model with a single explanatory variable that is measured with error, the least squares estimator of $\beta$ underestimates the true $\beta$. Specifically, the bias is $-\beta\lambda$, where $\lambda$ is the proportion of the error variance in the variance of the observed $X$. This result is based on the assumption that the errors have zero means and have zero covariance with the systematic parts and among themselves.

2. We can obtain bounds for the true coefficient $\beta$ by computing the regression coefficient of $y$ on $x$ and the reciprocal of the regression coefficient of $x$ on $y$ (Section 11.2).

3. In a model with two explanatory variables $x_1$ and $x_2$ with coefficients $\beta_1$ and $\beta_2$, where only $x_1$ is measured with error, we can show that the bias in the estimator of $\beta_1$ is $-\beta_1\lambda/(1 - \rho^2)$, where $\rho$ is the correlation between the measured values of $x_1$ and $x_2$. Also, the bias in the estimator of $\beta_2$ is $(-\rho)$ times the bias in the estimator of $\beta_1$. Similar results can be derived when there are many explanatory variables. Some papers in the literature derive the expressions for the bias in terms of the correlations between the true unobserved variables. These expressions are not very useful in practice. Here we derive the expressions in terms of the correlations of the observed variables. The only unknown factor is $\lambda$, the proportion of error variance in the variance of the error-ridden variable (Section 11.3).

4. As with the model with a single explanatory variable, we can derive bounds for the true coefficients by running two regressions. These bounds are given in equations (11.6) and (11.7). However, these bounds are not comparable to confidence intervals: the estimated bounds themselves have standard errors. We have illustrated with an example that these bounds can sometimes be so wide as to be almost useless. In many problems, therefore, it is better to supplement them with estimates based on some plausible assumptions about the error variances.

5. In the model with two explanatory variables, if both variables are measured with error, the direction of the biases in the OLS estimators cannot be easily evaluated [see equations (11.9) and (11.10)]. Making the most general assumptions about the error variances often leads to bounds so wide that no inference is possible. On the other hand, making some plausible assumptions about the error variances, one can get more reasonable bounds for the parameters. This point is illustrated with an example.

6. In the application of the errors-in-variables model to problems of discrimination, the "reverse regression" has often been advocated. The arguments for and against this procedure are reviewed in Section 11.4.

7. One method of obtaining consistent estimators for the parameters in errors-in-variables models is the instrumental variable method. In practice it is rather hard to find valid instrumental variables. In time-series data, lagged values of the measured $x_t$ are often used as instrumental variables. Grouping methods are also often suggested for the estimation of errors-in-variables models. These methods can be viewed as instrumental variable methods, but their use is not recommended: except in special cases, the estimators they yield are not consistent.

8. In econometric work it is common practice to use surrogate variables for the variables we cannot measure. These surrogate variables are called proxy variables. In an equation like


