
…variables, both regressions need to be computed, and whether reverse regression alone gives the correct estimates depends on the assumptions one makes.

The usual model, in its simplest form, is that of two explanatory variables, one of which is measured with error:

$$y = \beta_1 x_1 + \beta_2 x_2 + u \tag{11.13}$$

where $y$ = salary
      $x_1$ = true qualifications
      $x_2$ = gender (in sex discrimination) or race (in race discrimination)

What we are interested in is the coefficient of $x_2$. The problem is that $x_1$ is measured with error. Let $X_1$ = measured qualifications, with

$$X_1 = x_1 + v$$

Suppose we adopt the notation that

$$x_2 = \begin{cases} 1 & \text{for men} \\ 0 & \text{for women} \end{cases}$$

Then $\beta_2 > 0$ implies that men are paid more than women with the same qualifications, and thus there is sex discrimination. A direct least squares estimation of equation (11.13), with $X_1$ substituted for $x_1$ and yielding $\hat{\beta}_2 > 0$, has frequently been used as evidence of sex discrimination. In the reverse regression

$$X_1 = \gamma_1 y + \gamma_2 x_2 + w \tag{11.14}$$

we are asking whether men are more or less qualified than women having the same salaries. The proponents of the reverse regression argue that to establish discrimination one has to have $\gamma_2 < 0$ in equation (11.14); that is, among men and women receiving equal salaries, the men possess lower qualifications.
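To make the two specifications concrete, here is a minimal simulation sketch (not from the text; the data-generating values and variable names are illustrative) that fits the direct regression (11.13), with $X_1$ in place of $x_1$, and the reverse regression (11.14):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Illustrative "truth": men (x2 = 1) have higher average true qualifications,
# and salary depends on true qualifications plus a small gender premium.
x2 = rng.integers(0, 2, n)                        # 1 for men, 0 for women
x1 = 2.0 * x2 + rng.normal(0, 1, n)               # true qualifications
y = 1.0 * x1 + 0.1 * x2 + rng.normal(0, 0.5, n)   # beta1 = 1.0, beta2 = 0.1
X1 = x1 + rng.normal(0, 1, n)                     # measured: X1 = x1 + v

ones = np.ones(n)

# Direct regression (11.13): y on X1 and x2.
b = np.linalg.lstsq(np.column_stack([ones, X1, x2]), y, rcond=None)[0]

# Reverse regression (11.14): X1 on y and x2.
g = np.linalg.lstsq(np.column_stack([ones, y, x2]), X1, rcond=None)[0]

print("direct beta2-hat:  ", b[2])  # well above the true 0.1 in this design
print("reverse gamma2-hat:", g[2])  # gamma2-hat < 0 would indicate discrimination
```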



The evidence from the reverse regression has been mixed. In some cases $\hat{\gamma}_2 < 0$ but not significant, and in some others $\hat{\gamma}_2 > 0$. Conway and Roberts consider data for 274 employees of a Chicago bank in 1976. In their analysis $y$ = log salary, and they get $\hat{\beta}_2 = 0.148$ (standard error 0.036), thus indicating that men are overpaid by about 15% compared with women with the same qualifications. In the reverse regression they get $\hat{\gamma}_2 = -0.0097$ (standard error 0.0202), thus showing no evidence of discrimination one way or the other. In another study by Abowd, Abowd, and Killingsworth, who compare wages for whites and several ethnic groups, the direct regression gave $\hat{\beta}_2 > 0$ and the indirect regression gave $\hat{\gamma}_2 > 0$ (indicating that whites are disfavored). Thus the direct regression showed discrimination and the reverse regression showed reverse discrimination.

Delores A. Conway and Harry V. Roberts, "Reverse Regression, Fairness and Employment Discrimination," Journal of Business and Economic Statistics, Vol. 1, January 1983, pp. 75–85.

A. M. Abowd, J. M. Abowd, and M. R. Killingsworth, "Race, Spanish Origin and Earnings Differentials Among Men: The Demise of Two Stylized Facts," NORC Discussion Paper 83-n, The University of Chicago, Chicago, May 1983.

Of course, in all these studies there is no single measure of qualifications, and $x_1$ is a set of variables rather than a single variable. In this case what is done in the reverse regression is to take the estimated coefficients from the direct regression, form the linear combination of the qualification variables based on these estimated coefficients, and regress this index on $y$ and $x_2$. Since the direct regression gives biased estimates of these coefficients, what we have here is a biased index of qualifications.

The usual errors-in-variables results in equations (11.4) and (11.5) show that one should not make inferences on the basis of $\hat{\beta}_2$ and $\hat{\gamma}_2$ alone but should obtain bounds for $\beta_2$ from the direct regression and reverse regression estimates. As shown in equation (11.7), these bounds depend on the sign of $\rho\beta_1$, where $\rho$ = the correlation between $X_1$ and $x_2$. Normally, one would expect $\rho > 0$ and $\beta_1 > 0$, and hence we have

$$\operatorname{plim} \tilde{\beta}_2 < \beta_2 < \operatorname{plim} \hat{\beta}_2$$

where $\tilde{\beta}_2$ is the implied estimate of $\beta_2$ from the reverse regression.
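These bounds can be checked by simulation. Rearranging equation (11.14) to put $y$ on the left shows that the implied estimate is $\tilde{\beta}_2 = -\hat{\gamma}_2/\hat{\gamma}_1$. A sketch, continuing the illustrative design used above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                      # large n, so the estimates approximate plims

x2 = rng.integers(0, 2, n)
x1 = 2.0 * x2 + rng.normal(0, 1, n)
beta1, beta2 = 1.0, 0.1
y = beta1 * x1 + beta2 * x2 + rng.normal(0, 0.5, n)
X1 = x1 + rng.normal(0, 1, n)    # measured qualifications

ones = np.ones(n)
b = np.linalg.lstsq(np.column_stack([ones, X1, x2]), y, rcond=None)[0]
g = np.linalg.lstsq(np.column_stack([ones, y, x2]), X1, rcond=None)[0]

beta2_direct = b[2]              # the upper bound in this design
beta2_implied = -g[2] / g[1]     # implied estimate from the reverse regression

print(f"{beta2_implied:.3f} < {beta2} < {beta2_direct:.3f}")
```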

Note also from equations (11.4) and (11.5) that the (asymptotic) biases depend on two factors: $\lambda = \sigma_v^2/\operatorname{var}(X_1)$ and $\rho$, the correlation between $X_1$ and $x_2$. $\lambda$ is unknown, but $\rho$ can be computed from the data. Thus one can generate different estimates of $\beta_2$ from these equations based on different assumptions about the value of $\lambda$. We will not, however, undertake this exercise here.

11.5 Instrumental Variable Methods

Consider equation (11.2). The reason we cannot use OLS is that the error $w_t$ is correlated with $x_t$. The instrumental variable method consists of finding a variable $z_t$ that is uncorrelated with $w_t$ but correlated with $x_t$, and estimating $\beta$ by $\hat{\beta}_{IV} = \sum y_t z_t / \sum x_t z_t$. The variable $z_t$ is called an "instrumental variable."
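In this simple case the estimator is just a ratio of cross products. A minimal sketch (illustrative; it assumes the series are NumPy arrays measured as deviations from their means):

```python
import numpy as np

def iv_estimate(y, x, z):
    """IV estimator: beta_IV = sum(y*z) / sum(x*z).

    z should be uncorrelated with the equation error but correlated
    with x; all series are taken as deviations from their means.
    """
    return np.sum(y * z) / np.sum(x * z)
```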

Note that in the usual regression model $y = \beta x + u$, the normal equation for the OLS estimation of $\beta$ is

$$\sum x(y - \hat{\beta} x) = 0 \tag{11.15}$$

This is the sample analog of the assumption we make that $\operatorname{cov}(x, u) = 0$. If this assumption is violated, we cannot use the normal equation (11.15). However, if we have a variable $z$ such that $\operatorname{cov}(z, u) = 0$, we replace the normal equation (11.15) by

$$\sum z(y - \hat{\beta} x) = 0$$

Solving this for $\hat{\beta}$ gives $\hat{\beta}_{IV} = \sum zy / \sum zx$, which is the instrumental variable (IV) estimator. We can show that the IV estimator is consistent:

$$\operatorname{plim} \hat{\beta}_{IV} = \operatorname{plim} \frac{\sum z(\beta x + u)}{\sum zx} = \beta + \frac{\operatorname{plim}\,(1/n)\sum zu}{\operatorname{plim}\,(1/n)\sum zx} = \beta + \frac{\operatorname{cov}(z, u)}{\operatorname{cov}(z, x)} = \beta$$

since $\operatorname{cov}(z, u) = 0$ and $\operatorname{cov}(z, x) \neq 0$. This is the reason we want $z$ to be uncorrelated with $u$ but correlated with $x$: the first condition makes the bias term vanish, and the second keeps the denominator from vanishing. It is often suggested that $z$ is a "good" instrument if it is highly correlated with $x$.

In practice it is rather hard to find valid instrumental variables. Usually, the instrumental variables are some variables that happen to be "around," that is, whose data are available but which do not belong in the equation. An illustration of this is the study by Griliches and Mason, who estimate an earnings function of the form

$$y = \alpha + \beta s + \gamma a + \delta x + u$$

where $y$ is the log wage, $s$ the schooling, $a$ the ability, $x$ the other variables, and $u$ the error term. They substituted an observed test score $t$ for the unobserved ability variable and assumed that it was measured with a random error. They then used a set of instrumental variables such as parental status, region of origin, and so on. The crucial assumption is that these variables do not belong in the earnings function explicitly; that is, they enter only through their influence on the ability variable. Some such assumption is often needed to justify the use of instrumental variables.
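The computation behind such a study can be sketched as two-stage least squares, the standard way of using several instruments at once (an illustration of the method, not the authors' actual code or data; the variable names are hypothetical):

```python
import numpy as np

def two_sls(y, X, Z):
    """Two-stage least squares.

    X = regressor matrix, including the error-ridden proxy (the test score);
    Z = instrument matrix: the exogenous regressors plus the excluded
        instruments (e.g., parental status, region of origin).
    Both should include a constant column.  Correct standard errors would
    need the usual 2SLS adjustment, which is not shown here.
    """
    Xhat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # first stage: fit X on Z
    return np.linalg.lstsq(Xhat, y, rcond=None)[0]   # second stage: y on Xhat

# Hypothetical usage, in the spirit of Griliches-Mason:
#   X = [const, race, schooling, test_score]
#   Z = [const, race, schooling, parent_status, region_of_origin_dummies]
#   coeffs = two_sls(log_wage, X, Z)
```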

In the Griliches-Mason study the OLS estimation (the coefficients of the "other variables" are not reported) gave

$$\hat{y} = \text{const.} + \underset{(0.0458)}{0.1982}\,(\text{race}) + \underset{(0.0067)}{0.0331}\,(\text{schooling}) + \underset{(0.00058)}{0.00298}\,(\text{test score})$$

(Figures in parentheses are standard errors.) The IV estimation gave

$$\hat{y} = \text{const.} + \underset{(0.0468)}{0.0730}\,(\text{race}) + \underset{(0.0065)}{0.0483}\,(\text{schooling}) + \underset{(0.00078)}{0.00889}\,(\text{test score})$$

The IV estimation thus gave a much higher estimate of the ability coefficient and a lower estimate of the race coefficient.

In the case of time series data, lagged values of the measured $X_t$ are often used as instrumental variables. If $X_t = x_t + u_t$ and the measurement errors $u_t$ …

Z. Griliches and W. M. Mason, "Education, Income and Ability," Journal of Political Economy, Vol. 80, No. 3, Part 2, May 1972, pp. S74–S103.



