and we are interested in determining if there is sex discrimination in salaries, we can ask:

1. Whether men and women with the same qualifications (value of $x$) are getting the same salaries (value of $y$). This question is answered by the direct regression, the regression of $y$ on $x$. Alternatively, we can ask:

2. Whether men and women with the same salaries (value of $y$) have the same qualifications (value of $x$). This question is answered by the reverse regression, the regression of $x$ on $y$.

In this example both questions make sense and hence we have to look at both these regressions. (We discuss this problem further in Chapter 11.) For the reverse regression the regression equation can be written as

$$x_i = \alpha' + \beta' y_i + v_i$$

where the $v_i$ are errors that satisfy assumptions similar to those stated earlier in Section 3.2 for the $u_i$. Interchanging $x$ and $y$ in the formulas that we derived, we get

$$\hat{\beta}' = \frac{S_{xy}}{S_{yy}} \qquad \text{and} \qquad \hat{\alpha}' = \bar{x} - \hat{\beta}'\bar{y}$$

Denoting the residual sum of squares in this case by RSS, we have

$$\text{RSS} = S_{xx} - \frac{S_{xy}^2}{S_{yy}}$$

Note that

$$\hat{\beta}\hat{\beta}' = \frac{S_{xy}^2}{S_{xx}S_{yy}} = r_{xy}^2$$

Hence if $r_{xy}^2$ is close to 1, the two regression lines will be close to each other. We now illustrate with an example.
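This relation is easy to verify numerically. The following Python fragment is a minimal sketch on simulated data (not the data of Table 3.2); it computes both slopes from the sums of squares and checks that their product equals $r_{xy}^2$, and that the residual sum of squares of the reverse regression equals $S_{xx} - S_{xy}^2/S_{yy}$.

```python
# A numerical check, on simulated data, that the product of the two slopes
# equals r_xy^2 and that RSS for the reverse regression is S_xx - S_xy^2 / S_yy.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(8.0, 2.0, size=100)
y = 3.0 + 0.8 * x + rng.normal(0.0, 1.0, size=100)

n = len(x)
S_xx = np.sum(x ** 2) - n * x.mean() ** 2
S_yy = np.sum(y ** 2) - n * y.mean() ** 2
S_xy = np.sum(x * y) - n * x.mean() * y.mean()

beta_hat = S_xy / S_xx            # slope of the regression of y on x
beta_prime = S_xy / S_yy          # slope of the regression of x on y
r_sq = S_xy ** 2 / (S_xx * S_yy)  # squared correlation coefficient

print(beta_hat * beta_prime, r_sq)              # the two numbers agree

# Reverse-regression residual sum of squares computed two ways:
alpha_prime = x.mean() - beta_prime * y.mean()
rss_reverse = np.sum((x - alpha_prime - beta_prime * y) ** 2)
print(rss_reverse, S_xx - S_xy ** 2 / S_yy)     # the two numbers agree
```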

Illustrative Example

Consider the data in Table 3.2. The data are for 10 workers, where

$x$ = labor-hours of work
$y$ = output

We wish to determine the relationship between output and labor-hours of work. We have




[Table 3.2 gives the observations on $x$ and $y$ for the 10 workers; the column totals are $\sum x = 80$, $\sum y = 96$, $\sum x^2 = 668$, $\sum xy = 789$, and $\sum y^2 = 952$.]
$$S_{xx} = 668 - 10(8)^2 = 668 - 640 = 28$$
$$S_{xy} = 789 - 10(8)(9.6) = 789 - 768 = 21$$
$$S_{yy} = 952 - 10(9.6)^2 = 952 - 921.6 = 30.4$$

$$\hat{\beta} = \frac{S_{xy}}{S_{xx}} = \frac{21}{28} = 0.75 \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 9.6 - 0.75(8.0) = 3.6$$

Hence the regression of $y$ on $x$ is

$$\hat{y} = 3.6 + 0.75x$$
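The arithmetic above can be reproduced with a few lines of Python, using only the summary statistics quoted in the text ($n = 10$, $\bar{x} = 8.0$, $\bar{y} = 9.6$, $\sum x^2 = 668$, $\sum xy = 789$):

```python
# Reproducing the direct regression of y on x from the summary statistics
# given in the text: n = 10, x_bar = 8.0, y_bar = 9.6, sum of x^2 = 668,
# sum of xy = 789.
n, x_bar, y_bar = 10, 8.0, 9.6
sum_x2, sum_xy = 668.0, 789.0

S_xx = sum_x2 - n * x_bar ** 2          # 668 - 640 = 28
S_xy = sum_xy - n * x_bar * y_bar       # 789 - 768 = 21

beta_hat = S_xy / S_xx                  # 21 / 28 = 0.75
alpha_hat = y_bar - beta_hat * x_bar    # 9.6 - 0.75(8.0) = 3.6

print(f"y_hat = {alpha_hat:.2f} + {beta_hat:.2f} x")   # y_hat = 3.60 + 0.75 x
```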

Since $x$ is labor-hours of work and $y$ is output, the slope coefficient 0.75 measures the marginal productivity of labor. As for the intercept 3.6, it means that output will be 3.6 when labor-hours of work is zero! Clearly, this does not make sense. However, this merely illustrates the point we made earlier: that we should not try to get predicted values of $y$ too far out of the range of sample values. Here $x$ ranges from 5 to 10. As for the reverse regression we have

$$\hat{\beta}' = \frac{S_{xy}}{S_{yy}} = \frac{21}{30.4} = 0.69 \qquad \hat{\alpha}' = \bar{x} - \hat{\beta}'\bar{y} = 8.0 - 0.69(9.6) = 1.37$$

Hence the regression of $x$ on $y$ is

$$\hat{x} = 1.37 + 0.69y$$
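The reverse regression can be reproduced in the same way from the sums of squares already obtained ($S_{xx} = 28$, $S_{yy} = 30.4$, $S_{xy} = 21$); the sketch below also anticipates the check, noted next, that the product of the two slopes equals $r_{xy}^2$:

```python
# The reverse regression of x on y, computed from the sums of squares
# obtained above: S_yy = 30.4, S_xy = 21, with x_bar = 8.0 and y_bar = 9.6.
S_xx, S_yy, S_xy = 28.0, 30.4, 21.0
x_bar, y_bar = 8.0, 9.6

beta_prime = S_xy / S_yy                    # 21 / 30.4, approx. 0.69
alpha_prime = x_bar - beta_prime * y_bar    # 8.0 - 0.69(9.6), approx. 1.37
print(f"x_hat = {alpha_prime:.2f} + {beta_prime:.2f} y")

# The product of the two slopes equals r_xy^2 (approx. 0.52):
beta_hat = S_xy / S_xx                      # 0.75
print(beta_hat * beta_prime, S_xy ** 2 / (S_xx * S_yy))
```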

Note that the product of the two slopes is $\hat{\beta}\hat{\beta}' = 0.75(0.69) = 0.52 = r_{xy}^2$. These two regression lines are presented in Figure 3.3. The procedure used in the two regressions is illustrated in Figure 3.4. If we consider a scatter diagram of the observations, the procedure of minimizing

$$\sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$

amounts to passing a line through the observations so as to minimize the sum of squares of the vertical distances of the points in the scatter diagram from the line. This is shown in Figure 3.4(a). The line shows the regression of $y$ on $x$.

On the other hand, the procedure of minimizing

$$\sum_{i=1}^{n} (x_i - \alpha' - \beta' y_i)^2$$

amounts to passing a line through the observations so as to minimize the sum of squares of the horizontal distances of the points in the scatter diagram from the line. This is shown in Figure 3.4(b). The line shows the regression of $x$ on $y$.

We can also think of passing a line in such a way that we minimize the sum of squares of the perpendicular distances of the points from the line. This is called orthogonal regression.
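Although we do not pursue orthogonal regression here, a rough sketch may help fix the idea: the perpendicular-distance line passes through $(\bar{x}, \bar{y})$ in the direction of the first principal component of the centered data. The Python fragment below illustrates this on simulated data (the numbers are purely illustrative and unrelated to Table 3.2):

```python
# A rough sketch of orthogonal regression: the fitted line passes through
# (x_bar, y_bar) in the direction of the first principal component of the
# centered data, which minimizes the sum of squared perpendicular distances.
# The data here are simulated and purely illustrative.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(8.0, 2.0, size=50)
y = 3.0 + 0.8 * x + rng.normal(0.0, 1.0, size=50)

centered = np.column_stack([x - x.mean(), y - y.mean()])
_, _, vt = np.linalg.svd(centered, full_matrices=False)
dx, dy = vt[0]                      # direction of the first principal component

slope = dy / dx
intercept = y.mean() - slope * x.mean()
print(f"orthogonal regression line: y = {intercept:.2f} + {slope:.2f} x")
```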

Since a discussion of orthogonal regression is beyond the scope of this book, we will confine our discussion to only the other two regressions: regression of

Figure 3.3. Regression lines for the regression of $y$ on $x$ and of $x$ on $y$.


