
Note that W₁ is that part of x₁ that is left after removing the effect of x₂ on x₁.

Step 2. Now regress y on W₁. The regression coefficient is nothing but β̂₁, which we derived earlier in the multiple regression [see (4.9)]. To see this, note that the regression coefficient of y on W₁ is cov(y, W₁)/var(W₁). From (4.11) we have (since ΣW₁ = 0)

n V(W₁) = ΣW₁² = S₁₁ - 2b₁₂S₁₂ + b₁₂²S₂₂

but b₁₂ = S₁₂/S₂₂, hence n V(W₁) = S₁₁ - S₁₂²/S₂₂. Also,

n cov(y, W₁) = Σ yW₁ = S₁y - b₁₂S₂y

Now substitute b₁₂ = S₁₂/S₂₂. We get, on simplification,

n cov(y, W₁) = S₁y - S₁₂S₂y/S₂₂

Hence we get

n cov(y, W₁)/n V(W₁) = (S₂₂S₁y - S₁₂S₂y)/(S₁₁S₂₂ - S₁₂²)

which is the expression for β̂₁ we got in (4.9).

Suppose that we eliminate the effect of x₂ on y as well. Let V₁ be the residual from a regression of y on x₂. If we now regress V₁ on W₁, the regression coefficient we get will be the same as that obtained from a regression of y on W₁. This is because

V₁ = (y - ȳ) - b_y2(x₂ - x̄₂)

where b_y2 is the regression coefficient of y on x₂. However, x₂ is uncorrelated with W₁ (a residual is uncorrelated with the regressor). Hence

cov(V₁, W₁) = cov(y, W₁)

Thus a regression of V₁ on W₁ will produce the same estimate β̂₁ as a regression of y on W₁. Of course, the standard errors of β̂₁ from the two regressions will be different because var(V₁) < var(y).

This result is important and useful in trend elimination and seasonal adjustment of time-series data. What it implies is that if we have an explained variable y and an explanatory variable x, and there is another nuisance variable Z that influences both y and x, the "pure" effect of x on y after eliminating the effect of this nuisance variable Z on both x and y can be estimated simply by estimating the multiple regression equation

y = α + βx + γZ + u

The coefficient β gives us the "pure" effect needed. We do not have to run a regression of y on Z and a regression of x on Z to eliminate the effect of Z on these variables, and then a third regression of the "purified" y on the "purified" x. Although we have proved the result for only two variables, the result is general: x and Z can both be sets of variables instead of single variables. (For a general model this is proved in the appendix to this chapter under the title "Prior Adjustment.")
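This equivalence is easy to verify numerically. The following Python sketch (our own illustration, not part of the text; the data are simulated and the variable names are ours) checks that β̂₁ from (4.9) coincides with the coefficient from regressing y on W₁ and with the coefficient from regressing V₁ on W₁.

```python
# Sketch (not from the text): numerical check that the coefficient of y on W1
# (and of V1 on W1) equals beta_1_hat from the multiple regression formula (4.9).
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)          # x1 correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Deviations from means and the S-quantities used in the text
d1, d2, dy = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()
S11, S22, S12 = d1 @ d1, d2 @ d2, d1 @ d2
S1y, S2y = d1 @ dy, d2 @ dy

beta1_multiple = (S22 * S1y - S12 * S2y) / (S11 * S22 - S12**2)   # equation (4.9)

b12 = S12 / S22
W1 = d1 - b12 * d2                          # residual of x1 on x2 (Step 1)
beta1_from_W1 = (dy @ W1) / (W1 @ W1)       # regress y on W1 (Step 2)

b_y2 = S2y / S22
V1 = dy - b_y2 * d2                         # residual of y on x2
beta1_from_V1 = (V1 @ W1) / (W1 @ W1)       # regress V1 on W1

print(beta1_multiple, beta1_from_W1, beta1_from_V1)   # all three agree
```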



The squared partial correlation coefficient between y and x₁, denoted r²_{y1.2} (partial correlations are discussed further below), can be computed directly from the t-ratio of β̂₁ in the multiple regression:

r²_{y1.2} = t₁²/[t₁² + (n - 3)]

where t₁ = β̂₁/SE(β̂₁). The derivation of this formula is omitted here since it is somewhat tedious.

The interpretation of r²_{y2.1} and the formula associated with it are similar. Thus we have

r²_{y2.1} = t₂²/[t₂² + (n - 3)]

where t₂ = β̂₂/SE(β̂₂).
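As a numerical check of this formula (a sketch with simulated data, not from the text), the code below computes t₁ with plain numpy and compares t₁²/[t₁² + (n - 3)] with the squared correlation between the residuals V₁ and W₁, which is r²_{y1.2} computed from its definition.

```python
# Sketch (not from the text): verify r^2_{y1.2} = t1^2 / (t1^2 + (n - 3)).
import numpy as np

rng = np.random.default_rng(1)
n = 50
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 2.0 + 1.0 * x1 + 0.8 * x2 + rng.normal(size=n)

# Multiple regression of y on (1, x1, x2) and the t-ratio of beta_1_hat
X = np.column_stack([np.ones(n), x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta
sigma2 = resid @ resid / (n - 3)                  # d.f. = n - 3
se_beta1 = np.sqrt(sigma2 * XtX_inv[1, 1])
t1 = beta[1] / se_beta1

r2_from_t = t1**2 / (t1**2 + (n - 3))

# Partial correlation from its definition: corr(V1, W1)
W1 = x1 - np.poly1d(np.polyfit(x2, x1, 1))(x2)    # residual of x1 on x2
V1 = y - np.poly1d(np.polyfit(x2, y, 1))(x2)      # residual of y on x2
r2_direct = np.corrcoef(V1, W1)[0, 1] ** 2

print(r2_from_t, r2_direct)                       # the two agree
```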

Illustrative Example

In the illustrative example in Section 4.3 we have

t₁ = 6.863   t₂ = 1.961

and n - 3 = 20. Hence

r²_{y1.2} = (6.863)²/[(6.863)² + 20] = 47.10/67.10 = 0.702

r²_{y2.1} = (1.961)²/[(1.961)² + 20] = 3.846/23.846 = 0.161
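These numbers can be reproduced with a short computation (a sketch, using only the t-values quoted above):

```python
# Sketch: reproduce the partial r^2 values quoted above from t1, t2 and n - 3 = 20.
t1, t2, df = 6.863, 1.961, 20
r2_y1_2 = t1**2 / (t1**2 + df)
r2_y2_1 = t2**2 / (t2**2 + df)
print(round(r2_y1_2, 3), round(r2_y2_1, 3))   # 0.702 0.161
```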

"For a general model this is proved in the appendix to this chapter under the title "Prior Adjustment."

The proof can easily be constructed using the result in G. S. Maddala, Econometrics (New York: McGraw-Hill, 1977), p. 462.

The coefficient (3 gives us the "pure" effect needed. We do not have to run a regression of on Z and x on Z to eliminate the effect of Z on these variables and then a third regression of "purified" on "purified" x.

Although we have proved the result for only two variables, the result is general. X and Z can both be sets of variables (instead of single variables)."

Finally, the standard error of β̂₁ obtained from the estimation of the multiple regression (4.1) is also the same as the standard error of the regression coefficient obtained from the simple regression of y on W₁. This result is somewhat tedious to prove, so we omit the proof. (The proof can be constructed using the result in G. S. Maddala, Econometrics (New York: McGraw-Hill, 1977), p. 462.)

The correlation coefficient between y and W₁ is called the partial correlation coefficient between y and x₁. It is a partial correlation in the sense that the effect of x₂ has been removed. It will be denoted by r_{y1.2}; the subscript after the dot denotes the variable whose effect has been removed. Corresponding to the relationship between r² and t² given in Chapter 3 (end of Section 3.6), we have the formula

r²_{y1.2} = t₁²/(t₁² + d.f.)

The degrees of freedom d.f. = (number of observations) - (number of regression parameters estimated) = (n - 3) in this case. The variables to be included ...

(We are using the term coefficient of determination to denote the square of the coefficient of correlation.)

The General Case

In the general case of k regressors we define t_i = β̂_i/SE(β̂_i) for i = 1, 2, . . . , k. Then the partial r² for the variable x_i is

r²_{yi.(other x's)} = t_i²/[t_i² + (n - k - 1)]
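A sketch of this computation for all k regressors at once (our own illustration with simulated data, not from the text):

```python
# Sketch (not from the text): partial r^2 for each regressor from its t-ratio,
# r^2_i = t_i^2 / (t_i^2 + (n - k - 1)), for a model with k regressors and an intercept.
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 3
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([0.5, -1.0, 0.2]) + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])              # add the intercept column
XtX_inv = np.linalg.inv(Xd.T @ Xd)
beta = XtX_inv @ Xd.T @ y
resid = y - Xd @ beta
sigma2 = resid @ resid / (n - k - 1)               # d.f. = n - k - 1
se = np.sqrt(sigma2 * np.diag(XtX_inv))
t = beta / se

partial_r2 = t[1:]**2 / (t[1:]**2 + (n - k - 1))   # skip the intercept's t-ratio
print(partial_r2)
```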

4.5 Partial Correlations and Multiple Correlation

If we have an explained variable y and three explanatory variables x₁, x₂, x₃, and r²_{y1}, r²_{y2}, r²_{y3} are the squares of the simple correlations between y and x₁, x₂, x₃, respectively, then r²_{y1}, r²_{y2}, and r²_{y3} measure the proportion of the variance in y that x₁ alone, x₂ alone, or x₃ alone explains. On the other hand, R²_{y.123} measures the proportion of the variance of y that x₁, x₂, x₃ together explain. We would also like to measure something else. For instance, how much does x₂ explain after x₁ is included in the regression equation? How much does x₃ explain after x₁ and x₂ are included? These are measured by the partial coefficients of determination r²_{y2.1} and r²_{y3.12}, respectively. The variables after the dot are the variables already included. With three explanatory variables we have the following partial correlations: r_{y1.2}, r_{y1.3}, r_{y2.1}, r_{y2.3}, r_{y3.1}, and r_{y3.2}. These are called partial correlations of the first order. We also have three partial correlation coefficients of the second order: r_{y1.23}, r_{y2.13}, and r_{y3.12}. The variables after the dot are always the variables already included in the regression equation. The order of the partial correlation coefficient depends on the number of variables after the dot. The usual convention is to denote simple and partial correlations by a small r and multiple correlations by a capital R. For instance, R²_{y.12}, R²_{y.13}, and R²_{y.123} are all coefficients of multiple determination (their positive square roots are multiple correlation coefficients).
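To make these definitions concrete, here is a sketch (our own illustration; the data are simulated and the helper functions r2 and resid are ours) that computes a simple r², the multiple R²_{y.123}, and a first- and a second-order partial r² from the residual-based definition.

```python
# Sketch (not from the text): simple r^2, multiple R^2, and partial r^2 via residualizing.
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.8 * x1 + 0.5 * x2 - 0.3 * x3 + rng.normal(size=n)

def r2(v, X):
    """R^2 from an OLS regression of v on the columns in X (intercept added)."""
    Xd = np.column_stack([np.ones(len(v))] + list(X))
    e = v - Xd @ np.linalg.lstsq(Xd, v, rcond=None)[0]
    return 1 - (e @ e) / ((v - v.mean()) @ (v - v.mean()))

def resid(v, X):
    """Residual of v after regressing it on the columns in X (intercept added)."""
    Xd = np.column_stack([np.ones(len(v))] + list(X))
    return v - Xd @ np.linalg.lstsq(Xd, v, rcond=None)[0]

r2_y1 = np.corrcoef(y, x1)[0, 1] ** 2                    # simple r^2 between y and x1
R2_y123 = r2(y, [x1, x2, x3])                            # multiple R^2_{y.123}
r2_y2_1 = np.corrcoef(resid(y, [x1]), resid(x2, [x1]))[0, 1] ** 2          # r^2_{y2.1}
r2_y3_12 = np.corrcoef(resid(y, [x1, x2]), resid(x3, [x1, x2]))[0, 1] ** 2 # r^2_{y3.12}

print(r2_y1, R2_y123, r2_y2_1, r2_y3_12)
```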

How do we compute the partial correlation coefficients? For this we use the relationship between r² and t². For example, to compute r²_{y2.3} we have to consider the multiple regression of y on x₂ and x₃. Let the estimated regression equation be

ŷ = α̂ + β̂₂x₂ + β̂₃x₃

Let t₂ = β̂₂/SE(β̂₂) from this equation. Then

r²_{y2.3} = t₂²/[t₂² + (n - 3)]
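A sketch of this last computation (our own illustration with simulated data): regress y on x₂ and x₃, form t₂, and apply the formula with n - 3 degrees of freedom.

```python
# Sketch (not from the text): compute r^2_{y2.3} from the t-ratio of beta_2_hat
# in a regression of y on x2 and x3 (with intercept), using d.f. = n - 3.
import numpy as np

rng = np.random.default_rng(4)
n = 80
x2, x3 = rng.normal(size=(2, n))
y = 0.5 + 1.2 * x2 - 0.7 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta
sigma2 = resid @ resid / (n - 3)
t2 = beta[1] / np.sqrt(sigma2 * XtX_inv[1, 1])

r2_y2_3 = t2**2 / (t2**2 + (n - 3))
print(r2_y2_3)
```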


