




"E. E. Leamer, "A Result on the Sign of the Restricted Lease Squares Estimates," Journal of Econometrics, Vol. 3, 1975, pp. 387-390.

less than √k, cannot be increased by discarding k independent variables at a time.

However, if there are k or more independent variables with absolute t-values less than √k, the F-ratio may or may not be less than 1, and hence we may or may not be able to increase R̄² by discarding these variables. But if R̄² is increased, the variables to be discarded must come from the set of independent variables with absolute t-values less than √k.

As an illustration, consider the case k = 7. Since √7 ≈ 2.6, if the F-ratio is less than 1, all we can say about the t-ratios is that they are all less than 2.6 in absolute value. But this means that we can have all the t-ratios significant and yet have R̄² rise by dropping all seven variables.

As yet another example, consider a regression equation with five independent variables and t-ratios of 1.2, 1.5, 1.6, 2.3, and 2.7. Note that √2 ≈ 1.414, √3 ≈ 1.732, and √5 ≈ 2.236. We consider k = 1, 2, 3, 4, 5 and check whether there are k t-ratios less than √k. We note that this is the case only for k = 3. Thus if we can increase R̄² at all, it is by dropping the three variables with the smallest t-ratios, x₁, x₂, and x₃. All we have to do is run the regression with these three variables excluded and check whether R̄² has increased.
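To see the mechanics, here is a minimal sketch, assuming the statsmodels library; the data are simulated and all variable names are illustrative. It sorts the slope t-ratios, looks for a k such that the k smallest |t|-values all fall below √k, and checks whether dropping those k variables raises the adjusted R²:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.normal(size=(n, p))
y = X[:, 3] + X[:, 4] + rng.normal(scale=2.0, size=n)  # first three columns are irrelevant

full = sm.OLS(y, sm.add_constant(X)).fit()
t = np.abs(full.tvalues[1:])        # |t|-ratios of the p slope coefficients
order = np.argsort(t)               # variables sorted by |t|, smallest first

for k in range(1, p + 1):
    # Are there k variables with |t| < sqrt(k)?  (Checking the k smallest suffices.)
    if np.all(t[order[:k]] < np.sqrt(k)):
        keep = sorted(order[k:])
        sub = sm.OLS(y, sm.add_constant(X[:, keep])).fit()
        print(f"k={k}: dropping the {k} smallest-|t| variables changes "
              f"adjusted R^2 from {full.rsquared_adj:.4f} to {sub.rsquared_adj:.4f}")
```

As the discussion above notes, the condition is necessary rather than sufficient: a candidate set found this way may still leave the F-ratio above 1, so the comparison of the two adjusted R² values is the actual check.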

The point in all this discussion is that in multiple regression equations one has to be careful in drawing conclusions from individual t-ratios. In particular, this is so when analyzing the effect on R̄² of the deletion or addition of sets of variables. Often, in applied work it is customary to run a "stepwise" regression where explanatory variables are entered into the equation sequentially (in an order determined by the maximum partial correlation at each stage) and to stop at the point where R̄² stops increasing. What the previous discussion shows is that it might still be possible to increase R̄² by introducing a set of variables together.

Thus there are problems with maximizing R̄². But if one is going to do it, the relationship between the t- and F-ratios we have discussed will be of some help. The rationale behind the maximization of R̄² is discussed in Chapter 12.

A Cautionary Note on the Omission of Nonsignificant Variables: Finally, there is one other result that needs to be noted regarding the procedure of deleting variables whose coefficients are not "significant." Often researchers are perturbed by "wrong" signs for some of the coefficients, and in an effort to obtain the hopefully "right" signs, statistically insignificant variables are dropped. Surprisingly enough, there can be no change in the sign of any coefficient that is more significant than the coefficient of the omitted variable. Leamer shows that the constrained least squares estimate of βᵢ must lie in the interval (β̂ᵢ - t·Sᵢ, β̂ᵢ + t·Sᵢ), where

β̂ᵢ = unconstrained estimate of βᵢ
Sᵢ = standard error of β̂ᵢ
t = absolute t-value for the deleted variable



We will not go through the proof here; it can be found in Leamer's article. The result enables us to predict the possible sign changes in the coefficients of the retained variables when one of the variables is deleted: since the interval (β̂ᵢ - t·Sᵢ, β̂ᵢ + t·Sᵢ) excludes zero whenever |β̂ᵢ|/Sᵢ > t, a retained coefficient whose absolute t-value exceeds that of the deleted variable cannot change sign.
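The bound is easy to confirm numerically. The sketch below, again assuming statsmodels, with simulated data and illustrative names, deletes one regressor and verifies that every retained estimate stays inside β̂ᵢ ± t·Sᵢ:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))                 # columns x1, x2, x3, x4
y = X @ np.array([0.5, -0.3, 0.1, 0.8]) + rng.normal(size=100)

full = sm.OLS(y, sm.add_constant(X)).fit()
t_del = abs(full.tvalues[3])                  # |t|-value of x3, the variable to delete

restr = sm.OLS(y, sm.add_constant(np.delete(X, 2, axis=1))).fit()

# Each retained slope must lie in (b_i - t*S_i, b_i + t*S_i).
for name, i, b_new in zip(["x1", "x2", "x4"], [1, 2, 4], restr.params[1:]):
    b, s = full.params[i], full.bse[i]
    print(f"{name}: constrained estimate {b_new:.3f} lies in "
          f"({b - t_del * s:.3f}, {b + t_del * s:.3f})")
```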

As an illustration, consider the problem of estimating the demand for Ceylonese tea in the United States. This example is discussed in Rao and Miller. The following is the list of variables used:

Tea = demand for Ceylonese tea in the United States

Y = disposable income

Pc = price of Ceylonese tea

Pb = price of Brazilian coffee, considered a substitute

Y, Pc, and Pb are deflated by the price of food commodities in the United States. All equations are estimated in log-linear form. The results are

log Tea = 3.95 + 0.14 log Pb + 0.75 log Y + 0.05 log Pc

(1.99) (0.14) (0.24) (0.41)

(Figures in parentheses are standard errors.) The coefficient of log Pc has the wrong sign, although it is not significantly different from zero. However, dropping the variable log Pb, we get

log Tea = 3.22 + 0.67 log Y + 0.04 log Pc

(2.02) (0.25) (0.42)

Another alternative is to drop the variable log Pc, arguing that the demand for Ceylonese tea is price inelastic. This procedure gives us the result

log Tea = 3.73 + 0.14 log Pb + 0.73 log Y

(0.71) (0.13) (0.14)

However, the correct solution to the problem of a wrong sign for log Pc is neither to drop that variable nor to drop log Pb, but to see whether some other relevant variables have been omitted. In this case the inclusion of the variable

Pi = price of Indian tea, which is a close substitute for Ceylonese tea

produces more meaningful results. The results are now

log Tea = 2.84 + 0.19 log Pb + 0.26 log Y - 1.48 log Pc + 1.18 log Pi

(2.00) (0.13) (0.37) (0.98) (0.69)

Note that the coefficient of log Pc is now negative, and the income elasticity has dropped considerably (from 0.73 to 0.26) and is not significant.

Of course, in this case the variable Pi should have been included in the first place rather than as an afterthought. By Leamer's rule, the deletion of log Y from the last equation will not change the signs of any of the other coefficients. The resulting equation is

log Tea = 1.85 + 0.20 log Pb - 2.10 log Pc + 1.56 log Pi

(1.39) (0.13) (0.39) (0.42)

Now log Pc and log Pi are both significant and have the correct signs.
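Leamer's rule can be verified from the printed output alone. For the deleted variable log Y, t = 0.26/0.37 ≈ 0.70, so each retained coefficient of the five-variable equation can move by at most 0.70 times its standard error. A short back-of-the-envelope sketch, using only the estimates and standard errors reported above:

```python
# Coefficients and standard errors from the five-variable equation above.
coefs = {"log Pb": (0.19, 0.13), "log Y": (0.26, 0.37),
         "log Pc": (-1.48, 0.98), "log Pi": (1.18, 0.69)}
t_del = abs(coefs["log Y"][0] / coefs["log Y"][1])   # ~0.70 for the deleted log Y

for name, (b, s) in coefs.items():
    if name != "log Y":
        print(f"{name}: Leamer interval ({b - t_del * s:.2f}, {b + t_del * s:.2f})")
```

The intervals come out roughly (0.10, 0.28) for log Pb, (-2.17, -0.79) for log Pc, and (0.70, 1.66) for log Pi; all exclude zero, and the new estimates 0.20, -2.10, and 1.56 indeed fall inside them.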

"Potluri Rao and Roger Miller, Applied Econometrics (Belmont, Calif.: Wadsworth, 1971), pp. 38-0.



4.11 Tests for Stability

When we estimate a multiple regression equation and use it for predictions at future points of time, we assume that the parameters are constant over the entire time period of estimation and prediction. To test this hypothesis of parameter constancy (or stability) some tests have been proposed. These tests can be described as:

1. Analysis-of-variance tests.

2. Predictive tests.

The Analysis-of-Variance Test

Suppose that we have two independent sets of data with sample sizes n₁ and n₂, respectively. The regression equation is

y = α₁ + β₁₁x₁ + β₁₂x₂ + ··· + β₁ₖxₖ + u for the first set

y = α₂ + β₂₁x₁ + β₂₂x₂ + ··· + β₂ₖxₖ + u for the second set

For the β's the first subscript denotes the data set and the second subscript denotes the variable. A test for stability of the parameters between the populations that generated the two data sets is a test of the hypothesis

H₀: β₁₁ = β₂₁, β₁₂ = β₂₂, ..., β₁ₖ = β₂ₖ, α₁ = α₂

If this hypothesis is true, we can estimate a single equation for the data set obtained by pooling the two data sets.

The F-test we use is the F-test described in Section 4.8, based on URSS and RRSS. To get the unrestricted residual sum of squares we estimate the regression model for each of the data sets separately. Define

RSS₁ = residual sum of squares for the first data set
RSS₂ = residual sum of squares for the second data set

RSS₁/σ² has a χ²-distribution with d.f. (n₁ - k - 1)

RSS₂/σ² has a χ²-distribution with d.f. (n₂ - k - 1)

Since the two data sets are independent, (RSS₁ + RSS₂)/σ² has a χ²-distribution with d.f. (n₁ + n₂ - 2k - 2). We will denote (RSS₁ + RSS₂) by URSS. The restricted residual sum of squares RRSS is obtained from the regression with the pooled data. (This imposes the restriction that the parameters are the same for the two data sets.) Thus RRSS/σ² has a χ²-distribution with d.f. = (n₁ + n₂) - k - 1.
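For concreteness, here is a minimal sketch of this analysis-of-variance (Chow) test, assuming statsmodels and scipy with simulated data; the F-statistic is formed in the usual way from the two residual sums of squares, with k + 1 restrictions in the numerator and n₁ + n₂ - 2k - 2 degrees of freedom in the denominator, matching the χ² degrees of freedom derived above:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
k, n1, n2 = 2, 40, 35
X1, X2 = rng.normal(size=(n1, k)), rng.normal(size=(n2, k))
y1 = 1.0 + X1 @ np.array([0.5, -0.2]) + rng.normal(size=n1)
y2 = 2.0 + X2 @ np.array([0.9, -0.2]) + rng.normal(size=n2)  # parameters shift

# Unrestricted: separate regressions for each data set.
rss1 = sm.OLS(y1, sm.add_constant(X1)).fit().ssr
rss2 = sm.OLS(y2, sm.add_constant(X2)).fit().ssr
urss = rss1 + rss2

# Restricted: one regression on the pooled data.
rrss = sm.OLS(np.concatenate([y1, y2]),
              sm.add_constant(np.vstack([X1, X2]))).fit().ssr

df1, df2 = k + 1, n1 + n2 - 2 * k - 2
F = ((rrss - urss) / df1) / (urss / df2)
print(f"F = {F:.2f}, p-value = {stats.f.sf(F, df1, df2):.4f}")
```

A large F (small p-value) rejects the stability hypothesis H₀; that is, the pooled equation fits significantly worse than the two separate equations.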


