
There is a simple relationship between the least squares residuals $\hat{u}_i$ and the predicted residuals $\hat{u}_i^*$. This is

$$\hat{u}_i^* = \frac{\hat{u}_i}{1 - h_{ii}}$$

Since $V(\hat{u}_i) = \sigma^2 (1 - h_{ii})$, we have

$$V(\hat{u}_i^*) = \frac{\sigma^2 (1 - h_{ii})}{(1 - h_{ii})^2} = \frac{\sigma^2}{1 - h_{ii}}$$

Thus the predicted residuals are also heteroskedastic. It has also been proved that the predicted residuals have the same correlation structure as the least squares residuals.

Although the predicted residuals have properties similar to the least squares residuals, some statisticians have found them more useful than the least squares residuals in problems of choosing between different regression models. The criterion they use is that of the predicted residual sum of squares (PRESS), which is defined as

$$\text{PRESS} = \sum_i (\hat{u}_i^*)^2$$

The more common criterion used is (the rationale for this is discussed in Section 12.6)

$$\text{RSS} = \sum_i \hat{u}_i^2$$

Since $\hat{u}_i^* = \hat{u}_i/(1 - h_{ii})$, and from the definition of $h_{ii}$ the leverage is large for observations with remote values of the explanatory variables, PRESS weights the residuals at such observations heavily. Thus PRESS as a criterion for selection of regression models results in a preference for models that fit relatively well at remote values of the explanatory variables.
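As a concrete illustration, the following is a minimal NumPy sketch of this computation: fit the model once, form the leverages $h_{ii}$, divide each least squares residual by $1 - h_{ii}$, and sum the squares. The function name press_statistic and the assumption that X already contains a column of ones for the intercept are ours, not the text's.

```python
import numpy as np

def press_statistic(X, y):
    """PRESS from a single least squares fit, using u*_i = u_i / (1 - h_ii)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least squares fit
    resid = y - X @ beta                            # least squares residuals u_i
    # leverages h_ii = diagonal of X (X'X)^{-1} X'
    h = np.einsum("ij,ji->i", X, np.linalg.solve(X.T @ X, X.T))
    pred_resid = resid / (1.0 - h)                  # predicted residuals u*_i
    return np.sum(pred_resid ** 2)
```

Computed this way, PRESS requires no refitting, even though each predicted residual is conceptually the error made in predicting observation $i$ from a regression that omits it.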

The predicted residuals can be computed by using the dummy variable method described in Chapter 8 (see Section 8.6). We will describe it after discussing studentized residuals, because the two are closely related.

Studentized Residuals

The studentized residual is just the predicted residual divided by its standard error. Thus if we are using the dummy variable method to get the $i$th studentized residual, we do the following. Estimate the regression equation with an extra variable $D_i$ defined as

$$D_i = \begin{cases} 1 & \text{for the } i\text{th observation} \\ 0 & \text{for all others} \end{cases}$$

Then the estimate of the coefficient of $D_i$ is the predicted residual and the $t$-ratio for this coefficient is the studentized residual. Thus to generate predicted and studentized residuals, the regressions involve all the observations in the sample and we create dummy variables in succession for the first, second, third, . . . observations. Studentized residuals are usually used in the detection of outliers. Suppose that there is an outlier. In the case of the least squares residual, it might not be detected because we use the outlier as well in computing the regression equation. In the case of predicted and studentized residuals, we use all the other observations in computing the regression equation and try to use it in predicting this particular (outlying) observation. Thus there is a better chance of detecting outliers with this method.
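A minimal NumPy sketch of this dummy-variable computation (the function name is ours, and X is assumed to already contain the constant term):

```python
import numpy as np

def predicted_and_studentized(X, y, i):
    """Regress y on X plus a dummy for observation i; return the dummy's
    coefficient (predicted residual) and its t-ratio (studentized residual)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    d = np.zeros(n)
    d[i] = 1.0                                   # the dummy variable D_i
    Z = np.column_stack([X, d])                  # augmented regression
    beta = np.linalg.solve(Z.T @ Z, Z.T @ y)     # least squares estimates
    resid = y - Z @ beta
    s2 = resid @ resid / (n - k - 1)             # residual variance
    se = np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[-1, -1])
    return beta[-1], beta[-1] / se               # predicted, studentized
```

Running this for $i = 1, 2, \ldots, n$ in turn reproduces the succession of dummy-variable regressions described above.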

We will not be concerned with the proof here. Proofs can be found in Cook and Weisberg, Residuals.

"See, for instance, R. L. Anderson, D. M. Allen, and F. Cady, "Selection of Predictor Variables in Multiple Linear Regression," in T. A. Bancroft (ed.). Statistical Papers in Honor of George W. Snedecor (Ames, Iowa: Iowa State University Press, 1972). Also N. T. Quan, "The Prediction Sum of Squares as a General Measure for Regression Diagnostics," Journal of Business and Economic Statistics, Vol. 6, 1988, pp. 501-504.

The least squares and the predicted residuals both suffer from two problems. They are correlated and heteroskedastic even if the errors $u_i$ are uncorrelated and have the same variance. There have been several methods suggested for the construction of residuals that do not have these shortcomings. We discuss only two of them: the BLUS residuals suggested by Theil and recursive residuals.

BLUS Residuals

The BLUS (which stands for "best linear unbiased scalar") residuals are constructed from the least squares residuals so that they have the same properties as the errors $u_i$; that is, they have zero mean (they are unbiased), are uncorrelated, and have the same variance $\sigma^2$ as the errors $u_i$.

The computation of BLUS residuals is too complicated to be described here. However, we need not be concerned with this because it has been found that in tests for heteroskedasticity and autocorrelation, there is not much to be gained by using the BLUS residuals as compared with the least squares residuals. Hence we will not discuss the BLUS residuals further.

Recursive Residuals

Recursive residuals have been suggested by Brown, Durbin, and Evans for testing the stability of regression relationships. However, these residuals can be used for other problems as well, such as tests for autocorrelation and tests for heteroskedasticity. Phillips and Harvey use recursive residuals for testing serial correlation. Since the recursive residuals are serially uncorrelated and have a common variance $\sigma^2$, we can use the von Neumann ratio test (described in Chapter 6). There is no inconclusive region as with the Durbin-Watson test.
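For reference, the von Neumann ratio of a residual series $w_1, \ldots, w_n$ is conventionally defined as the mean square successive difference divided by the sample variance; this standard definition is assumed here rather than quoted from Chapter 6:

$$\frac{\delta^2}{s^2} = \frac{\sum_{t=2}^{n} (w_t - w_{t-1})^2 / (n - 1)}{\sum_{t=1}^{n} (w_t - \bar{w})^2 / n}$$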

"H. Theil, "The Analysis of Disturbances in Regression Analysis," Journal of the American Statistical Association, Vol. 60, 1965, pp. 1067-1079. -Cook and Weisberg, Residuals, p. 35.

"R. L. Brown, J. Durbin, and J. M. Evans, "Techniques for Testing the Constancy of Regression Relationships" (with discussion). Journal of the Royal Statistical Society, Series B, Vol. 37, 1975. pp. 149-163. This paper gives algorithms for the construction of recursive residuals. Farebrother gives algorithms for the construction of BLUS and recursive residuals. See R. W. Farebrother, "BLUS Residuals: Algorithm A5104" and "Recursive Residuals: A Remark on Algorithm A75: Basic Procedures for Large, Sparse or Weighted Least Squares Problems," Applied Statistics, Vol. 25, 1976, pp, 317-319 and 323-324.


We will now describe the construction of recursive residuals. First, we order the observations sequentially. This is not a problem with time-series data. The recursive residuals can be computed by forward recursion or backward recursion. We describe forward recursion only; backward recursion is similar. The idea behind recursive residuals is this: Let us say that we have $T$ observations and the regression equation is

$$y_t = \beta x_t + u_t \qquad t = 1, 2, \ldots, T$$

Let $\hat{\beta}_t$ be the estimate of $\beta$ from the first $t$ observations. Then we use this to predict the next observation $y_{t+1}$. The prediction is

$$\hat{y}_{t+1} = \hat{\beta}_t x_{t+1}$$

The prediction error is

$$e_{t+1} = y_{t+1} - \hat{y}_{t+1}$$

Let us denote $V(e_{t+1})$ by $d_{t+1}^2 \sigma^2$. (The variance of the prediction error in multiple regression has been discussed in Section 4.7.) Then the recursive residuals, which we will denote by $\tilde{u}_{t+1}$, are

$$\tilde{u}_{t+1} = \frac{e_{t+1}}{d_{t+1}}$$

Note that $\operatorname{var}(\tilde{u}_{t+1}) = \sigma^2$. Now we add one more observation, estimate $\beta$ using $(t + 1)$ observations, and use this to predict the next observation, $y_{t+2}$. Thus

$$\hat{y}_{t+2} = \hat{\beta}_{t+1} x_{t+2}$$

and if $e_{t+2} = y_{t+2} - \hat{y}_{t+2}$ and $V(e_{t+2}) = d_{t+2}^2 \sigma^2$, then

$$\tilde{u}_{t+2} = \frac{e_{t+2}}{d_{t+2}}$$

We continue this process until we get to the last observation. If we have $k$ explanatory variables, since we have to estimate $(k + 1)$ parameters (including the constant term) and obtain their variances, we need at least $(k + 2)$ observations. Thus the recursive residuals start with observation $(k + 3)$ and we have $T - k - 2$ recursive residuals. The recursive residuals have been shown to have the following properties:

1. They are uncorrelated.

2. They have a common variance $\sigma^2$.

G. D. A. Phillips and A. C. Harvey, "A Simple Test for Serial Correlation in Regression Analysis," Journal of the American Statistical Association, Vol. 69, 1974, pp. 935-939. Proofs are omitted. These properties are proved in Brown, Durbin, and Evans, "Techniques for Testing."
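To make the recursion concrete, here is a minimal NumPy sketch of the forward recursion just described (the function name is ours; X is assumed to include the constant term, and, following the text, the first fit uses $k + 2$ observations so that $T - k - 2$ recursive residuals are produced):

```python
import numpy as np

def recursive_residuals(X, y):
    """Forward-recursion recursive residuals for the model y = X beta + u."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, p = X.shape                        # p = k + 1 columns, including the constant
    out = []
    for t in range(p + 1, T):             # the first t observations are used for the fit
        Xt, yt = X[:t], y[:t]
        XtX_inv = np.linalg.inv(Xt.T @ Xt)
        beta_t = XtX_inv @ Xt.T @ yt      # estimate from the first t observations
        x_next = X[t]
        e = y[t] - x_next @ beta_t        # prediction error for the next observation
        d = np.sqrt(1.0 + x_next @ XtX_inv @ x_next)   # V(e) = d^2 * sigma^2
        out.append(e / d)                 # recursive residual, variance sigma^2
    return np.array(out)                  # T - k - 2 values
```

The resulting series is serially uncorrelated with common variance $\sigma^2$, so it can be fed directly into the von Neumann ratio test mentioned above.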


