
$Y = \alpha Z + \beta X + u$ (5.4)

and we find that the residuals are heteroskedastic with variance roughly proportional to $Z^2$. Then we should not hesitate to divide (5.4) throughout by $Z$ and estimate the regression equation

$\frac{Y}{Z} = \alpha + \beta \frac{X}{Z} + v$ (5.5)

where $v = u/Z$ has a constant variance $\sigma^2$, since $\operatorname{Var}(u) = \sigma^2 Z^2$ implies $\operatorname{Var}(u/Z) = \sigma^2$. The estimates of $\alpha$, $\beta$, and $\sigma^2$ should be obtained from (5.5) and not from (5.4). Whether the correlation between $Y/Z$ and $X/Z$ is higher or lower than the correlation between $Y$ and $X$ is irrelevant. The important point to note is that we cannot argue whether (5.4) or (5.5) is a better equation to consider by looking at correlations. As long as we do not base our inferences on correlations, there is nothing wrong with deflation in this case. It should also be noted that if (5.4) is not homogeneous (i.e., it involves a constant term $\gamma$), we end up with an equation of the form

$\frac{Y}{Z} = \gamma \frac{1}{Z} + \alpha + \beta \frac{X}{Z} + v$

which is different from (5.5). The equation will also be different if the variance of $u$ is proportional not to $Z^2$ but to $Z$ or some other function of $Z$.

In actual practice, deflation may increase or decrease the resulting correlation. The algebra is somewhat tedious, but with some simplifying assumptions Kuh and Meyer derive the conditions under which the correlation between $X/Z$ and $Y/Z$ is in fact less than that between $X$ and $Y$.

In summary, often in econometric work deflated or ratio variables are used to solve the heteroskedasticity problem. Deflation can sometimes be justified on pure economic grounds, as in the case of the use of "real" quantities and relative prices. In this case all the inferences from the estimated equation will be based on the equation in the deflated variables. However, if deflation is used to solve the heteroskedasticity problem, any inferences we make have to be based on the original equation, not the equation in the deflated variables. In any case, deflation may increase or decrease the resulting correlations, but this is beside the point. Since the correlations are not comparable anyway, one should not draw any inferences from them.

Illustrative Example: The Density Gradient Model

In Table 5.5 we present data on

y = population density

x = distance from the central business district

for 39 census tracts in the Baltimore area in 1970. It has been suggested (this is called the "density gradient model") that population density follows the relationship



Table 5.5 Data on Population Density in Different Census Tracts in the Baltimore Area in 1970ᵃ

Observation         y          x      Observation         y          x
     1          18,640.0      1.002        21          5,485.8      6.748
     2          38,275.0      1.403        22          3,416.5      6.882
     3           2,450.3      2.004        23          8,194.7      6.948
     4          21,969.0      2.138        24          5,091.9      6.948
     5           9,573.7      2.205        25          1,183.8      7.082
     6          13,751.0      3.608        26          4,157.9      7.416
     7          38,947.0      3.675        27          2,158.3      7.483
     8          17,921.0      4.009        28         12,428.0      7.617
     9           5,050.7      4.276        29          6,788.5      7.750
    10           4,519.0      4.410        30          3,277.4      7.750
    11           6,781.1      4.543        31          3,258.2      7.951
    12           8,246.2      4.810        32          5,491.3      8.084
    13           5,166.4      4.944        33            865.02    11.250
    14           7,762.4      5.211        34            340.69    13.250
    15          11,081.0      5.345        35            507.03    15.500
    16           7,188.0      5.679        36            323.67    18.000
    17          13,753.0      5.813        37            108.36    19.000
    18           7,492.4      5.813        38            805.66    23.000
    19           3,620.9      5.879        39            156.84    26.250
    20           6,390.6      6.080

ᵃ y, density of population in the census tract; x, distance of the census tract from the central business district.

Source: I would like to thank Kajal Lahiri for providing me with these data. These data formed the basis of the study in K. Lahiri and R. Numrich, "An Econometric Study of the Dynamics of Urban Spatial Structure." Journal of Urban Economics, 1983, pp. 55-79.

$y = A e^{-\beta x}, \qquad \beta > 0$

where A is the density of the central business district. The basic hypothesis is that as you move away from the central business district population density drops off.
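The role of the two parameters can be written out as a short derivation. This is only a restatement of the exponential relationship $y = A e^{-\beta x}$ with $\beta > 0$; the notation $y(x)$ for density at distance $x$ is introduced here for convenience and is not in the original text.

```latex
\begin{align*}
y(x) &= A e^{-\beta x}, \qquad \beta > 0, \\
y(0) &= A, \\
\frac{d \log y(x)}{dx} &= -\beta, \\
\frac{y(x+1)}{y(x)} &= e^{-\beta} < 1 .
\end{align*}
```

Read this way, $A$ is the density at distance zero, and each additional unit of distance lowers density by the constant proportion $1 - e^{-\beta}$, which is why the logarithm of density is linear in distance.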

For estimation purposes we take logs and write

$\log y = \log A - \beta x$

Adding an error term, we estimate the model

$y_i^* = \alpha - \beta x_i + u_i$

where $y_i^* = \log y_i$ and $\alpha = \log A$. Estimation of this equation by OLS gave the following results:

$\hat{y}_i^* = \underset{(54.7)}{10.093} - \underset{(-12.28)}{0.2395}\, x_i, \qquad R^2 = 0.803$
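A minimal sketch of this OLS fit, assuming NumPy and natural logarithms, with the 39 observations typed in from Table 5.5 as it appears in this text. The helper name `ols_log_density` and the array `y_alt` are hypothetical, introduced only for illustration. The sketch also checks the single-regressor identity $R^2 = t^2/(t^2 + \text{d.f.})$ for the slope.

```python
import numpy as np

# Table 5.5 as transcribed in this text: density y and distance x, 39 tracts.
y = np.array([
    18640.0, 38275.0, 2450.3, 21969.0, 9573.7, 13751.0, 38947.0, 17921.0,
    5050.7, 4519.0, 6781.1, 8246.2, 5166.4, 7762.4, 11081.0, 7188.0,
    13753.0, 7492.4, 3620.9, 6390.6, 5485.8, 3416.5, 8194.7, 5091.9,
    1183.8, 4157.9, 2158.3, 12428.0, 6788.5, 3277.4, 3258.2, 5491.3,
    865.02, 340.69, 507.03, 323.67, 108.36, 805.66, 156.84,
])
x = np.array([
    1.002, 1.403, 2.004, 2.138, 2.205, 3.608, 3.675, 4.009,
    4.276, 4.410, 4.543, 4.810, 4.944, 5.211, 5.345, 5.679,
    5.813, 5.813, 5.879, 6.080, 6.748, 6.882, 6.948, 6.948,
    7.082, 7.416, 7.483, 7.617, 7.750, 7.750, 7.951, 8.084,
    11.250, 13.250, 15.500, 18.000, 19.000, 23.000, 26.250,
])


def ols_log_density(y, x):
    """OLS of log(y) on a constant and x (hypothetical helper).

    Returns (coefficients, t-ratios, R^2, degrees of freedom).
    """
    ystar = np.log(y)                      # natural log of density
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, ystar, rcond=None)
    resid = ystar - X @ coef
    df = len(ystar) - X.shape[1]
    cov = (resid @ resid / df) * np.linalg.inv(X.T @ X)
    t = coef / np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((ystar - ystar.mean()) ** 2)
    return coef, t, r2, df


coef, t, r2, df = ols_log_density(y, x)
print("intercept, slope:", coef)
print("t-ratios:", t)
print("R^2:", r2, " t^2/(t^2 + d.f.):", t[1] ** 2 / (t[1] ** 2 + df))

# ASSUMPTION (not stated in the source): read the density at x = 23.000
# as 80.566 instead of 805.66, to see how sensitive the fit is to that entry.
y_alt = y.copy()
y_alt[x == 23.0] = 80.566
coef_alt, t_alt, r2_alt, _ = ols_log_density(y_alt, x)
print("with the hypothetical entry:", coef_alt, t_alt, r2_alt)
```

On the table exactly as transcribed here, the fit gives a slope of roughly -0.21 and an $R^2$ of roughly 0.73, not the reported -0.2395 and 0.803. With the hypothetical entry 80.566 the estimates come out very close to the reported ones. This suggests, but does not establish, a misplaced decimal point in that table entry.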



(Figures in parentheses are the t-values, not standard errors.) The t-values are very high, and both coefficients are significantly different from zero (at a significance level of less than 1%). The coefficient of $x$ is negative, as expected. With cross-sectional data like these we expect heteroskedasticity, and this could result in an underestimation of the standard errors (and thus an overestimation of the t-ratios). To check whether there is heteroskedasticity, we have to analyze the estimated residuals $\hat{u}_i$. A plot of $\hat{u}_i^2$ against $x_i$ showed a positive relationship, and hence Glejser's tests were applied. Defining $|\hat{u}_i|$ by $z_i$, the following equations were estimated:

$z_i = \gamma x_i + v_i$

$z_i = \gamma \sqrt{x_i} + v_i$

$z_i = \gamma \frac{1}{x_i} + v_i$

$z_i = \gamma \frac{1}{\sqrt{x_i}} + v_i$

We choose the specification that gives the highest $R^2$ [or equivalently the highest t-value, since $R^2 = t^2/(t^2 + \text{d.f.})$ in the case of only one regressor]. The estimated regressions, with t-values in parentheses, were

$\hat{z}_i = \underset{(5.06)}{0.0445}\, x_i$

$\hat{z}_i = \underset{(6.42)}{0.1733}\, \sqrt{x_i}$

$\hat{z}_i = \underset{(4.50)}{1.390}\, \frac{1}{x_i}$

$\hat{z}_i = \underset{(6.42)}{1.038}\, \frac{1}{\sqrt{x_i}}$

All the t-statistics are significant, indicating the presence of heteroskedasticity. Based on the highest t-ratio, we chose the second specification (although the fourth specification is equally valid). Deflating throughout by $\sqrt{x_i}$ gives the regression equation to be estimated as

$\frac{y_i^*}{\sqrt{x_i}} = \alpha \frac{1}{\sqrt{x_i}} - \beta \sqrt{x_i} + \text{error}$

The estimates were

$\hat{\alpha} = \underset{(47.87)}{9.932}$ and $-\hat{\beta} = \underset{(-15.10)}{-0.2258}$

(Figures in parentheses are t-ratios.) The estimated coefficient of distance, $-\hat{\beta}$, is negative and highly significant. The estimated density of the central business district is given by $\hat{A} = \exp(\hat{\alpha}) = \exp(9.932) \approx 20{,}578$.
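The sequence used in this example (OLS residuals, Glejser regressions of the absolute residuals on functions of distance, then deflation by $\sqrt{x_i}$) can be sketched as follows. This is a rough illustration assuming NumPy and natural logarithms, with the data typed in from Table 5.5 as transcribed here. The helper `ols` and the variable names are hypothetical. Because the transcribed table may not match the data behind the reported figures exactly, the printed numbers should not be expected to reproduce them digit for digit.

```python
import numpy as np

# Table 5.5 as transcribed in this text: density y and distance x, 39 tracts.
y = np.array([
    18640.0, 38275.0, 2450.3, 21969.0, 9573.7, 13751.0, 38947.0, 17921.0,
    5050.7, 4519.0, 6781.1, 8246.2, 5166.4, 7762.4, 11081.0, 7188.0,
    13753.0, 7492.4, 3620.9, 6390.6, 5485.8, 3416.5, 8194.7, 5091.9,
    1183.8, 4157.9, 2158.3, 12428.0, 6788.5, 3277.4, 3258.2, 5491.3,
    865.02, 340.69, 507.03, 323.67, 108.36, 805.66, 156.84,
])
x = np.array([
    1.002, 1.403, 2.004, 2.138, 2.205, 3.608, 3.675, 4.009,
    4.276, 4.410, 4.543, 4.810, 4.944, 5.211, 5.345, 5.679,
    5.813, 5.813, 5.879, 6.080, 6.748, 6.882, 6.948, 6.948,
    7.082, 7.416, 7.483, 7.617, 7.750, 7.750, 7.951, 8.084,
    11.250, 13.250, 15.500, 18.000, 19.000, 23.000, 26.250,
])
ystar = np.log(y)


def ols(X, yv):
    """Least squares of yv on the columns of X (no constant is added).

    Returns (coefficients, t-ratios, degrees of freedom).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    coef, *_ = np.linalg.lstsq(X, yv, rcond=None)
    resid = yv - X @ coef
    df = X.shape[0] - X.shape[1]
    cov = (resid @ resid / df) * np.linalg.inv(X.T @ X)
    return coef, coef / np.sqrt(np.diag(cov)), df


# Step 1: OLS of log density on a constant and distance; keep |residuals|.
X = np.column_stack([np.ones_like(x), x])
b_ols, _, _ = ols(X, ystar)
z = np.abs(ystar - X @ b_ols)

# Step 2: Glejser regressions of z on one function of x, without intercept.
specs = {"x": x, "sqrt(x)": np.sqrt(x), "1/x": 1 / x, "1/sqrt(x)": 1 / np.sqrt(x)}
glejser = {}
for name, reg in specs.items():
    g, t, df = ols(reg, z)
    # With one regressor and no constant, (uncentered) R^2 = t^2 / (t^2 + d.f.).
    glejser[name] = (g[0], t[0], t[0] ** 2 / (t[0] ** 2 + df))
    print(f"z on {name:9s}: coef={g[0]:.4f}  t={t[0]:.2f}  R^2={glejser[name][2]:.3f}")

# Step 3: deflate the log-linear equation by sqrt(x) and re-estimate.
w = np.sqrt(x)
Xd = np.column_stack([1 / w, w])   # columns multiply alpha and the slope on x
b_wls, t_wls, _ = ols(Xd, ystar / w)
print("deflated estimates (alpha, slope on x):", b_wls, " t-ratios:", t_wls)
print("implied central density exp(alpha):", np.exp(b_wls[0]))
```

Deflating by $\sqrt{x_i}$ is the same as weighted least squares with weights $1/x_i$, which is the appropriate correction if the error variance is proportional to $x_i$, as the second Glejser specification suggests.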

(6.42)


