
Although it is customary to use the 5% probability level for rejection of the suggested hypothesis, there is nothing sacred about this number. The theory of significance tests with the commonly used significance levels of 0.05 and 0.01 owes its origin to the famous British statistician Sir R. A. Fisher (1890-1962). He is considered the father of modern statistical methods, and the levels 0.05 and 0.01 suggested by him have been adopted universally.

Another point to note is that the hypothesis being tested (in this case β = 1) is called the null hypothesis. Again the terminology is misleading, and it owes its origin to the fact that the hypotheses initially tested were that some parameters were zero. Thus a hypothesis β = 0 can naturally be called a null hypothesis, but not a hypothesis β = 1. In any case, for the present we will stick to the standard terminology, call the hypothesis tested the null hypothesis, and use the standard significance levels of 0.05 and 0.01.

Finally, it should be noted that there is a correspondence between the confidence intervals derived earlier and tests of hypotheses. For instance, the 95% confidence interval we derived earlier for β is (0.16 < β < 1.34). Any hypothesis β = β₀, where β₀ is in this interval, will not be rejected at the 5% level by a two-sided test. For instance, the hypothesis β = 1.27 will not be rejected, but the hypotheses β = 1.35 and β = 0.10 will be. For one-sided tests we consider one-sided confidence intervals.
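This correspondence can be checked numerically. The sketch below uses the chapter's estimates (β̂ = 0.75, standard error 0.256); the critical value 2.306 is the tabulated two-sided 5% t-value for 8 degrees of freedom, which reproduces the interval (0.16, 1.34) and is assumed here.

```python
# Sketch of the confidence-interval / two-sided-test correspondence,
# using the chapter's estimates (beta_hat = 0.75, SE = 0.256).

def t_statistic(beta_hat, beta0, se):
    """t-ratio for testing H0: beta = beta0."""
    return (beta_hat - beta0) / se

def reject_two_sided(beta_hat, beta0, se, t_crit):
    """Reject H0 at the level implied by t_crit iff |t| exceeds t_crit."""
    return abs(t_statistic(beta_hat, beta0, se)) > t_crit

def conf_interval(beta_hat, se, t_crit):
    """The interval of all beta0 values the two-sided test does NOT reject."""
    return (beta_hat - t_crit * se, beta_hat + t_crit * se)

beta_hat, se = 0.75, 0.256
t_crit = 2.306  # tabulated t-value, 5% two-sided, 8 d.f. (assumed)

lo, hi = conf_interval(beta_hat, se, t_crit)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")               # -> (0.16, 1.34)
print(reject_two_sided(beta_hat, 1.27, se, t_crit))  # False: inside the CI
print(reject_two_sided(beta_hat, 1.35, se, t_crit))  # True: outside the CI
print(reject_two_sided(beta_hat, 0.10, se, t_crit))  # True: outside the CI
```

Note that the test and the interval use the same t-ratio, which is why rejection at the 5% level coincides exactly with falling outside the 95% interval.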

It is also customary to term regression coefficients "significant" or "not significant" depending on their t-ratios, and to attach asterisks to them if they are significant. This procedure should be avoided. For instance, in our illustrative example the regression equation is sometimes presented as

ŷ = 3.6 + 0.75*x
    (0.09)  (0.256)

The * on the slope coefficient indicates that it is "significant" at the 5% level. However, this statement means that "it is significantly different from zero," and this statement is meaningful only if the hypothesis being tested is β = 0. Such a hypothesis would not be meaningful in many cases. For instance, in our example, if y is output and x is labor-hours of work, a hypothesis β = 0 does not make any sense. Similarly, if y is a posttraining score and x is a pretraining score, a hypothesis β = 0 would imply that the pretraining score has no effect on the posttraining score, and no one would be interested in testing such an extreme hypothesis.

Example of Comparing Test Scores from the GRE and GMAT Tests*

In the College of Business Administration at the University of Florida, two different tests are used to measure aptitude for graduate work. The economics department relies on GRE scores and the other graduate departments rely on GMAT scores. To allocate graduate assistantships and fellowships, it is important to be able to compare the scores on these two tests. A rough rule of thumb that was suggested was the following:

*I would like to thank my colleague Larry Kenny for this example and the computations.

GRE score = 2(GMAT score) + 100

A question arose as to the adequacy of this rule.

To answer this question data were obtained on 262 current and recent University of Florida students who had taken both tests. The GRE and GMAT scores were highly correlated. The correlation coefficients were 0.71 for U.S. students and 0.80 for foreign students.

If we have GRE scores on some students and GMAT scores on some others, we can convert all scores to GRE scores, in which case we use the regression of GRE score on GMAT score. Alternatively, we can convert all scores to GMAT scores, in which case we use the regression of GMAT score on GRE score. These two regressions were as follows (figures in parentheses are standard errors):

Regression of GRE on GMAT:

    All students:      GRE = 333 + 1.470 GMAT    (0.087)
    U.S. students:     GRE = 336 + 1.458 GMAT    (0.101)
    Foreign students:  GRE = 284 + 1.606 GMAT    (0.183)

Regression of GMAT on GRE:

    All students:      GMAT = 125 + 0.363 GRE    (0.021)
    U.S. students:     GMAT = 151 + 0.345 GRE    (0.024)
    Foreign students:  GMAT =  60 + 0.400 GRE    (0.046)

Although we can get predictions for the U.S. students and foreign students separately, and for the GRE given GMAT and GMAT given GRE, we shall present only one set of predictions and compare this with the present rule. The predictions from the regression of GRE on GMAT (for all students) are as follows:

    GMAT    Predicted GRE    Current Rule*
     450         995             1000
     500        1068             1100
     550        1142             1200
     600        1215             1300
     650        1289             1400
     700        1362             1500

*GRE = 2(GMAT) + 100.
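As a sketch, the two conversions can be compared directly using the all-students regression reported above (GRE = 333 + 1.470 GMAT); the grid of GMAT values is illustrative.

```python
# Comparing the estimated conversion with the rule of thumb, using the
# all-students regression of GRE on GMAT reported above.

def predicted_gre(gmat):
    """Conversion implied by the estimated regression: GRE = 333 + 1.470 GMAT."""
    return 333 + 1.470 * gmat

def rule_of_thumb_gre(gmat):
    """The rough rule in use: GRE = 2(GMAT) + 100."""
    return 2 * gmat + 100

for gmat in (500, 550, 600, 650, 700):
    reg, rule = predicted_gre(gmat), rule_of_thumb_gre(gmat)
    print(f"GMAT {gmat}: regression {reg:.1f}, rule {rule}, gap {rule - reg:.1f}")
```

The gap (rule minus regression prediction) is positive and widens as the GMAT score rises, which is the overprediction discussed below.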



Thus, except for scores at the low end (which are not particularly relevant because students with low scores are not admitted), the current rule overpredicts the GRE based on the GMAT score. Thus, the current conversion biases decisions in favor of those taking the GMAT test as compared with those taking the GRE test. Since the students entering the economics department take the GRE test, the current rule results in decisions unfavorable to the economics department.

Regression with No Constant Term

Sometimes the regression equation is estimated with the constant term excluded. This is called regression through the origin. It arises either because economic theory suggests an equation with no constant term or because some transformations of the variables leave us with an equation with no constant term. (These are discussed in Section 5.4 of Chapter 5.)

In this case the normal equations and other formulas will be the same as before, except that there will be no "mean corrections." That is, S_xx = Σx_i², S_xy = Σx_i y_i, and S_yy = Σy_i², etc. Consider the data in Table 3.2. We have

    S_xx = 668,   S_xy = 789,   S_yy = 952

    β̂ = S_xy/S_xx = 789/668 = 1.181
    RSS = S_yy − β̂S_xy = 952 − (1.181)(789) = 20.08
    σ̂² = RSS/(n − 1) = 20.08/9 = 2.23
    Var(β̂) = σ̂²/S_xx = 2.23/668 = 0.00334        SE(β̂) = 0.058
    r² = S²_xy/(S_xx S_yy) = 0.979

Thus, the regression equation is

    ŷ = 1.181x
        (0.058)

Compared with the regression with the constant term, the results look better! The t-ratio for β̂ has increased from 3 to 20. The r² has increased dramatically from 0.52 to 0.98. Students often come to me with high r²'s (sometimes 0.9999) when they fit a regression with no constant term, claiming that they got better fits. But this is spurious; one has to look at which equation predicts better.

Some computer regression programs allow the option of not including a constant term but do not give the correct r². Note that 1 − r² = (residual sum of squares)/S_yy. If the residual sum of squares (RSS) is calculated from a regression with no constant term, but S_yy is calculated with the mean correction, then we can even get a negative r². Even if we do not get a negative r², we can get a very low r². For instance, in the example with data from Table 3.2, we have RSS = 20.08 from the regression with no constant term and S_yy (with mean correction) = 30.4. Hence, r² = 0.34 when calculated (wrongly) this way.
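These computations can be reproduced from the reported sums alone. The sketch below assumes n = 10 observations (consistent with the 9 degrees of freedom used in dividing the residual sum of squares) and takes the mean-corrected S_yy = 30.4 from the text.

```python
# No-constant-term regression computed from the raw (uncorrected) sums
# reported for Table 3.2; n = 10 is an assumption consistent with the text.
import math

S_xx, S_xy, S_yy = 668.0, 789.0, 952.0
n = 10

beta_hat = S_xy / S_xx                 # 789/668 = 1.181
rss = S_yy - beta_hat * S_xy           # residual sum of squares
sigma2 = rss / (n - 1)                 # only one parameter is estimated
se_beta = math.sqrt(sigma2 / S_xx)     # standard error of beta_hat
r2 = S_xy**2 / (S_xx * S_yy)           # correct r^2 (no mean correction)

print(f"beta_hat = {beta_hat:.3f}, SE = {se_beta:.3f}, r^2 = {r2:.3f}")

# The pitfall: mixing the no-constant RSS with the mean-corrected S_yy.
S_yy_corrected = 30.4                  # value reported in the text
r2_wrong = 1 - rss / S_yy_corrected
print(f"wrongly computed r^2 = {r2_wrong:.2f}")   # about 0.34
```

Running this reproduces β̂ = 1.181, SE = 0.058, and r² = 0.979, and shows how the mismatched r² of about 0.34 arises.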


