the distribution of $t$ under $H_0$, and $\beta$ would be the right-hand tail area of the distribution of $t$ under $H_1$.

The usual procedure that is suggested (which is called the Neyman-Pearson approach) is to fix $\alpha$ at a certain level and minimize $\beta$, that is, choose the test statistic that has the most power. In practice, the tests we use, such as the $t$, $\chi^2$, and $F$ tests, have been shown to be the most powerful tests.
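To make the trade-off concrete, here is a minimal sketch, with invented numbers, of the Neyman-Pearson recipe for a one-sided test of a normal mean with known standard deviation: fix $\alpha$, derive the critical value, and watch the power $1 - \beta$ grow with the sample size (the effect size 0.5, the $\sigma$ of 2.0, and the sample sizes are all assumptions for illustration):

```python
# Sketch of the Neyman-Pearson idea for H0: mu = 0 against H1: mu = mu1 > 0,
# with known sigma and sample size n. All numbers are illustrative assumptions.
from statistics import NormalDist

def power_of_z_test(mu1: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Fix alpha, then compute the power (1 - beta) of the one-sided z-test."""
    z = NormalDist()                 # standard normal
    c = z.inv_cdf(1 - alpha)         # critical value: reject if sqrt(n)*ybar/sigma > c
    shift = mu1 * n**0.5 / sigma     # mean of the test statistic under H1
    beta = z.cdf(c - shift)          # P(do not reject H0 | H1 true)
    return 1 - beta

# With alpha fixed at 0.05, beta falls (power rises) as n grows.
for n in (10, 50, 200):
    print(n, round(power_of_z_test(mu1=0.5, sigma=2.0, n=n), 3))
```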

9. There are some statisticians who disagree with the ideas of the Neyman-Pearson theory. For instance, Kalbfleisch and Sprott* argue that it is a gross simplification to regard a test of significance as a decision rule for accepting or rejecting a hypothesis. They argue that such decisions are made on more than just experimental evidence. Thus the purpose of a significance test is just to quantify the strength of evidence in the data against a hypothesis on a (0, 1) scale, not to suggest an accept-reject rule (see item 7). There are also some statisticians who think that the significance level used should depend on the sample size.

The problem with a preassigned significance level is that if the sample size is large enough, we can reject every null hypothesis. This is often the experience of those who use large cross-sectional data sets with thousands of observations: almost every coefficient is significant at the 5% level. Lindley* argues that for large samples one should use lower significance levels and for smaller samples higher significance levels. Leamer* derives significance levels for regression models with different sample sizes, which turn out to be much higher than 5% for small samples and much lower than 5% for large samples.
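Lindley's point is easy to reproduce numerically. The sketch below, with made-up numbers, holds a tiny true deviation from the null fixed and shows that the usual 5% two-sided $z$-test rejects it once $n$ is large enough:

```python
# Hold the true deviation from H0 fixed and negligible; the 5% test still
# rejects once n is large. The deviation and sigma are invented for illustration.
from statistics import NormalDist

true_deviation = 0.01   # economically negligible departure from H0
sigma = 1.0
crit = NormalDist().inv_cdf(0.975)   # two-sided 5% critical value, about 1.96

for n in (100, 10_000, 1_000_000):
    z_stat = true_deviation * n**0.5 / sigma   # z-statistic when ybar equals the true mean
    verdict = "reject" if abs(z_stat) > crit else "do not reject"
    print(f"n = {n:>9,}: z = {z_stat:6.2f} -> {verdict} H0 at 5%")
```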

Very often the purpose of a test is to simplify an estimation procedure. This is the "pretesting" problem, where the test is a prelude to further estimation.* In the case of such pretests it has been found that the significance levels to be used should be much higher than the conventional 5% (sometimes 25 to 50% and even 99%). The important thing to note is that tests of significance serve several purposes, and one should not use a uniform 5% significance level in all problems.

*J. G. Kalbfleisch and D. A. Sprott, "On Tests of Significance," in W. L. Harper and C. A. Hooker (eds.), Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, Vol. 2 (Boston: D. Reidel, 1976), pp. 259-272.
*D. V. Lindley, "A Statistical Paradox," Biometrika, 1957, pp. 187-192.
*Leamer, Specification Searches.
*A good discussion of this problem is in the series of papers "For What Use Are Tests of Hypotheses and Tests of Significance," Communications in Statistics, Series A: Theory and Methods, Vol. 5, No. 8, 1976.

2.10 Relationship Between Confidence Interval Procedures and Tests of Hypotheses

There is a close relationship between confidence intervals and tests of hypotheses. Suppose that we want to test a hypothesis at the 5% significance level. Then we can construct a 95% confidence interval for the parameter under consideration and see whether the hypothesized value lies in the interval. If it does, we do not reject the hypothesis; if it does not, we reject the hypothesis. This relationship holds good for tests of parameter values. There are other tests, such as goodness-of-fit tests and tests of independence in contingency tables, for which there is no confidence interval counterpart.



As an example, consider a sample of 20 independent observations from a normal distribution with mean $\mu$ and variance $\sigma^2$. Suppose that the sample mean is $\bar{y} = 5$ and the sample variance is $S^2 = 9$. We saw from equation (2.5) that the 95% confidence interval was (3.6, 6.4). Suppose that we consider the problem of testing the hypothesis

$$H_0: \mu = 7 \quad \text{against} \quad H_1: \mu \neq 7$$

If we use a 5% level of significance, we should reject $H_0$ if

$$\left| \frac{\sqrt{n}(\bar{y} - 7)}{S} \right| > 2.093$$

since 2.093 is the point from the $t$-tables for 19 degrees of freedom such that $\operatorname{Prob}(-2.093 < t < 2.093) = 0.95$, or $\operatorname{Prob}(|t| > 2.093) = 0.05$. In our example $\sqrt{n}(\bar{y} - 7)/S = \sqrt{20}(5 - 7)/3 \approx -3$, and hence we reject $H_0$. In fact, we would reject $H_0$ at the 5% significance level whenever $H_0$ specifies $\mu$ to be a value outside the interval (3.6, 6.4).

For a one-tailed test we have to consider the corresponding one-sided confidence interval. If $H_0: \mu = 7$ and $H_1: \mu < 7$, then we reject $H_0$ for low values of $t = \sqrt{n}(\bar{y} - 7)/S$. From the $t$-tables with 19 degrees of freedom, we find that $\operatorname{Prob}(t < -1.73) = 0.05$. Hence we reject $H_0$ if the observed $t < -1.73$. In our example $t \approx -3$, and hence we reject $H_0$. The corresponding 95% one-sided confidence interval is given by

$$\operatorname{Prob}\left( -1.73 < \frac{\sqrt{n}(\bar{y} - \mu)}{S} \right) = 0.95$$

that is, $\mu < \bar{y} + 1.73 S/\sqrt{n}$, which gives (on substituting $n = 20$, $\bar{y} = 5$, and $S = 3$) the confidence interval $(-\infty, 6.16)$.
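The numbers in this example can be checked by machine. The sketch below assumes scipy is available for Student-$t$ quantiles; it reproduces the two-sided test, the interval (3.6, 6.4), and the one-sided test (the last digit may differ slightly from the rounded $t$-table values quoted above):

```python
# Numerical check of the example above (n = 20, ybar = 5, S = 3),
# assuming scipy is available for t-distribution quantiles.
import math
from scipy import stats

n, ybar, S = 20, 5.0, 3.0
df = n - 1

# Two-sided test of H0: mu = 7 at the 5% level.
t_stat = math.sqrt(n) * (ybar - 7) / S          # about -2.98
t_crit = stats.t.ppf(0.975, df)                 # about 2.093
print(t_stat, t_crit, abs(t_stat) > t_crit)     # -> True: reject H0

# Two-sided 95% confidence interval, about (3.6, 6.4).
half = t_crit * S / math.sqrt(n)
print(ybar - half, ybar + half)

# One-sided test against H1: mu < 7 and the interval (-inf, ~6.16).
t_crit_1 = stats.t.ppf(0.05, df)                # about -1.73
print(t_stat < t_crit_1)                        # -> True: reject H0
print(ybar - t_crit_1 * S / math.sqrt(n))       # upper limit, about 6.16
```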

Summary

1. If two random variables are uncorrelated, this does not necessarily imply that they are independent. A simple example is given in Section 2.3.

2. The normal distribution and the related distributions $t$, $\chi^2$, and $F$ form the basis of all statistical inference in the book. Although many economic data do not necessarily satisfy the assumption of normality, we can make some transformations that would produce approximate normality. For instance, we consider log wages rather than wages.


3. The advantage of the normal distribution is that any linear function of normally distributed variables is normally distributed. For the $\chi^2$-distribution a weaker property holds: the sum of independent $\chi^2$ variables has a $\chi^2$-distribution. These properties are very useful in deriving probability distributions of sample statistics. The $t$, $\chi^2$, and $F$ distributions are explained in Section 2.4.
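As a quick check of the additivity property in item 3, the following simulation (the choice $k = 5$ and the replication count are arbitrary) verifies that a sum of $k$ squared standard normals has mean close to $k$ and variance close to $2k$, as a $\chi^2_k$ variable should:

```python
# Simulate sums of k squared standard normals and compare the empirical
# mean and variance with the chi-square(k) values k and 2k.
import random

random.seed(0)
k, reps = 5, 100_000

draws = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(reps)]
mean = sum(draws) / reps
var = sum((d - mean) ** 2 for d in draws) / reps
print(mean, var)   # close to k = 5 and 2k = 10
```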

4. A function of the sample observations is called a statistic (e.g., sample mean, sample variance). The probability distribution of a statistic is called the sampling distribution of the statistic.
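A small simulation, with invented population values, makes the notion of a sampling distribution concrete: draw many samples of the same size from one population and look at how the sample mean varies across samples (its standard deviation should be near $\sigma/\sqrt{n}$):

```python
# Sampling distribution of the sample mean: population N(10, 4^2), n = 25.
# The population parameters and sample size are made-up illustration values.
import random

random.seed(1)
n, reps = 25, 20_000
means = [sum(random.gauss(10, 4) for _ in range(n)) / n for _ in range(reps)]

grand_mean = sum(means) / reps
sd = (sum((m - grand_mean) ** 2 for m in means) / reps) ** 0.5
print(grand_mean, sd)   # close to 10 and 4 / sqrt(25) = 0.8
```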

5. Classical statistical inference is based entirely on sampling distributions. By contrast, Bayesian inference makes use of sample information and prior information. We do not discuss Bayesian inference in this book, but the basic idea is explained in Section 2.5. Based on the prior distribution (which incorporates prior information) and the sample observations, we obtain what is known as the posterior distribution and all our inferences are based on this posterior distribution.
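As an illustration of the prior-to-posterior updating described here, the sketch below applies the standard conjugate-normal formulas for a normal mean with known variance; the prior parameters and the data summaries are assumptions chosen for illustration:

```python
# Prior N(m0, v0) for mu, data with known variance sigma2: the posterior of mu
# is normal, with precision equal to the sum of the prior and data precisions.
n, ybar, sigma2 = 20, 5.0, 9.0        # sample size, sample mean, known variance
m0, v0 = 0.0, 100.0                   # prior: mu ~ N(m0, v0)

post_var = 1 / (1 / v0 + n / sigma2)
post_mean = post_var * (m0 / v0 + n * ybar / sigma2)
print(post_mean, post_var)            # posterior mean is pulled toward the data
```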

6. Classical statistical inference is usually discussed under three headings: point estimation, interval estimation, and testing of hypotheses. Three desirable properties of point estimators (unbiasedness, efficiency, and consistency) are discussed in Section 2.6.

7. There are three commonly used methods of deriving point estimators:

(a) The method of moments.

(b) The method of least squares.

(c) The method of maximum likelihood.

These are discussed in Chapter 3.

8. Section 2.8 presents an introduction to interval estimation and Section 2.9 gives an introduction to hypothesis testing. The interrelationship between the two is explained in Section 2.10.

9. The main elements of hypothesis testing are discussed in detail in Section 2.9. Most important, arguments are presented as to why it is not desirable to use the usual 5% significance level in all problems.

Exercises

1. The COMPACT computer company has 4 applicants, all equally qualified, of whom 2 are male and 2 are female. The company has to choose 2 candidates, and it does not discriminate on the basis of sex. If it chooses the two candidates at random, what is the probability that the two candidates chosen will be of the same sex? A student answered this question as follows: There are three possible outcomes: 2 female, 2 male, and 1 female and 1 male. The number of favorable outcomes is two. Hence the probability is 2/3. Is this correct?

2. Your friend John says: "Let's toss coins. Each time I'll toss first and then you. If either coin comes up heads, I win. If neither, you win." You say, "There are three possibilities. You may get heads first and the game ends
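One way to check the student's reasoning in Exercise 1 is to enumerate the sample space directly; the sketch below lists all six equally likely pairs and shows that the three outcomes the student counts are not equally likely:

```python
# Brute-force check of Exercise 1: enumerate all equally likely choices of
# 2 candidates out of {M1, M2, F1, F2}. There are 6 pairs, of which 2 are
# same-sex, so the probability is 2/6 = 1/3, not 2/3.
from itertools import combinations

applicants = ["M1", "M2", "F1", "F2"]
pairs = list(combinations(applicants, 2))            # 6 equally likely pairs
same_sex = [p for p in pairs if p[0][0] == p[1][0]]  # (M1, M2) and (F1, F2)
print(len(same_sex), "/", len(pairs))                # 2 / 6
```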
