
LR = L(θ̂₀)/L(θ̂₁),

where the conditioning statement on the sample data has been dropped for brevity of notation.21 The general form of the likelihood ratio test is based on the test statistic -2 ln LR, which is asymptotically chi-squared distributed with q degrees of freedom:

-2 ln LR = 2(ln L(θ̂₁) - ln L(θ̂₀)) ~ χ²(q).    (A.6.2)

This test has already been encountered in §A.2.5 in the context of testing a set of linear restrictions on the parameters of a linear regression model when the errors are normally distributed. If they exist, likelihood ratio tests have power properties that are better than those of the Wald and LM tests - in fact they are the uniformly most powerful tests of a given size.

21 For the two-sided alternative θ ≠ θ₀, use the MLE θ̂ instead of θ̂₁ in the likelihood ratio.
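To make the use of (A.6.2) concrete, the sketch below tests the single restriction μ = 0 for an i.i.d. normal sample by comparing the restricted and unrestricted maximized log-likelihoods. It assumes Python with the numpy and scipy packages, and the data, function and variable names are illustrative only, not part of the text.

    import numpy as np
    from scipy import stats

    def normal_loglik(x, mu, sigma2):
        # Log-likelihood of an i.i.d. normal sample with mean mu and variance sigma2
        n = len(x)
        return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * np.sum((x - mu) ** 2) / sigma2

    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.1, scale=1.0, size=500)      # simulated "returns"

    # Unrestricted MLEs: sample mean and the (biased) sample variance
    mu_hat = x.mean()
    sig2_hat = x.var()                                # ddof=0 is the ML estimator

    # Restricted MLE of the variance under H0: mu = 0
    sig2_0 = np.mean(x ** 2)

    lr_stat = 2 * (normal_loglik(x, mu_hat, sig2_hat) - normal_loglik(x, 0.0, sig2_0))
    p_value = stats.chi2.sf(lr_stat, df=1)            # one restriction, so q = 1
    print(f"-2 ln LR = {lr_stat:.3f}, p-value = {p_value:.3f}")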



A.6.2 Properties of Maximum Likelihood Estimators

The MLE of a parameter vector θ = (θ₁, . . ., θ_q) solves

∂L(θ | x₁, . . ., xₙ)/∂θᵢ = 0    (i = 1, . . ., q),

provided the matrix of second derivatives is negative definite. But, being a product of density functions which are typically fairly complex, it is not straightforward to calculate the derivatives of L(θ | x₁, . . ., xₙ). It is much easier to differentiate the log-likelihood function ln L(θ | x₁, . . ., xₙ), that is, the sum of the log densities:

ln L(θ | x₁, . . ., xₙ) = Σᵢ ln f(xᵢ | θ).

Since the optima of L are the same as those of ln L, it is standard to find the MLE as the value of θ that maximizes the log-likelihood (see Figure A.11b).
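In practice the maximizing value is often located numerically rather than analytically. The sketch below, which again assumes Python with numpy and scipy (an illustration, not part of the text), finds the normal MLEs by minimizing the negative log-likelihood; the variance is parameterized through its logarithm so that the optimizer cannot propose a negative value.

    import numpy as np
    from scipy.optimize import minimize

    def neg_loglik(params, x):
        # params = (mu, log sigma^2); returns minus the log-likelihood
        mu, log_sig2 = params
        sig2 = np.exp(log_sig2)
        n = len(x)
        return 0.5 * n * np.log(2 * np.pi * sig2) + 0.5 * np.sum((x - mu) ** 2) / sig2

    rng = np.random.default_rng(1)
    x = rng.normal(loc=0.05, scale=0.2, size=1000)    # simulated sample

    start = np.array([0.0, np.log(x.var())])          # crude starting values
    res = minimize(neg_loglik, start, args=(x,), method="BFGS")
    mu_hat, sig2_hat = res.x[0], np.exp(res.x[1])
    print(mu_hat, sig2_hat)                           # close to the sample mean and variance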

MLEs do not necessarily have good small-sample properties - for example, the variance estimator in (A.6.5) below is biased. However, under standard regularity conditions on the data, MLEs are consistent, asymptotically normally distributed and asymptotically efficient. That is, they have the lowest variance of all consistent asymptotically normal estimators. In fact the asymptotic covariance matrix of MLEs achieves the Cramér-Rao lower bound for the variance of unbiased estimators. This bound is the inverse of the information matrix I(θ), where

I(θ) = -E[∂² ln L(θ)/∂θ ∂θ'],

that is, minus the expected values of the second derivatives of the log-likelihood function. In large samples MLEs have the minimum variance property, with covariance matrix I(θ)⁻¹. Statistical inference on MLEs follows from the convergence of their distribution to the multivariate normal N(θ, I(θ)⁻¹).

Another feature that makes MLEs among the best of the classical estimators is that the MLE of any continuous function g(θ) of a parameter θ is g(θ̂), where θ̂ is the MLE of θ. Thus it is a simple matter to find MLEs of standard transformations or products of parameters if the individual parameter MLEs are known.
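For instance, by this invariance property the MLE of the volatility σ is the square root of the variance MLE, and an annualized volatility estimate follows by a further transformation. A minimal sketch, assuming numpy, hypothetical daily returns and a 250-day year:

    import numpy as np

    x = np.array([0.012, -0.004, 0.007, -0.011, 0.003])  # hypothetical daily returns
    sig2_hat = x.var()                       # MLE of the variance (divides by n)
    sig_hat = np.sqrt(sig2_hat)              # MLE of sigma, by invariance
    ann_vol_hat = np.sqrt(250) * sig_hat     # MLE of annualized volatility, also by invariance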

A.6.3 MLEs for a Normal Density Function

Financial returns are often assumed to be generated by normal distributions. The probability density function for a normal random variable with mean μ and variance σ² is

f(x) = (1/√(2πσ²)) exp(-(x - μ)²/(2σ²)),    (A.6.3)



Taking logarithms of (A.6.3),

ln f(x) = -(1/2) ln(2π) - (1/2) ln σ² - (1/2)(x - μ)²/σ²,

so that

-2 ln f(x) = ln(2π) + ln σ² + (x - μ)²/σ²,

and the simplest form for the normal likelihood is

-2 ln L(μ, σ² | x₁, . . ., xₙ) = n ln(2π) + n ln σ² + Σᵢ (xᵢ - μ)²/σ².    (A.6.4)

For this reason the maximum likelihood estimates of normal density parameters are usually found by minimizing -2 ln L. Differentiating (A.6.4) gives

∂(-2 ln L)/∂μ = -(2/σ²) Σᵢ (xᵢ - μ),    ∂(-2 ln L)/∂σ² = n/σ² - (1/σ⁴) Σᵢ (xᵢ - μ)².

Therefore the two first-order conditions for maximization are

Σᵢ (xᵢ - μ̂) = 0,

n σ̂² = Σᵢ (xᵢ - μ̂)².

Solving for μ̂ and σ̂² gives the familiar estimators that are the MLEs of the mean and variance parameters of a normal density:

μ̂ = (1/n) Σᵢ xᵢ = x̄,    σ̂² = (1/n) Σᵢ (xᵢ - x̄)².    (A.6.5)
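The closed-form estimators in (A.6.5) can be verified directly against their definitions; the short sketch below (numpy assumed, simulated data) also emphasizes that the ML variance estimator divides by n rather than n - 1, which is why it is biased in small samples, as noted in §A.6.2.

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(loc=0.02, scale=0.15, size=2000)

    mu_hat = x.mean()                        # (1/n) * sum of x_i
    sig2_hat = np.mean((x - mu_hat) ** 2)    # (1/n) * sum of squared deviations

    # numpy's var with ddof=0 computes exactly the ML (biased) variance estimator
    assert np.isclose(sig2_hat, x.var(ddof=0))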

To find the standard errors of these estimators and make inference on models estimated by MLE we need to compute their covariance matrix. This is estimated by putting the MLE values for θ into the inverse information matrix I(θ)⁻¹ - see §A.6.2. For a normal density the matrix of second derivatives of the log-likelihood is

[∂² ln L(θ)/∂θ ∂θ'] =
    [ -n/σ²              -Σᵢ (xᵢ - μ)/σ⁴               ]
    [ -Σᵢ (xᵢ - μ)/σ⁴    n/(2σ⁴) - Σᵢ (xᵢ - μ)²/σ⁶    ],

so multiplying by -1 and taking expectations, we obtain

I(θ) =
    [ n/σ²    0         ]
    [ 0       n/(2σ⁴)   ].

Finally, the asymptotic covariance matrix of the MLEs is the inverse of this,

I(θ)⁻¹ =
    [ σ²/n    0        ]
    [ 0       2σ⁴/n    ].
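Because this inverse is diagonal, the asymptotic standard errors follow at once: se(μ̂) ≈ σ̂/√n and se(σ̂²) ≈ σ̂²√(2/n). A brief sketch of the resulting inference, again assuming numpy and simulated data:

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.normal(loc=0.0, scale=0.25, size=1000)
    n = len(x)

    mu_hat = x.mean()
    sig2_hat = x.var()                            # ML estimates

    se_mu = np.sqrt(sig2_hat / n)                 # from the element sigma^2/n of I(theta)^-1
    se_sig2 = np.sqrt(2 * sig2_hat ** 2 / n)      # from the element 2*sigma^4/n

    # Approximate 95% confidence intervals based on asymptotic normality
    ci_mu = (mu_hat - 1.96 * se_mu, mu_hat + 1.96 * se_mu)
    ci_sig2 = (sig2_hat - 1.96 * se_sig2, sig2_hat + 1.96 * se_sig2)
    print(ci_mu, ci_sig2)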


