




For λ = 12 the corresponding characteristic vector is obtained in the same way.

Note that these vectors are orthogonal to each other. That is, x1'x2 = x1'x3 = x2'x3 = 0.

Properties of Characteristic Roots and Vectors

Let λ1, λ2, . . . , λn be the n characteristic roots and x1, x2, . . . , xn be the corresponding characteristic vectors of the matrix A. We shall state some important properties of the characteristic roots and vectors.

1. The maximum value of x'Ax subject to x'x = 1 is the maximum characteristic root. Proof: x1'Ax1 = λ1x1'x1 = λ1. Hence the result follows.

2. If λ1 and λ2 are two distinct characteristic roots, then x1'x2 = 0, that is, the corresponding characteristic vectors are orthogonal.

Proof: Ax1 = λ1x1. Hence x2'Ax1 = λ1x2'x1. Also Ax2 = λ2x2. Hence x1'Ax2 = λ2x1'x2. Since A is symmetric, x2'Ax1 = x1'Ax2. By subtraction we get 0 = (λ1 − λ2)x1'x2. But since λ1 ≠ λ2, we have x1'x2 = 0.

3. |A| = λ1λ2 · · · λn and Tr(A) = λ1 + λ2 + · · · + λn.

Proof: Let X be the matrix whose columns are the characteristic vectors of A. That is, X = [x1, x2, . . . , xn]. If the λi are all distinct, the columns of X are orthogonal. Also, xi'xi = 1 for all i. Thus X is an orthogonal matrix. Hence X⁻¹ = X' (see the Appendix to Chapter 2) and X'X = I. Now A(x1, x2, . . . , xn) = (λ1x1, λ2x2, . . . , λnxn) or AX = XD, where D is the diagonal matrix with λ1, λ2, . . . , λn on the diagonal and all nondiagonal terms zero. Therefore, X'AX = X'XD = D. Hence |X'AX| = |D| or |X'| |A| |X| = λ1λ2 · · · λn. But since X is an orthogonal matrix, |X'| |X| = 1. Hence we get |A| = λ1λ2 · · · λn. Also, Tr(D) = λ1 + λ2 + · · · + λn and Tr(X'AX) = Tr(AXX') = Tr(A) since XX' = I. Hence Tr(A) = λ1 + λ2 + · · · + λn.

Note: Although we have proved the results above for distinct roots, the results are valid even for repeated roots. That is, given a symmetric matrix A, there exists an orthogonal matrix X (whose columns are the characteristic vectors of A) such that X'AX = D, where D is a diagonal matrix whose elements are the characteristic roots of A.

4. Rank(A) = Rank(D) = the number of nonzero characteristic roots of A. Proof: Since rank is unaltered by pre- or postmultiplication by a nonsingular matrix, Rank(A) = Rank(X'AX) = Rank(D).

5. The characteristic roots of A² are the squares of the characteristic roots of A, but the characteristic vectors are the same.



Proof: X'AX = D or A = XDX'. Hence A² = (XDX')(XDX') = XD²X' since X'X = I. Thus the characteristic roots of A² are given by the diagonal elements of D² (i.e., λi²) and the characteristic vectors are given by the columns of X.

6. If A is a positive definite matrix, all the characteristic roots are positive. Proof: Consider Axi = λixi. Then xi'Axi = λixi'xi > 0 if A is positive definite. Since xi'xi = 1, we have λi > 0. By a similar argument we can show that if A is positive semidefinite, λi ≥ 0; if A is negative definite, λi < 0 for all i; and if A is negative semidefinite, λi ≤ 0 for all i. Note that for the symmetric matrix we considered earlier, the roots were 0, 5, and 12. Thus it is positive semidefinite.
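The properties above are easy to check numerically. The following is a minimal Python sketch using numpy; the matrix A here is an arbitrary symmetric (in fact positive definite) matrix chosen for illustration, not the example matrix used in the text.

```python
# A minimal numerical check of properties 2-6 for a symmetric matrix.
import numpy as np

A = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])        # arbitrary symmetric, positive definite

lam, X = np.linalg.eigh(A)             # roots lam[i], vectors in the columns of X

# Orthogonality of the characteristic vectors: X'X = I
print(np.allclose(X.T @ X, np.eye(3)))               # True

# Property 3: |A| = product of the roots, Tr(A) = sum of the roots
print(np.isclose(np.linalg.det(A), lam.prod()))      # True
print(np.isclose(np.trace(A), lam.sum()))            # True

# Diagonalization: X'AX = D
print(np.allclose(X.T @ A @ X, np.diag(lam)))        # True

# Property 5: the roots of A^2 are the squares of the roots of A
lam2, _ = np.linalg.eigh(A @ A)
print(np.allclose(np.sort(lam2), np.sort(lam**2)))   # True

# Property 6: A is positive definite, so all roots are positive
print(np.all(lam > 0))                               # True
```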

The Case of a Nonsymmetric Matrix

The preceding results are for symmetric matrices. In econometrics we encounter nonsymmetric matrices as well (the case of VAR models in Chapter 14). For these matrices the characteristic vectors are not orthogonal, as we saw earlier. However, some of the other results are still valid. For example, the result that the sum of the characteristic roots of A equals Tr(A) is valid.

Consider the equations Ax1 = λ1x1, Ax2 = λ2x2, and so on, which we solve to get the characteristic vectors x1, x2, . . . . We can write these as

A(x1, x2, . . . , xn) = (λ1x1, λ2x2, . . . , λnxn)  or  AX = XD

where X is the matrix whose columns are the characteristic vectors and D is a diagonal matrix with λ1, λ2, . . . , λn as the diagonal elements. Premultiplying both sides by X⁻¹, we get

X⁻¹AX = D

(We have assumed that X is nonsingular. This can be proved, but we omit the proof here.) Thus given a square matrix A, we can find a nonsingular matrix X such that X⁻¹AX is a diagonal matrix with the characteristic roots of A as the diagonal elements. The columns of X are the corresponding characteristic vectors. Also, Tr(X⁻¹AX) = Tr(D) = Σλi. But Tr(X⁻¹AX) = Tr(AXX⁻¹) = Tr(A).

Hence Tr(A) = Σλi. This can be checked with the two examples considered earlier. In the case of the nonsymmetric matrix, Σλi = 9 and Tr(A) = 9. In the case of the symmetric matrix, Σλi = 17 and Tr(A) = 17.
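As a small numerical illustration, here is a Python/numpy sketch with an arbitrary 2 × 2 nonsymmetric matrix (not the example from the text): the characteristic vectors are not orthogonal, but X is nonsingular, X⁻¹AX is diagonal, and the roots still sum to Tr(A).

```python
# Diagonalizing a nonsymmetric matrix; A below is an arbitrary example.
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

lam, X = np.linalg.eig(A)          # columns of X are the characteristic vectors

# The vectors are not orthogonal for a nonsymmetric A ...
print(np.allclose(X.T @ X, np.eye(2)))                       # False
# ... but X is nonsingular and X^(-1) A X = D still holds
print(np.allclose(np.linalg.inv(X) @ A @ X, np.diag(lam)))   # True
# and Tr(A) equals the sum of the characteristic roots
print(np.isclose(np.trace(A), lam.sum()))                    # True
```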

Principal Components

Consider a set of variables x1, x2, . . . , xn with covariance matrix V. We want to find a linear function a'x that has maximum variance subject to a'a = 1. The problem is similar to the one we considered earlier. We have to solve |V − λI| = 0. The maximum characteristic root of V is the required maximum variance and the corresponding characteristic vector is the required a.

Let us order the characteristic roots λ1, λ2, . . . , λn in decreasing order and let the corresponding vectors be a1, a2, . . . , an. Consider the linear functions z1 = a1'x, z2 = a2'x, . . . , zn = an'x. Then var(z1) = a1'Va1 = λ1, var(z2) = λ2, . . . . The z's are called the principal components of the x's.



They have the following properties:

1. var(z1) + var(z2) + · · · + var(zn) = λ1 + λ2 + · · · + λn = Tr(V) = var(x1) + var(x2) + · · · + var(xn).

2. Since a1, a2, . . . , an are orthogonal vectors, z1, z2, . . . , zn are orthogonal, that is, uncorrelated. The drawbacks of principal component analysis have been discussed in the text.
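These two properties can be verified with a short Python/numpy sketch. The simulated data, the seed, and the variable names below are illustrative assumptions; the principal components are computed from the characteristic roots and vectors of the covariance matrix V.

```python
# Principal components via the characteristic roots/vectors of the covariance matrix V.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))              # 200 observations on 3 variables (simulated)
X[:, 2] = X[:, 0] + 0.1 * X[:, 2]          # make the variables correlated

V = np.cov(X, rowvar=False)                # covariance matrix of the x's
lam, A = np.linalg.eigh(V)                 # roots and orthogonal vectors
order = np.argsort(lam)[::-1]              # put the roots in decreasing order
lam, A = lam[order], A[:, order]

Z = (X - X.mean(axis=0)) @ A               # z_i = a_i'x: the principal components

# Property 1: var(z_i) = lambda_i and the variances sum to Tr(V)
print(np.allclose(Z.var(axis=0, ddof=1), lam))                  # True
print(np.isclose(lam.sum(), np.trace(V)))                       # True
# Property 2: the z's are uncorrelated
print(np.allclose(np.corrcoef(Z, rowvar=False), np.eye(3)))     # True
```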

Ridge Regression

If X'X is close to singularity, the problem can be solved by adding positive elements to the diagonal. The simple ridge estimator is

β̂_R = (X'X + λI)⁻¹X'y

There are several interpretations of this estimator (discussed in the text). One is that it is the least squares estimator obtained subject to the constraint β'β = c. Introducing the Lagrangian multiplier λ, we minimize

(y − Xβ)'(y − Xβ) + λ(β'β − c)

Differentiating with respect to β, we get

−2X'y + 2X'Xβ + 2λβ = 0  or  (X'X + λI)β = X'y

This gives the ridge estimator. λ is the Lagrangian multiplier, or the "shadow price" of the constraint.
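The following Python/numpy sketch computes the simple ridge estimator for simulated near-collinear data and compares it with least squares. The data, the true coefficients, and the value of λ are arbitrary choices for illustration.

```python
# The simple ridge estimator (X'X + lambda*I)^(-1) X'y on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))
X[:, 2] = X[:, 1] + 0.01 * X[:, 2]         # near-collinear regressors: X'X close to singular
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(size=n)

lam = 0.5                                  # the ridge (Lagrangian) parameter, chosen arbitrarily
XtX, Xty = X.T @ X, X.T @ y

beta_ols   = np.linalg.solve(XtX, Xty)                      # (X'X)^(-1) X'y
beta_ridge = np.linalg.solve(XtX + lam * np.eye(k), Xty)    # (X'X + lam*I)^(-1) X'y

print(beta_ols)     # can be erratic because of the near-collinearity
print(beta_ridge)   # coefficients shrunk toward zero and more stable
```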


