
All yield curves have a high degree of cointegration. Cointegration can also be thought of as a form of factor analysis similar to principal component analysis,3 so it is not surprising that cointegration analysis often works very well on the term structure data that are so successfully modelled by a principal component analysis.

This example shows that cointegration is a powerful tool for the analysis of yield curves. But there are many other applications of cointegration to other financial markets. It is often the case that cointegration arises from stationary spreads, bases or tracking errors. However, even though a spread, basis or tracking error may be stationary, it is not always clear that it is the most stationary linear combination. That is, (1, -1) may not be the best cointegration vector. And if the spread, basis or tracking error is not stationary, that does not preclude the possibility that some other linear combination of asset prices, or log prices, is stationary.
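The last point can be illustrated with a minimal simulation sketch. It is not taken from the text: the sample size, the coefficient `alpha`, the helper `lag1_autocorr` and the use of lag-1 autocorrelation as a rough gauge of mean reversion are all assumptions made for illustration. The series `x` is built so that the simple spread x - y inherits the stochastic trend in `y`, while the combination x - alpha*y does not.

```python
import random

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation: close to 1 for a trending (integrated)
    series, well below 1 for a rapidly mean-reverting one."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))
    den = sum(d * d for d in dev)
    return num / den

rng = random.Random(0)
n, alpha = 2000, 0.5  # hypothetical sample size and true coefficient

# y is a random walk; x = alpha * y + stationary noise,
# so by construction (1, -alpha) is a cointegration vector.
y, level = [], 0.0
for _ in range(n):
    level += rng.gauss(0.0, 1.0)
    y.append(level)
x = [alpha * yt + rng.gauss(0.0, 1.0) for yt in y]

spread = [xt - yt for xt, yt in zip(x, y)]         # vector (1, -1)
combo = [xt - alpha * yt for xt, yt in zip(x, y)]  # vector (1, -alpha)

rho_spread = lag1_autocorr(spread)
rho_combo = lag1_autocorr(combo)
print(f"lag-1 autocorrelation of the (1, -1) spread:     {rho_spread:.3f}")
print(f"lag-1 autocorrelation of the (1, -{alpha}) combination: {rho_combo:.3f}")
```

In this constructed case the spread behaves like an integrated series even though a stationary linear combination exists. A formal conclusion would of course rest on a unit root test rather than on the autocorrelation alone.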

12.2 Testing for Cointegration

The first step in cointegration analysis is to use standard statistical tests for cointegration to identify stationary linear combinations of the integrated series which best define the long-run equilibrium relationships between the variables in the system, if such relationships exist. Of course, if no such relationship exists then the variables are not cointegrated and there is little point in a multivariate analysis of the price data.

The classic papers on cointegration are those of Hendry (1986), Granger (1986) and Engle and Granger (1987). Engle and Granger proposed a test for cointegration that is based on an ordinary least squares regression. In the Engle-Granger method one simply performs a regression of one integrated variable on the other integrated variables and then tests the residual for stationarity using a unit root test (§11.1.5). Slightly different critical values apply, as described in the help sheet for the Engle-Granger workbook on the CD.

3 The connection between these two methodologies is that a principal component analysis of first differences of cointegrated variables will yield the common stochastic trend as the first principal component. But the outputs of the two analyses differ: principal components gives a few series which can be used to approximate a much larger set of series (such as the yield curve); cointegration gives all possible stationary linear combinations of a set of random walks. See Gourieroux et al. (1991).

Figure 12.2 Daily average spot and future prices for WTI crude oil and the basis (ln F - ln S). [Plot not reproduced: series shown are Basis, Spot and Future; horizontal axis runs from Jul-88 to Jul-98.]

Now, it is not standard to do OLS regression on non-stationary data. For example, the factor models described in §8.1 were based on returns data, because the standard theory of regression is based on stationary data. If the dependent variable is non-stationary it is quite possible that residuals will be non-stationary, but the properties of OLS estimators are only established for stationary residuals (§A.1.4). However, there is only one circumstance in which a regression between integrated variables will give stationary residuals, and that is when the variables are cointegrated. Put another way, it is only valid to regress log prices on log prices when these log prices are cointegrated. In this case the regression will define the long-run equilibrium relationship between the log prices.
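The danger of regressing integrated series that are not cointegrated can be seen in a small simulation. The sketch below is an illustration under stated assumptions rather than anything from the text: the helper names (`r_squared`, `random_walk`, `white_noise`, `average_r2`), the sample size and the number of trials are hypothetical. It compares the average R-squared from regressions between independent random walks with that from regressions between independent stationary series.

```python
import random

def r_squared(x, y):
    """R^2 from an OLS regression of x on a constant and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def random_walk(rng, n):
    """Integrated series: cumulative sum of independent N(0, 1) shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def white_noise(rng, n):
    """Stationary series: independent N(0, 1) draws."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def average_r2(make_series, rng, n_obs, n_trials):
    """Average R^2 over regressions between pairs of independent series."""
    total = 0.0
    for _ in range(n_trials):
        total += r_squared(make_series(rng, n_obs), make_series(rng, n_obs))
    return total / n_trials

rng = random.Random(1)
n_obs, n_trials = 500, 200  # hypothetical simulation settings

avg_r2_walks = average_r2(random_walk, rng, n_obs, n_trials)
avg_r2_noise = average_r2(white_noise, rng, n_obs, n_trials)
print(f"average R^2, independent random walks: {avg_r2_walks:.3f}")
print(f"average R^2, independent white noise:  {avg_r2_noise:.3f}")
```

Although the random walks are unrelated by construction, their regressions tend to show a sizeable R-squared, whereas the stationary pairs do not. The apparent fit in the first case says nothing about a long-run relationship, which is why the residuals must be tested for stationarity before the regression is interpreted.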

12.2.1 The Engle-Granger Methodology


The Engle-Granger test is a two-step process: first estimate an OLS regression on the I(1) data, then apply a stationarity test such as the ADF test to the residuals from this regression. The critical values for this test are given in MacKinnon (1991). In the case of only two I(1) variables x and y, the Engle-Granger regression is

$$x_t = c + \alpha y_t + \varepsilon_t.$$

Now x and y will be cointegrated if and only if $\varepsilon$ is stationary. Then the cointegration vector is $(1, -\alpha)$, and the long-run equilibrium relationship between x and y is $x = c + \alpha y$. Cointegration tests will not produce sensible results if too short a data period is used: they are designed to detect common long-run trends in the variables. The data period has to be sufficiently long for a stochastic trend to be detected.
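The two steps can be sketched in a few lines of plain Python on simulated data. This is a minimal illustration, not the workbook's implementation: the helper names `ols` and `df_statistic`, the simulated parameter values and the AR(1) form of the disequilibrium term are assumptions. The second step uses a Dickey-Fuller regression with no lagged differences, which is a simplification of the ADF test described in the text.

```python
import random

def ols(dependent, regressor):
    """OLS of `dependent` on a constant and `regressor`.
    Returns (intercept, slope, residuals)."""
    n = len(dependent)
    mr, md = sum(regressor) / n, sum(dependent) / n
    sxy = sum((r - mr) * (d - md) for r, d in zip(regressor, dependent))
    sxx = sum((r - mr) ** 2 for r in regressor)
    slope = sxy / sxx
    intercept = md - slope * mr
    resid = [d - intercept - slope * r for r, d in zip(regressor, dependent)]
    return intercept, slope, resid

def df_statistic(e):
    """Dickey-Fuller t-ratio from regressing e_t - e_{t-1} on e_{t-1}
    (no constant, no lagged differences)."""
    lagged = e[:-1]
    diff = [e[t] - e[t - 1] for t in range(1, len(e))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diff))
    se = (ssr / (len(diff) - 1) / sxx) ** 0.5
    return rho / se

rng = random.Random(42)
n_obs, c_true, alpha_true = 1000, 2.0, 0.8  # hypothetical values

# y is a random walk, hence I(1).
y, level = [], 0.0
for _ in range(n_obs):
    level += rng.gauss(0.0, 1.0)
    y.append(level)

# x = c + alpha * y + eps, with eps a stationary AR(1) disequilibrium term.
x, eps = [], 0.0
for yt in y:
    eps = 0.5 * eps + rng.gauss(0.0, 1.0)
    x.append(c_true + alpha_true * yt + eps)

# Step 1: Engle-Granger regression of x on y.
c_hat, alpha_hat, resid = ols(x, y)

# Step 2: unit root test on the residuals. The statistic must be compared
# with Engle-Granger (MacKinnon) critical values, not standard DF tables;
# for two variables the 5% value is roughly -3.34 (quoted from memory,
# so check it against MacKinnon's tables before relying on it).
df_stat = df_statistic(resid)

print(f"estimated cointegration vector: (1, {-alpha_hat:.4f})")
print(f"DF statistic on residuals: {df_stat:.2f}")
```

A strongly negative statistic indicates stationary residuals, so that the estimated vector can be read as a cointegration vector. On real data one would normally include lagged differences in the second step and use a sample long enough for the stochastic trend to be detectable.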

Daily WTI crude oil prices from 1 July 1988 to 26 February 1999 have been used to test for cointegration between spot and futures log prices using the Engle-Granger method. Looking at Figure 12.2, it is clear that these series are very closely tied together over the 11-year period (in fact, it is virtually impossible to distinguish between the two prices on the graph) and the basis has a very high degree of stationarity because it is very rapidly mean-reverting. But before testing for cointegration it is necessary to establish that the spot and futures log prices are both I(1). Following the methodology explained in §11.1.5, the ADF(1) statistics in Table 12.2 confirm this.

Table 12.2: ADF tests for a unit root in crude oil spot and future prices

Hypothesis      log Spot    log Futures
I(1) vs I(0)    -2.66       -2.66
I(2) vs I(1)    -24.45      -22.73

Moving now to an Engle-Granger cointegration test, the OLS regression of log futures prices on log spot prices gives

$$\ln F_t = \underset{(4.05)}{0.016404} + \underset{(730.87)}{0.9943}\,\ln S_t$$

and the ADF test for the hypothesis I(1) against I(0) on the residuals from this equation indicates a high degree of stationarity (ADF(1) = -30.97). It may be concluded that spot and futures log prices in the crude oil market are very highly cointegrated, with cointegration vector (1, -0.9943). Since this is approximately equal to (1, -1), the results support the standard expectations model that the futures price is the average of all discounted expected spot prices, which implies that the basis ln F - ln S is a stationary process.

In the more general case, an OLS regression between n different I(1) cointegrated variables will estimate a linear combination of the I(1) series that is stationary. The cointegration vector is $(1, -\beta_1, \ldots, -\beta_{n-1})$, where $\beta_1, \ldots, \beta_{n-1}$ are the coefficients on the n - 1 I(1) variables that are used as explanatory variables, the other I(1) variable being used as the dependent variable in the Engle-Granger regression. The disequilibrium term z is given by the residuals from this regression.

When n = 2 it does not matter which variable is taken as the dependent variable. There is only one cointegration vector, which is the same when estimated by a regression of x on y as when estimated by a regression of y on x. But when there are more than two I(1) series the Engle-Granger method can suffer from a serious bias. That is, different estimates of a cointegration vector are obtained depending on the choice of dependent variable, and only one estimate is possible even though there can be up to n - 1 cointegration vectors. Using the stock index data provided with the Engle-Granger workbook, readers can investigate this bias, and compare results with those based on the more powerful Johansen method (§12.2.2).

Thus the Engle-Granger method cannot be used to identify all the independent cointegration vectors in a system with more than two variables. Only one


