
Whereas conventional time series prediction uses the immediately preceding points taken at successive and equally spaced intervals in time, nearest neighbour prediction methods use a selection of points from the history of the time series. These points are chosen because they are similar to the current point, in a sense that is defined by the embedding. Unless one chooses to dissect the algorithm it will not be known exactly which points are being chosen to predict any given point. They could be taken from any time during the history and they do not necessarily run consecutively through time.
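As a concrete illustration of what the embedding does, the following sketch (in Python, with illustrative names and parameters; it is not taken from the text) builds the delay vectors that represent each point of a univariate series by its current value and a few lagged values. Similarity between points is then measured in this embedded space.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delay vectors (x_t, x_{t-tau}, ..., x_{t-(m-1)tau}) for each usable t."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau            # number of points with a full history
    if n <= 0:
        raise ValueError("series too short for this choice of m and tau")
    # column j holds the value j*tau periods before the current point
    return np.column_stack([x[(m - 1 - j) * tau : len(x) - j * tau]
                            for j in range(m)])
```

For example, `delay_embed(x, m=3, tau=5)` represents each point by its own value together with the values 5 and 10 steps earlier.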

Two methods of choosing nearest neighbours are possible. In the first method an m-dimensional ball of fixed radius is drawn around the current point, as in Figure 13.6b, and all points from the time series history that lie within the ball are taken. The larger the radius of the ball, the more nearest neighbours are captured, and different points will have different numbers of nearest neighbours. If the markets are behaving normally there could be many points that lie within the nearest neighbour region, but during exceptional periods that may not have occurred in the past there will be very few nearest neighbours within a fixed-radius ball. So the forecast will be made with many or few points, depending on the conditions associated with the current point.
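A minimal sketch of this fixed-radius rule, assuming the delay vectors have already been collected into a "library" array (the names and the radius are illustrative):

```python
import numpy as np

def neighbours_within_radius(library, current, r):
    """Indices of library points whose Euclidean distance from `current` is below r."""
    dists = np.linalg.norm(library - current, axis=1)   # distance to every historical point
    return np.flatnonzero(dists < r)                    # may be many or very few indices
```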


The second method of choosing nearest neighbours is to take a fixed number n of nearest neighbours. So when the markets are fairly stable the Euclidean distances between the n nearest neighbours and the current point will be relatively small, but when markets are jumpy these Euclidean distances will be relatively large. Both methods require computation of the Euclidean distance of each point in the library from the current point, and if the library is very large this can be quite time-consuming.
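The fixed-number alternative can be sketched in the same way. A brute-force distance scan is shown below, which is exactly the computation that becomes time-consuming for a very large library; in practice a spatial index such as a k-d tree might replace it.

```python
import numpy as np

def n_nearest_neighbours(library, current, n):
    """Indices of the n library points closest (in Euclidean distance) to `current`."""
    dists = np.linalg.norm(library - current, axis=1)
    return np.argsort(dists)[:n]        # always n neighbours, however distant
```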

Once the points have been selected the method of forecasting could be virtually anything. For example, Casdagli (1992) uses forecasts based on fairly standard linear regression, but Nychka et al. (1992) use non-parametric regression, Sugihara and May (1990) use simplex projection and Alexander and Giblin (1994) use simplex projection with barycentric coordinates.
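As one illustration of the forecasting step, and only in the spirit of the regression-based approach of Casdagli (1992) rather than a reproduction of any of the cited methods, the following sketch fits a local linear map from the selected neighbours to the values that followed them and evaluates it at the current point:

```python
import numpy as np

def local_linear_forecast(library, targets, neighbour_idx, current):
    """Least-squares fit over the neighbours, evaluated at the current point.

    `targets[i]` is the value observed h steps after library point i."""
    X = library[neighbour_idx]                     # neighbour delay vectors
    y = targets[neighbour_idx]                     # what followed each neighbour
    A = np.column_stack([np.ones(len(X)), X])      # intercept plus delay coordinates
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.concatenate(([1.0], current)) @ coef)
```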

Recently, nearest neighbour prediction algorithms have received some attention from the finance community. Alexandre et al. (1998) use time-delay embeddings to predict high-frequency returns on the DEM-FRF and USD-FRF exchange rates. Finkenstadt and Kuhbier (1995) claim some success in predicting commodity prices using this method. See also Mizrach (1992) and Jaditz and Sayers (1998).

13.3.3 Multivariate Embedding Methods

Nearest neighbour algorithms that are based on univariate series with time-delay embedding are rather limited, and the results cited above indicate that limited success has been achieved with such methods. However, it is possible to extend the concept of univariate time-delay embedding to a multivariate setting, and thereby utilize much more information about the market microstructure and the possible interdependencies with other markets. Multivariate embeddings in m-dimensional space employ lagged values of a variety of possible predictor variables. Examples of possible predictor variables include trading range, volume and bid-ask spread, and each predictor variable could be measured over a variety of possible time-frames. Furthermore, if a lead-lag relationship is thought to exist with another financial instrument, then similar data on that market could also be used in the embedding.
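A multivariate embedding of this kind might be sketched as follows; the choice of predictor variables and lags here is purely illustrative and is not the book's specification:

```python
import numpy as np

def multivariate_embed(variables, lags):
    """Stack lagged predictor variables into one state vector per time point.

    variables: dict of name -> 1-d array, all the same length
    lags:      dict of name -> list of lags (in periods) to include"""
    max_lag = max(l for ls in lags.values() for l in ls)
    T = len(next(iter(variables.values())))
    cols = []
    for name, ls in lags.items():
        series = np.asarray(variables[name], dtype=float)
        for l in ls:
            cols.append(series[max_lag - l : T - l])   # value l periods ago
    return np.column_stack(cols)

# e.g. multivariate_embed({'price': p, 'range': r, 'volume': v},
#                         {'price': [0, 1, 2], 'range': [0, 1], 'volume': [0]})
```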

The specification of the embedding is a subjective but crucial part of the model. Although it may seem appropriate to use microstructure data in the embedding, the perceived benefits from such an approach have to be weighed against more than a few difficulties: a very large number of different embeddings, having different dimensions, lags and predictor variables, will need to be tried and tested. The model training may be analysed into three stages (see the sketch after this list):

» specification of the embedding;
» stating the number of nearest neighbours, or the radius of the ball;
» determining the best prediction method.
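The search over these three stages can be sketched schematically as nested loops with an out-of-sample score for every combination. All function and variable names here are hypothetical placeholders, and the neighbour-selection and prediction rules are supplied as callables in the style of the earlier sketches:

```python
import numpy as np

def evaluate(select, predict, train, test):
    """Root mean square out-of-sample error for one model combination."""
    errors = []
    for current, actual in test:                   # hold-out (state, outcome) pairs
        idx = select(train['library'], current)    # choose nearest neighbours
        forecast = predict(train['library'], train['targets'], idx, current)
        errors.append(forecast - actual)
    return float(np.sqrt(np.mean(np.square(errors))))

def search(embedding_specs, selectors, predictors, build_data):
    """Score every (embedding, neighbour rule, prediction method) combination."""
    best = None
    for spec in embedding_specs:
        train, test = build_data(spec)             # build the library and hold-out set
        for s_name, select in selectors.items():
            for p_name, predict in predictors.items():
                rmse = evaluate(select, predict, train, test)
                if best is None or rmse < best[0]:
                    best = (rmse, spec, s_name, p_name)
    return best
```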

For every choice of embedding, of nearest neighbours, and of prediction method, an out-of-sample data set must be produced and tested. Given the large volume of data normally employed for high-frequency analysis, the training and testing of the model is an enormous task. My interest in this area arose in 1995 after receiving details of the first international non-linear financial forecasting competition, sponsored by the NeuroVeSt Journal (Caldwell, 1997). I had read with much interest the accounts of the Santa Fe time series forecasting competition, which was aimed at forecasting a great variety of different time series - including even an encoding of the end of a Bach fugue (Casdagli and Eubank, 1992; Weigend and Gershenfeld, 1993). This new competition was focused exclusively on high-frequency financial data, and aimed at genetic algorithms and neural networks.3

At the time I was working with Peter Williams4 on the applications of neural networks to financial market data, and so it seemed an interesting idea to enter the competition. However, when we received the data they turned out to behave in a very unusual fashion; there were a huge number of highly exceptional returns.5 Successful forecasting of these data with a neural network seemed unlikely, and instead I worked with Ian Giblin to develop a tailor-made

3In fact the first prize included a lot of free neural network software.

4Dr Williams is Reader in the School of Cognitive Studies at the University of Sussex.

5The data provided to entrants for training the model included the time of day, the open, high, low and closing prices and the tick volume for each minute over approximately 4 years (from January 1989 to March 1993). The subsequent 2 years of data were retained for the competition organizers to test the entries. Nothing was revealed to entrants about the source of the data until after the competition, when we were informed that the data were on cotton futures prices.



forecasting model based on our research into chaotic dynamics.6 There are many possible approaches to forecasting high-frequency data from financial markets; looking at the data (as far as this is possible) is a first and very important step in deciding which approach to take.

Alexander and Giblin (1997) developed a multivariate nearest neighbour prediction algorithm for the first international non-linear financial forecasting competition. Of the 1.5 million data points provided for the competition preparation, 1 million were used for training and 0.5 million for testing. The competition rules specified that two minute-by-minute prediction series were to be generated by the program (2 hours ahead and 1 day ahead). The competition would be run on an unseen set of data on the same variable as the one provided for practice and approximately 0.5 million predictions for each time horizon had to be generated in under 4 hours CPU time.

The practice data were very unusual. It seemed to us that although the competition was aimed primarily at neural networks it would be extremely difficult to train a neural network on these data. Instead we looked for possible embedding variables that could be extracted from the data with a view to constructing a nearest neighbour prediction algorithm. The data looked to us like a commodity future, probably a precious metal. Much of the trading in these markets is based on technical analysis so we considered some standard technical indicators as possible embedding variables (Brock et al., 1992; Curcio and Goodhart, 1992; LeBaron, 1992; Murphy, 1986; Harris et al., 1994).7 For each point a fixed number of nearest neighbours were selected in the embedding space and the forecasting models for predictions 2 hours ahead and 1 day ahead were applied. Decisions on the embedding, the optimal number of nearest neighbours (between 100 and 1000) and the specification of the forecasting models required us to develop some sophisticated tools for choosing model parameters on the basis of forecast analysis. More details are given in Alexander and Giblin (1997).
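To illustrate the kind of technical indicators that could serve as embedding variables (see footnote 7), the following sketch computes moving averages of two different lengths and the ratio of trading volume to trading range; the window lengths are guesses for illustration only and are not the parameters of the competition entry.

```python
import numpy as np

def moving_average(x, window):
    """Trailing simple moving average (NaN during the warm-up period)."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (c[window:] - c[:-window]) / window
    return out

def indicator_embedding(close, high, low, volume):
    """Stack a few technical-indicator variables into an embedding."""
    close, high, low, volume = (np.asarray(a, dtype=float)
                                for a in (close, high, low, volume))
    ma_short = moving_average(close, 10)                       # hypothetical short window
    ma_long = moving_average(close, 60)                        # hypothetical long window
    vol_over_range = volume / np.maximum(high - low, 1e-12)    # volume / trading range
    return np.column_stack([close, ma_short, ma_long, vol_over_range])
```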

The price dynamics in financial markets may or may not be governed by chaotic systems. It seems impossible to answer this question at present. Perhaps an answer will be found in the future and, in searching for an answer, many useful forecasting tools could be developed.


6Dr Giblin (giblina pennoyer.net) was my ESRC-funded post-doctoral research student at the time. Our nearest neighbour algorithm won the first prize in the competition - in fact it was the only entry that beat a random walk using the metric chosen for the competition. Almost all the other 35 entries that qualified used neural networks or genetic algorithms. The results were assessed using 15 different statistical metrics (§A.5.3) and the performance measure selected for the competition was the root mean square error.

7The embedding variables included moving averages of different lengths and the ratio of trading volume to trading range.


