
hand. For example, in electrical circuits we use the differential equations that describe how components interact. By piecing together various components and solving a system of differential equations, we can determine how a circuit will operate. This works, in theory. In reality, we have made many assumptions as to how components operate, the purity of materials, and often linearity assumptions just to permit us to solve the equations. This means that the prediction of performance needs to be verified by actually building and testing the circuit. This error-prone process has been perfected through the years and adjustments made so that engineers can do a fairly good job designing circuits. When applied to market analysis, the processes involved are many times more complex than in circuit design.

Can we ever hope to develop a model of the market? Through the years, we have seen experts emerge who make some good predictions and make a lot of money. They develop a reputation, and often a newsletter providing advice. For a while, they are quite successful, but then for some reason, their predictions stop being as good, and their performance becomes lackluster. What has happened is that they have clued in to some highly influential parameter that was, for a time, a good predictor of the market. But, because the market is highly dynamic, that particular indicator is no longer a reliable predictor and they are no longer very successful at predictions. However, their reputation continues.

For a long time, investors have used linear statistics to develop strategies to determine when to invest and when to divest. Moving averages and variance lines are just some examples of this method. The problem with linear statistics is that the underlying market that is providing the data is not linear. Therefore, any method based on linearity assumptions is at best inaccurate and at worst, just plain wrong. When neural networks were introduced to the investment community they were thought to be a panacea. They could handle nonlinearity. As Dr. Steve Gustafson, senior research physicist at the University of Dayton Research Institute, said, "Neural networks are statistics for the uninitiated." This means that someone not trained in the mathematics of statistics could use neural networks to do nonlinear statistical analysis of the markets and make money. Wrong!

What, exactly, are we trying to model? If stock prices go up and down based on investors' perceptions, perhaps we should try to model an investor. This would be an impossible task. Since we cannot gather sufficient data on an individual investor, we cannot develop a model. It is far easier to gather data on what all investors are doing. This is usually represented as an index such as the Dow Jones Industrial Average (DJIA). Perhaps we should model the price fluctuations of the shares in various companies. If the goal is to predict a single stock, we have a more difficult problem. If the goal is to predict a broad market index, we are more likely to be successful.

How can we develop a model of many investors and their perceptions in order to make money? Investors are influenced by factors that we can measure. Their perceptions are influenced and reflected by what is happening in the various markets (other than the one we are trying to model) and in reported data. For example, if the price of gold is rising, this may indicate investors are becoming worried about inflation. If the M1, a broad-based U.S. money supply index, is increasing rapidly, this too may be a harbinger of inflation. Some investors believe that if the Dow Jones Transportation Index is increasing, it means that industry is delivering more goods and will be more profitable; therefore, it is a good time to invest. These are just some of the many possible indicators that could be used. Some of them may be precursors to market moves in one direction or the other. In any reading of the Wall Street Journal financial section, you can find hundreds upon hundreds of such indicators and can then use them to generate a model.

Our goal is to find a way to predict the future so that we can use this prediction to make money. We have been discussing this approach with the idea that we will create a model of some process. Figure 17.1 illustrates what it is that we are trying to accomplish. In examining this figure, we are led to a logical way to discuss this area of modeling. The breakdown is in three separate areas: input, output, and the model itself.

In regression, or curve fitting, the point that is emphasized is that it is important to have a cause-and-effect relationship between the independent and dependent variables. Even if the results of the regression analysis show that there is a perfect fit to the data, there still must be a cause-and-effect relationship. Suppose I have data that shows that two times the age of my father equals the price of ground beef for the past 75 years. I could use this as a model to predict the price of ground beef in the future. It is obvious, however, that there is no cause-and-effect relationship and therefore the prediction should be suspect. What we are advocating is creating a model using data that may be an influencing factor on the output of the model. The data involved may or may not be obviously related to the model. We are, therefore, loosely interpreting the cause-and-effect relationship in the methodology proposed.

Model Input: What Should We Use to Measure the Market?

When developing a model of a market, we want to use as many indicators as possible for the input to avoid missing the important ones. It is important to note that an indicator may have made its influencing change at some time in the past. For example, a move in the price of gold in January may not be reflected in a movement of the DJIA until March. Or is it better to use gold's move in December or February? Perhaps we need three consecutive values. Or is it two or four?
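The lag question can be made concrete with a small sketch. The helper below, `lagged_rows`, is a hypothetical name introduced here for illustration and is not from the text. It pairs each value of the target series with the indicator values observed a chosen number of periods earlier. The monthly gold and DJIA figures are made up.

```python
def lagged_rows(indicator, target, lags):
    """Pair target[t] with indicator[t - lag] for each lag in `lags`.

    Rows start at t = max(lags), the first period for which every
    lagged indicator value exists.
    """
    if len(indicator) != len(target):
        raise ValueError("series must have the same length")
    if min(lags) < 0:
        raise ValueError("lags must be non-negative")
    start = max(lags)
    rows = []
    for t in range(start, len(target)):
        inputs = [indicator[t - lag] for lag in lags]
        rows.append((inputs, target[t]))
    return rows


# Made-up monthly values, for illustration only.
gold = [400, 405, 410, 420, 415, 425]
djia = [6000, 6100, 6050, 6200, 6300, 6250]

# Use gold from two and three months back as inputs for this month's DJIA.
for inputs, output in lagged_rows(gold, djia, lags=[2, 3]):
    print(inputs, "->", output)
```

Which lags to include, and how many, remains the open question the paragraph raises; the sketch only shows how a given choice turns into model inputs.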

Another point to consider when gathering and choosing input for a model is the relative range of the data. A neural network type model that uses the price of gold ($400-$450) and the inflation rate (3%-5%) as inputs influencing the DJIA (6,000-8,000) is going to have trouble. It is difficult for an input variable with a small range of 0.03 to 0.05 to have as much influence on the network's internal cell calculations as an input variable with a much larger range of 400 to 450. Consequently, a neural network would probably ignore the inflation rate. Therefore, the two most important issues, when using neural networks to model a market, are the normalization and transformation of the input.
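One common way to put inputs on a comparable footing is min-max scaling, sketched below. This is only one possible normalization, chosen here as an assumption; the text does not say which method it has in mind. The function name `min_max_scale` and the sample values are illustrative.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values so the smallest maps to `low` and the largest to `high`."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot scale a constant series")
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


# Illustrative values spanning the ranges mentioned in the text.
gold = [400.0, 425.0, 450.0]
inflation = [0.03, 0.04, 0.05]

# After scaling, both inputs cover the same 0-to-1 range.
print(min_max_scale(gold))
print(min_max_scale(inflation))
```

After scaling, a one-point move in the inflation rate and a $25 move in gold both shift their inputs by about half the range, so neither is drowned out by raw magnitude alone. In practice the minimum and maximum would typically be taken from the training data and reused for any later values.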



After considering the input transformations and normalization, we need to consider exactly how we will use these inputs. We can use each one as an individual input, or we can develop models to predict these values at some point in the future and use the output of these models as input to subsequent models. Often investors must make this decision through intuition or experiments.

Technical market analysts have developed many tools to aid in the task of predicting the market. Many of these use the concept of moving windows (creating a small data series) and then compute salient statistics of that window to use as an input or decision value. As we consider this methodology, we must answer several questions. How wide is the window? What statistic(s) should be computed? What is the proper lag for the value? When does the value influence the DJIA, one week, one month, or more in the future? If we try to use all possible choices, the number of input values becomes unmanageable. One way to reduce the number of inputs is to use the market modeler's intuition and experience. Another approach is to use automated methods to eliminate ineffectual inputs.
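A minimal sketch of the moving-window idea follows. The function `window_features`, its parameters `width` and `lag`, and the choice of mean and standard deviation as the window statistics are all assumptions made for illustration; the price series is made up.

```python
from statistics import mean, pstdev


def window_features(series, width, lag):
    """For each time t, summarize the `width` values that end `lag` steps before t.

    Returns a list of (t, window_mean, window_stdev) tuples.
    """
    if width < 1 or lag < 0:
        raise ValueError("width must be >= 1 and lag must be >= 0")
    features = []
    for t in range(width - 1 + lag, len(series)):
        end = t - lag + 1  # slice end is exclusive
        window = series[end - width:end]
        features.append((t, mean(window), pstdev(window)))
    return features


# Made-up series, for illustration only.
prices = [10, 12, 11, 13, 15, 14]

# Three-value windows, lagged by one step relative to the time being predicted.
for t, avg, sd in window_features(prices, width=3, lag=1):
    print(t, avg, round(sd, 3))
```

Each combination of width, statistic, and lag produces another candidate input, which is how the number of inputs grows so quickly when every choice is tried.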

There are other considerations in developing a market model. Assume that we want to model the DJIA with the goal of predicting what the closing DJIA will be two weeks from now. Takens' theorem¹ says that we can use past values as input to create a multidimensional input space for predicting future values of the series. While this should work (in theory), market pundits will tell you that the market has no history. While past values may give us a trend, there is no good reason to suppose that the previous values of the DJIA are necessarily influencing the future values. However, since the DJIA only moves up or down within a small range daily, an investor could argue that past values do have an influence. In any case, we should consider other input.

The simplest approach, for the modeler, is to model the market of interest directly. This means using past values, as input, to predict future values. Although this is simple for the modeler, it relies on rather complex mathematical theory, Takens' theorem. Therefore, we can use past history information to develop a multidimensional model of the Dow Jones Industrial Average (DJIA). In theory, this should work. However, how many time-lagged inputs are used? Which ones? Are there exogenous events that will affect the developed model? These are all valid questions that the modeler must ask and answer.
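The construction usually associated with Takens' theorem is a time-delay embedding: each input vector is a set of past values of the same series. The sketch below shows one way to build such vectors. The function name `delay_embed`, the `dimension` and `delay` parameters, and the one-step-ahead target are assumptions for illustration; the series is made up, and the text leaves the number and choice of lags open.

```python
def delay_embed(series, dimension, delay=1):
    """Build (inputs, target) pairs from a single series.

    inputs = [x[t - (dimension - 1) * delay], ..., x[t - delay], x[t]]
    target = x[t + 1]  (one step ahead, an assumption of this sketch)
    """
    if dimension < 1 or delay < 1:
        raise ValueError("dimension and delay must be >= 1")
    first = (dimension - 1) * delay  # earliest t with a full input vector
    pairs = []
    for t in range(first, len(series) - 1):
        inputs = [series[t - k * delay] for k in range(dimension - 1, -1, -1)]
        pairs.append((inputs, series[t + 1]))
    return pairs


# Made-up closing values, for illustration only.
closes = [1, 2, 3, 4, 5, 6]

for inputs, target in delay_embed(closes, dimension=3):
    print(inputs, "->", target)
```

Changing `dimension` answers "how many time-lagged inputs are used?" and changing `delay` answers "which ones?"; neither choice is settled by the construction itself.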

Input to the model and the transforms of this data are the most important area for accurate modeling. In summary, the modeler must choose the inputs for the model and the transformations on these values. Then these values must be normalized. If the goal is direct market modeling, using past values to predict future values, the number of inputs to be used must be determined. Often, the only way to do this is trial and error. However, if a model using many different inputs is being developed, there are other considerations. The input variables chosen should be influencing factors. If windows of data are to be used, the size of the window, the window statistic, how many windows, and how far to lag the window must all be decided. This can be done by hand or, as we will present later, with automated methods that will help.


