
The system is much too simple to use as a trading system. Although it generates significant profits in backtesting, it also has large drawdowns. Before we begin to develop a usable trading system, it is worthwhile to consider some important concepts in system design.

Considering Complexity in the Design and Testing of Systems

System design and development utilizes past data to develop rules that will assist in the prediction of the future. In considering a data time series, such as market prices, we seek to develop a formula that adequately quantifies a series of discrete historical points, and use that formula to predict where future points will be.

An indicator, and a system using the indicator, can be designed to any level of complexity by adding more and more variables. Generally, the more complex the system, the better it will fit historical data. But how complex should our systems be? If an expert system has too few variables, a line through the points does not sufficiently represent the data: the distance from the data points to the line is too great for its predictions to be useful to the trader.

As seen previously, the formula for a straight line is:

Y=a+bX

If we desire to draw a curve through the data, additional terms must be added to the equation. If we add a third coefficient, c, multiplying the square of X, we have a parabola, which is represented by the formula:

Y=a+bX+cX^2

If we add more terms, the curve can move up and down, closely following the data points over time. The more terms that are added, the better the curve will fit the sample data.

For a polynomial, we measure the flexibility of a formula by noting its degrees of freedom, or DOF. This is defined as the number of coefficients in the formula. A parabola, for example, has three coefficients, a, b, and c, so it has three degrees of freedom.

If the DOF is high, the model tries to track each specific data point. Thus, a curve through the data may accurately describe the data that was used in finding the values of the coefficients. However, if the curve is carried forward (or backward) to a different data set, it will likely not adequately represent the new data.
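The effect of a high DOF can be illustrated with a small sketch. The data, the function names, and the choice of an interpolating polynomial (one coefficient per data point) as the "high DOF" model are all assumptions made for illustration; they are not taken from the text. The sketch fits a two-coefficient straight line and an eight-coefficient polynomial to the same eight points, then compares their errors on the sample and on one unseen point.

```python
def fit_line(xs, ys):
    """Least-squares straight line Y = a + bX (two coefficients)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def fit_interpolant(xs, ys):
    """Polynomial with as many coefficients as data points (Lagrange form).

    By construction it passes through every sample point.
    """
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mean_abs_error(model, xs, ys):
    """Average absolute distance between the model and the data points."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical sample: eight points scattered around Y = 2 + 0.5X.
sample_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
noise = [0.3, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.2]
sample_y = [2 + 0.5 * x + e for x, e in zip(sample_x, noise)]

# Hypothetical unseen point, one step past the end of the sample.
new_x, new_y = [8.0], [6.1]

line = fit_line(sample_x, sample_y)            # DOF = 2
wiggly = fit_interpolant(sample_x, sample_y)   # DOF = 8

for name, model in (("line", line), ("8-term polynomial", wiggly)):
    print(name,
          "| sample error:", round(mean_abs_error(model, sample_x, sample_y), 3),
          "| new-data error:", round(mean_abs_error(model, new_x, new_y), 3))
```

On this made-up sample, the eight-term polynomial reproduces every sample point, yet its error on the unseen point is many times larger than that of the straight line.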

The development of a rule-based technical trading system usually begins with the use of an indicator that presents data in a different form. This may be a graphical presentation, such as a moving average or an oscillator, or the identification of an occurrence, such as the highest high in 10 days. The indicator can then be used to design the rule-based system.



Considering Data Adequacy in the Design of a System

It stands to reason that if a system is optimized and backtested using inadequate data, the results cannot be relied on. A perfectly good system may be discarded because of seemingly poor results, or a mediocre system may be adopted based on good results.

Data adequacy involves issues of quality and quantity. With respect to quality, only accurate data should be used. Accurate data is exchange data, which has been verified and corrected for bad ticks. This means that traders should not use data that they have collected, but rather data purchased from a reliable vendor.

A standardized methodology should be used for data management. This should include, for example, how holidays or shortened sessions are handled, or whether evening trading sessions should be included in the data.

In addition, to allow for comparison of systems, standard methods should be adopted for creating continuous contracts from the contract data. Consideration should be given to the method for adjusting prices and the dates of rollover.

Data should be purchased from a single data vendor, since the software of each vendor for creating continuous contracts handles the task differently. Since prices in continuous contracts may not have actually occurred in the real world, the results of testing a system with data from different vendors may not be comparable.

Finally, the trader should consider how to handle outliers, which are data points that are more than a predetermined distance, such as three standard deviations, from an expected value.

The rule-based system consists of a series of IF . . . THEN, or IF . . . THEN . . . ELSE statements. For example, a simple system is:

Entry:

IF the close today is above today's 10-day moving average of the close, THEN buy tomorrow at the open.

Exit:

IF there is an open long position, THEN exit at the close.

This appears to be a very simple system, consisting of only one indicator, an entry rule, and an exit rule. However, as simple and basic as it appears to be, it must be recognized that it already has a high level of complexity. The italic words are terms that can be changed. There are already 10 variable terms for the buy rule alone. If we add a second indicator, if we allow for multiple contracts, if we hold the trade for more than a day, or if we add stop loss rules, the system then becomes vastly more complex.

A trading system increases in complexity as each new indicator, stop, or rule is added. In designing our system, we should strive for the simplest system that yields good results. In that way, our system has a better chance of having ongoing predictive value.
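The simple entry and exit rules above can be written out as a short sketch, which also makes some of the changeable terms visible as parameters. The function names and prices are hypothetical, and the sketch assumes that "exit at the close" means the close of the day on which the position was opened.

```python
def moving_average(values, length, end):
    """Average of the `length` values ending at index `end` (inclusive)."""
    window = values[end - length + 1 : end + 1]
    return sum(window) / length


def run_simple_system(opens, closes, ma_length=10):
    """Return the profit of each one-unit trade produced by the two rules."""
    trades = []
    # Start once a full moving-average window exists; stop one bar early
    # because the entry happens on the following bar.
    for today in range(ma_length - 1, len(closes) - 1):
        # Entry rule: IF the close today is above today's moving average ...
        if closes[today] > moving_average(closes, ma_length, today):
            tomorrow = today + 1
            # ... THEN buy tomorrow at the open.
            # Exit rule (assumed): sell at the close of that same day.
            trades.append(closes[tomorrow] - opens[tomorrow])
    return trades


# Hypothetical prices; a 3-day average is used so that this short
# series can produce trades.
opens = [10.0, 10.5, 11.5, 12.0, 11.5, 13.5]
closes = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
print(run_simple_system(opens, closes, ma_length=3))  # [-1.0, 0.5]
```

Even in this form, the length of the average, the price being averaged, the comparison, the entry price, and the exit price are each a choice that could be varied.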



For a system that utilizes multiple data streams, consideration must be given to the mutual correlation of data. If one data stream is very much like the other, we may be adding system complexity, but providing no new information.

If one data stream leads the other, that is useful additional information. The additional data stream can be used if a method is found to decorrelate the data. Basically, we are trying to mathematically eliminate the similarity, and accentuate the difference. Traders can do this with statistical analysis software. In a situation where one data stream leads the other, we designate the leading indicator as the independent variable, and the market we wish to trade as the dependent variable. We then identify the amount by which one leads the other.
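One possible way to identify the amount by which one stream leads the other is to shift the leading stream by various lags and keep the lag with the highest correlation. This is only a sketch of that idea; the text does not name a method, and the function names and both data series below are hypothetical.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_lead(leader, follower, max_lag):
    """Return (lag, correlation) for the lag at which the two streams match best."""
    results = []
    for lag in range(max_lag + 1):
        # Pair the follower at time t with the leader at time t - lag.
        shifted_leader = leader[: len(leader) - lag]
        aligned_follower = follower[lag:]
        results.append((lag, correlation(shifted_leader, aligned_follower)))
    return max(results, key=lambda item: item[1])


# Hypothetical independent variable (the leading stream).
leader = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 3.0, 7.0, 5.0, 8.0, 6.0, 9.0]
# Hypothetical dependent variable: the same values arriving two bars later.
follower = [2.0, 2.0] + leader[:-2]

lag, corr = best_lead(leader, follower, max_lag=4)
print("leader leads by", lag, "bars")
```

In practice one might prefer to run such a comparison on price changes rather than price levels, since two trending series can look correlated at every lag.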

Data adequacy also involves the quantity of the data used. Enough data must be used so that the results are statistically relevant. For example, the results derived from flipping a coin 10 times give us little information, and cannot be relied on, since any pattern can show up with such a small sample. However, if we flip a coin 100 times, it is likely that a close-to-even distribution of heads and tails will develop, assuming an evenly balanced coin.
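The coin example can be quantified with the binomial distribution. The sketch below computes the chance that a fair coin lands heads between 40% and 60% of the time; the 40% to 60% band is an arbitrary choice made here to stand for "close to even", and the function name is hypothetical.

```python
from math import comb


def prob_heads_between(flips, low, high):
    """Probability that a fair coin shows between `low` and `high` heads (inclusive)."""
    favorable = sum(comb(flips, heads) for heads in range(low, high + 1))
    return favorable / 2 ** flips


print(prob_heads_between(10, 4, 6))     # 40% to 60% heads in 10 flips
print(prob_heads_between(100, 40, 60))  # 40% to 60% heads in 100 flips
```

With 10 flips the result falls inside the band only about two times in three, while with 100 flips it does so more than 95% of the time.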

If not enough data has been utilized to determine the coefficients of the polynomial equation, it is said that the model is underconstrained. In this case, there may be many combinations of variables that define the model, some of which may be accurate, and some of which are grossly inaccurate.
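A minimal illustration of an underconstrained model, using hypothetical points and function names: a parabola has three coefficients, so two data points cannot pin it down. Any value of c can be chosen, and a and b can still be solved so that the curve passes through both points.

```python
def parabola_through(p0, p1, c):
    """Coefficients (a, b, c) of Y = a + bX + cX^2 through two points, for a chosen c."""
    (x0, y0), (x1, y1) = p0, p1
    b = ((y1 - c * x1 ** 2) - (y0 - c * x0 ** 2)) / (x1 - x0)
    a = y0 - b * x0 - c * x0 ** 2
    return a, b, c


def evaluate(coeffs, x):
    """Value of the parabola with the given coefficients at x."""
    a, b, c = coeffs
    return a + b * x + c * x ** 2


points = [(0.0, 1.0), (1.0, 3.0)]  # hypothetical: only two data points
model_1 = parabola_through(points[0], points[1], c=0.0)
model_2 = parabola_through(points[0], points[1], c=5.0)

for model in (model_1, model_2):
    fitted = [evaluate(model, x) for x, _ in points]
    print("coefficients:", model, "| fit:", fitted, "| at X=2:", evaluate(model, 2.0))
```

Both sets of coefficients reproduce the two data points exactly, yet they give 5.0 and 15.0 at X=2, so the data offer no basis for choosing between them.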

In practice, this should not present a problem to traders, since generally a large amount of data is available, and they should use all that is available. It may, however, become a problem if traders try to develop their model on a recently issued stock or a new commodity contract.

The amount of data used should also cover enough time to be representative of the market we are trying to model. For example, if the data used does not encompass a number of the short-term or seasonal cycles, or business cycles, it cannot be expected to handle unseen data.

System complexity and data adequacy must be kept in mind when developing any trading system. In developing a system as an example, we will take these factors into consideration at each step.

Developing a Simple System Based on a Premise

We start to design a trading system based on our perceptions of the market. We have shown that a straight line may create a usable model for the data. We have also shown that a change of trend of the T-bond market can be used to signal a trading opportunity in the S&P. If we transform the T-bond prices to mirror those reflected by the trend of the S&P, we can use that to directly predict a change in S&P prices.

The transformation of a T-bond price to an equivalent S&P price is akin to transforming degrees Centigrade to degrees Fahrenheit. The former scale goes from a


