
output), it will need more data to test these results. Of the remaining 30% of the data, 20% should be designated for testing. The success or failure of the method to find a solution is based on the performance of the test data. The remaining 10% is saved for out-of-sample validation.
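The split described above can be sketched in a few lines. This is only an illustration: it assumes the first 70% of the data is the training set (implied by "the remaining 30%") and that the data is split in chronological order, which the text does not state. The function name `split_series` and the placeholder data are hypothetical.

```python
def split_series(data, train_pct=70, test_pct=20):
    """Split a series into training, testing, and out-of-sample parts.

    Assumption: the split is chronological, with the oldest data used for
    training. Whatever is left after training and testing (10% by default)
    is held back for out-of-sample validation.
    """
    n = len(data)
    train_end = n * train_pct // 100
    test_end = train_end + n * test_pct // 100
    return data[:train_end], data[train_end:test_end], data[test_end:]


prices = list(range(1000))  # placeholder data, not real prices
training, testing, out_of_sample = split_series(prices)
print(len(training), len(testing), len(out_of_sample))
```

Integer percentages are used so that the split points do not depend on floating-point rounding.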

Weighting factors are found using a method called a genetic algorithm, discussed in the next section. For now, we need to know that the training process begins with an arbitrary or random value assigned to the weighting factors of each input. As the training proceeds, these weighting factors are randomly mutated, or changed, until the best combination is found. The genetic algorithm changes and combines weighting factors in a manner referred to as "survival of the fittest," giving preference to the best and discarding the worst.

12 Murray A. Ruggiero, Jr., "Build a real neural net," Futures (June 1991), the first in a series of excellent articles on this subject.

Testing is completed when the results of the test data converge to a single value. That is, after a number of feedback loops, the test data is used with the new weighting factors; if the results are improving, the process continues. If the results are improving at a very slow rate, or have become stable at one level, the neural network process is considered completed. Sometimes the results get worse, rather than better. This can be fixed by beginning again using new random weighting factors. If this doesn't work, then the inputs must be reevaluated for relevance.
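A minimal sketch of this stopping rule follows. The text gives no numeric definition of "a very slow rate," so the `min_gain` and `patience` parameters, the function name `training_status`, and the assumption that a higher test score is better are all hypothetical.

```python
def training_status(scores, min_gain=0.001, patience=3):
    """Classify training progress from a history of test-data scores.

    Assumptions (not from the text): higher scores are better, and progress
    is judged by the change over the last `patience` feedback loops.
    Returns 'continue', 'converged', or 'restart'.
    """
    if len(scores) <= patience:
        return "continue"            # too little history to judge
    recent_gain = scores[-1] - scores[-1 - patience]
    if recent_gain < 0:
        return "restart"             # results got worse: reseed the weights
    if recent_gain < min_gain:
        return "converged"           # stable, or improving very slowly
    return "continue"


print(training_status([0.10, 0.20, 0.30, 0.40, 0.50]))
print(training_status([0.50, 0.60, 0.60, 0.60, 0.60]))
print(training_status([0.50, 0.60, 0.55, 0.50, 0.45]))
```

If repeated restarts keep returning "restart," the text's advice applies: the inputs themselves should be reevaluated for relevance.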

Because the genetic algorithm is a trial-and-error process, rather than an analytic approach, the best results could be found by coincidence, rather than by cause and effect. With enough data series it is always possible that two series of events will appear related, although they are not. It is necessary to review the results to avoid simple mistakes.

A Training Example

We would like to train a neural net to tell us whether we should buy or sell stocks. As inputs, we select what we believe to be the five most relevant fundamental factors: GNP, unemployment, inventories, U.S. dollar index, and short-term interest rates. This test does not use any preprocessed data, such as trends or indicators. To simplify the process, the following approach is taken:

1. Each input is normalized so that it has values between +100 and -100, indicating strength to weakness, with 0 as neutral.

2. When the combined values of the five indicators exceed +125, we will enter a long position; when the combined value is below -125, we will enter a short.

3. Combined values between +125 and -125 are considered neutral.
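The three rules above amount to a simple threshold test on the sum of the normalized inputs. The sketch below is illustrative only; the function name `trading_signal` and the sample input values are hypothetical, and it takes "combined value" to mean the plain sum of the five inputs.

```python
THRESHOLD = 125  # entry threshold from rules 2 and 3


def trading_signal(values):
    """Return 'long', 'short', or 'neutral' from the normalized inputs."""
    if any(not -100 <= v <= 100 for v in values):
        raise ValueError("inputs must be normalized to the range -100..+100")
    combined = sum(values)
    if combined > THRESHOLD:
        return "long"
    if combined < -THRESHOLD:
        return "short"
    return "neutral"


# Hypothetical readings for GNP, unemployment, inventories,
# U.S. dollar index, and short-term interest rates.
print(trading_signal([50, 40, 30, 20, 10]))
print(trading_signal([-100, -50, 0, 10, 0]))
print(trading_signal([25, 25, 25, 25, 25]))
```

A combined value of exactly +125 or -125 is treated as neutral here, following the wording "exceed +125" and "below -125."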

To show how the training process works, two events are shown in Tables 20-4 and 20-5 as Case 1 and Case 2. Table 20-4 is the initial state of the neural network, where we have chosen to set all of the weighting factors to 1.0. In actual training, the network might require that the sum of all weights total 1.0. The actual values of the normalized inputs are shown in the columns marked "relative value," and the correct historic answer is at the bottom, marked as "strong" and "weak" stock market reactions to these values. For the neural network to return the correct answers, it must produce a result greater than +125 for Case 1 and below -125 for Case 2. By assigning initial weights of 1.0 to all inputs, the value of Case 1 is +75 and the value of Case 2 is +40; both fail, and testing continues, searching for better weighting factors.

TABLE 20-4 Two Training Cases (Initial State)

Input                    Case 1        Case 2
GNP                      Strong        Weak
Unemployment             High          Neutral
U.S. dollar              Very strong   Neutral
Interest rates           Falling       Rising
Total value              +75           +40
Threshold                +/-125        +/-125
Current response         None          None
Actual market reaction   Strong        Weak

Source: Perry Kaufman, Smarter Trading (McGraw-Hill, 1995).



TABLE 20-5 Two Training Cases (After Mutated Weighting Factors)

Input                    Weight   Case 1        Case 2
GNP                               Strong        Weak
Unemployment                      High          Neutral
U.S. dollar                       Very strong   Neutral
Interest rates           -1.0     Falling       Rising
Total value                                     -130
Threshold                         +/-125        +/-125
Current response                  None          None
Actual market reaction            Strong        Weak

In Table 20-5, the weighting factors have undergone mutation using a genetic algorithm. This example attempts to use only weighting factors of +1.0, 0, and -1.0. By reversing the effect of unemployment and interest rates on the direction of the stock market, and by random selection of weighting factors, the results now match the historic pattern of stock movement. Using fractional values for weighting factors, and many more training cases, the ANN method should find the current underlying relationship between these inputs and stock market movement.
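The effect of changing the weighting factors can be illustrated with a small sketch. Two things here are assumptions and not taken from the tables: the normalized input values are hypothetical, chosen only so that weights of 1.0 reproduce the totals of +75 and +40 quoted in the text, and the search is a plain enumeration of all 243 combinations of -1, 0, and +1 rather than a genetic algorithm.

```python
from itertools import product

INPUTS = ["GNP", "unemployment", "inventories", "U.S. dollar", "interest rates"]

# HYPOTHETICAL normalized values (not the book's table values).
CASE_1 = [50, -50, 0, 100, -25]   # historic reaction: strong (needs > +125)
CASE_2 = [-50, 0, 10, 0, 80]      # historic reaction: weak (needs < -125)
THRESHOLD = 125


def total(weights, values):
    """Weighted sum of the normalized inputs."""
    return sum(w * v for w, v in zip(weights, values))


def matches_history(weights):
    """True if both cases produce the historically correct signal."""
    return (total(weights, CASE_1) > THRESHOLD
            and total(weights, CASE_2) < -THRESHOLD)


initial = (1, 1, 1, 1, 1)
print(total(initial, CASE_1), total(initial, CASE_2), matches_history(initial))
# prints: 75 40 False

# Brute-force substitute for the genetic search: try every combination.
solutions = [w for w in product((-1, 0, 1), repeat=len(INPUTS))
             if matches_history(w)]
for w in solutions:
    print(dict(zip(INPUTS, w)), total(w, CASE_1), total(w, CASE_2))
```

With these made-up values, every combination that matches both cases has a weight of -1 on interest rates, in line with the reversal described above. Enumeration is practical only because there are five inputs and three allowed weights; with fractional weights and many training cases a guided search becomes necessary.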

Reducing the Number of Decision Levels and Neurons

The robustness of a neural network solution is directly related to the number of decision levels, the number of neurons in each level, and the total number of inputs. Fewer elements produce more generalized and, therefore, more robust solutions. When there are many decision layers and many neurons, then the inputs can be combined and recombined in many different ways, allowing very specific patterns to be found. The more specific, the greater the chance that the final solution will be overfitted, or that the neural network program will fail because it cannot converge on a satisfactory result.

The trade-offs using a neural network are the same as most other optimization methods. Too many inputs and combinations require lengthy testing and a greater chance of a solution that is overfit. Too few values can produce a result that is too general and not practical. It is best to begin with the most general and proceed in clear steps toward a more specific solution. In this way you will understand the process better, ultimately save time, and be able to stop when you have reached the practical limitations of this method.

GENETIC ALGORITHMS

The concept of a genetic algorithm is based on Darwin's theory of survival of the fittest. In the real world, a mutation with traits that improve its ability to survive will continue to procreate. This has been applied to the process of finding the best system parameters. Although a genetic algorithm¹³ is actually a sophisticated search method that replaces the standard optimization, it uses a technique that parallels the survival of the fittest. It is particularly valuable when the number of variables is so large that a test of all combinations is impractical. Instead of the typical sequential search, it is a process of random seeding, selection, and combination to find the best set of trading rules. Standard statistical criteria are used in the selection process to qualify the results.

13 This technique is attributed to John H. Holland, Adaptation in Natural and Artificial Systems (University of Michigan Press, 1975).

Representation of a Genetic Algorithm

Using the words common to this methodology, the most basic component of a genetic algorithm is a gene; a number of genes will comprise an individual, and a combination of individuals (and therefore genes) is a chromosome. Each chromosome represents a potential solution, a set of trading rules in which the genes are the specific values and calculation expressions, and these form individuals that represent rules and ultimately form a trading strategy. For example, Chromosome 1 might be a rule to buy on strength:

If a 10-day moving average is less than yesterday's close and a 5-day stochastic is greater than 50, then buy

Chromosome 2 could be a rule that buys on weakness:

If a 20-day exponential is less than yesterday's low and a 10-day RSI is less than 50, then buy

If we rewrite these two chromosomes in a notational form, the genes and individuals in its structure become more apparent:

Chromosome 1: MA, 10, <, C, [0], &, Stoch, 5, >, 50, 1

Chromosome 2: Exp, 20, <, L, [1], &, RSI, 10, <, 50, 1
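The notational form maps naturally onto a list with one entry per gene. The sketch below is one possible representation, not the author's; the function name `describe` and the fixed gene positions (five trend genes, one joining gene, four indicator genes, one action gene) are assumptions read off the two examples.

```python
CHROMOSOME_1 = ["MA", 10, "<", "C", "[0]", "&", "Stoch", 5, ">", 50, 1]
CHROMOSOME_2 = ["Exp", 20, "<", "L", "[1]", "&", "RSI", 10, "<", 50, 1]


def describe(chromosome):
    """Break a chromosome into its two individuals, joiner, and action.

    Assumes the fixed 11-gene layout of the two examples above.
    """
    if len(chromosome) != 11:
        raise ValueError("expected 11 genes")
    return {
        "trend": chromosome[0:5],       # first individual
        "join": chromosome[5],          # how the individuals are combined
        "indicator": chromosome[6:10],  # second individual
        "action": "buy" if chromosome[10] == 1 else "sell",
    }


for name, chrom in (("Chromosome 1", CHROMOSOME_1),
                    ("Chromosome 2", CHROMOSOME_2)):
    print(name, describe(chrom))
```

Keeping every gene at a fixed position makes it straightforward to swap a single value later without reparsing the rule.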

Each of these chromosomes has 11 genes, any one of which can be changed. In addition, each chromosome has two individuals, separated by an "&" operator. In Table 20-6, the description of the genes indicates other values that can replace the current ones. Table 20-6 is a way to represent the chromosomes and individuals in a general form. It is easy to see that each gene can be changed, and each change will represent a new trading rule. A combination of trading rules, or chromosomes, will create a trading strategy. Before continuing, the following steps will be needed to use the genetic algorithm to find the best results:

1. A clear way of representing the chromosomes, or individuals

2. A criterion to decide that one chromosome is better than another

3. A selection procedure that determines which chromosomes will survive, and in what manner

4. A process for mutation (introducing new characteristics) and crossover (combining genes) to procreate chromosomes with greater potential
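Step 4 can be sketched as two small functions operating on an 11-gene list. This is an illustration under stated assumptions: the lists of allowed values in `GENE_CHOICES` are hypothetical examples loosely based on the gene descriptions, and single-point crossover is only one of several ways to combine genes. The names `mutate` and `crossover` are not from the text.

```python
import random

CHROMOSOME_1 = ["MA", 10, "<", "C", "[0]", "&", "Stoch", 5, ">", 50, 1]
CHROMOSOME_2 = ["Exp", 20, "<", "L", "[1]", "&", "RSI", 10, "<", 50, 1]

# HYPOTHETICAL allowed values for each gene position (illustrative only).
GENE_CHOICES = [
    ["MA", "Exp", "LinReg", "Breakout"],  # trend calculation
    [5, 10, 20, 40],                      # trend calculation period
    ["<", "<=", ">"],                     # relational operator
    ["C", "H", "L"],                      # price used
    ["[0]", "[1]"],                       # reference day
    ["&", "or"],                          # method of joining individuals
    ["Stoch", "RSI"],                     # indicator calculation
    [5, 10, 14],                          # indicator period
    ["<", "<=", ">"],                     # relational operator
    [30, 50, 70],                         # comparison value
    [1, -1],                              # market action (1 = buy, -1 = sell)
]


def mutate(chromosome, rng):
    """Return a copy with one randomly chosen gene set to a new value."""
    child = list(chromosome)
    i = rng.randrange(len(child))
    options = [v for v in GENE_CHOICES[i] if v != child[i]]
    child[i] = rng.choice(options)
    return child


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of parent_a joined to tail of parent_b."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


rng = random.Random(0)  # seeded so the run is repeatable
print(mutate(CHROMOSOME_1, rng))
print(crossover(CHROMOSOME_1, CHROMOSOME_2, rng))
```

Both functions return new lists and leave the parents unchanged, so the surviving chromosomes can be reused in later generations.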

TABLE 20-6 Functional Description of the Genes in Chromosomes 1 and 2

Gene   Chromosome 1   Chromosome 2   Description
1      MA             Exp            A trend calculation (moving average, linear regression, breakout)
2      10             20             The calculation period for the trend calculation
3      <              <              Relational operator (<, <=, >)
4      C              L              Price used in trend calculation ((H + L + C)/3, indexed value)
5      [0]            [1]            Reference data ([0] = current day, [1] = previous day)
6      &              &              Method of combining individuals (and, or)
7      Stoch          RSI            Indicator calculation
8      5              10             Indicator calculation period
9      >              <              Relational operator
10     50             50             Comparison value for relational operator
11     1              1              Market action (1 = Buy, -1 = Sell)

Fitness

Having represented the chromosome in Table 20-6, we next must define a fitness criterion, which ranks the results. Because fitness will lead to survival, it is very important to decide which chromosomes should be discarded and which should be used further. A fitness criterion must combine the most important features associated with a successful trading strategy:

1. Net profits, or profits per trade

2. The number of trades, or an error criterion

3. The smoothness of the results

Ideally, we prefer systems that have large profits, lots of trades, and very consistent performance. To measure that result, the following might be used:


