
operations used to combine them. Contrary to GP, the gene does not offer flexibility in the algorithm itself but only in its parameters. The fitness function will be based on the return obtained from the recommendations of a given trading model.

Sharing Scheme For Multi-Modal Functions A major problem in trading model optimization is to obtain models that are robust against market changes and random noise in data collection. In such optimization problems, sharp peaks of high fitness are usually not representative of a general solution, but rather they indicate accidental fluctuations. Such fluctuations may arise out of inherent noise in the time series or due to threshold effects in the trading model performance. Peaks in such a discontinuous, noisy and multimodal fitness space generally correspond to trading models that will not perform well in out-of-sample tests.

In the context of genetic algorithms, the optimization of multimodal functions has been investigated using methods inspired by the natural notions of niche and species, as noted by Goldberg and Richardson (1987), Deb and Goldberg (1989), and Yin and Germay (1993). The general goal is to create and maintain several subpopulations, ideally one per major peak of the fitness function, instead of having the whole population converge to one global optimum.

One of the best methods was proposed by Goldberg and Richardson (1987). The idea is that the GA's perception of the fitness function is changed in such a way that, when individuals tend to concentrate around a high peak, their fitness is reduced by a factor proportional to the number of individuals in the region. This has the effect of diminishing the attractiveness of the peak and allowing parts of the population to concentrate on other regions. The effective fitness of an individual i, called the shared fitness sf(i), is given by

\[ sf(i) = \frac{f(i)}{m(i)} \]

where f(i) is the original fitness and m(i) is called the niche count. For an individual i, the quantity m(i) is calculated by summing up the sharing function values contributed by all N individuals of the population,

\[ m(i) = \sum_{j=1}^{N} sh(d_{ij}) \qquad (11.25) \]

where d_ij is the distance between two individuals i and j, and

\[ sh(d_{ij}) = \begin{cases} 1 - \left( d_{ij} / \sigma_s \right)^{\alpha} & \text{if } d_{ij} < \sigma_s \\ 0 & \text{otherwise} \end{cases} \qquad (11.26) \]

The quantities α and σ_s are constants. A difficulty of this method is in choosing the value of σ_s, which requires prior knowledge about the number of peaks in the solution space. In our economic application, as in many realistic problems, this information is not readily available.
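To make the classical scheme concrete, the short sketch below computes the shared fitness of a population using Equations 11.25 and 11.26; the toy population, the toy fitness function, and the values σ_s = 0.1 and α = 1 are illustrative choices, not taken from the text.

```python
import numpy as np

def sharing_function(d, sigma_s=0.1, alpha=1.0):
    """Sharing function sh(d) of Equation 11.26: 1 - (d/sigma_s)^alpha
    for d < sigma_s, and 0 otherwise."""
    return np.where(d < sigma_s, 1.0 - (d / sigma_s) ** alpha, 0.0)

def shared_fitness(population, fitness, sigma_s=0.1, alpha=1.0):
    """Shared fitness sf(i) = f(i) / m(i), where the niche count m(i)
    (Equation 11.25) sums the sharing function over all N individuals."""
    diff = population[:, None, :] - population[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))          # pairwise distances d_ij
    m = sharing_function(d, sigma_s, alpha).sum(axis=1)
    return fitness / m

# Illustrative usage: a toy population of 200 individuals with 2 parameters
rng = np.random.default_rng(0)
pop = rng.random((200, 2))
fit = np.exp(-20.0 * ((pop - 0.5) ** 2).sum(axis=1))   # toy single-peak fitness
sf = shared_fitness(pop, fit)
```

Individuals crowded around the peak obtain large niche counts and therefore reduced shared fitness, which is what keeps part of the population exploring other regions.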

A method based on a different sharing scheme and an adaptive clustering methodology is proposed in Yin and Germay (1993).



The authors show that this method is effective in revealing unknown multimodal function structures and is able to maintain subpopulation diversity. It establishes an analogy between clusters and niches in the following way. The GA population is divided by the adaptive MacQueen's KMEAN clustering algorithm into clusters of individuals that correspond to niches. The shared fitness calculation is the same as in the classical sharing method, but the niche count m(i) is no longer associated with σ_s. In this case, the number of individuals within the cluster to which the individual i belongs plays a central role in the niche count calculation. As the number of clusters is associated with the number of niches (peaks), the individuals are put into a single partition of clusters, where the number of clusters is not fixed a priori but is determined by the algorithm itself. Therefore, unlike in the classical sharing method, no a priori knowledge about the number of peaks of the fitness function is required. The niche count m(i) is computed as

\[ m(i) = N_c - N_c \left( \frac{d_{ic}}{2 D_{\max}} \right)^{\alpha} \qquad (11.27) \]

where N_c is the number of individuals in the cluster c, α is a constant, and d_ic is the distance between the individual i and the centroid of its niche. The algorithm requires a distance metric in order to compute the distance between two clusters and the distance between one individual and one cluster. Two clusters are merged if the distance between their centroids is smaller than a threshold parameter D_min. Moreover, when an individual is farther away than a maximum distance D_max from all existing cluster centroids, a new cluster is formed with this individual as a member. The efficiency of the algorithm is improved by sorting the population in descending order of the individuals' fitness before the clustering is applied.
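The following sketch illustrates the clustering step as just described: individuals, visited in order of decreasing fitness, either join the nearest cluster within D_max or seed a new one, and clusters whose centroids are closer than D_min are merged afterwards. This is a simplified single-pass variant written for illustration, not the exact adaptive MacQueen procedure of Yin and Germay (1993); the function name and default constants are assumptions.

```python
import numpy as np

def adaptive_clusters(population, fitness, d_min=0.05, d_max=0.15):
    """Simplified clustering in the spirit of the adaptive KMEAN niching step:
    best individuals first, nearest-centroid assignment within d_max,
    new clusters beyond d_max, merging of centroids closer than d_min."""
    order = np.argsort(fitness)[::-1]                # sort by descending fitness
    centroids, members = [], []
    for i in order:
        x = population[i]
        if centroids:
            dists = [np.linalg.norm(x - c) for c in centroids]
            k = int(np.argmin(dists))
        if not centroids or dists[k] > d_max:
            centroids.append(x.astype(float))        # seed a new cluster
            members.append([int(i)])
        else:
            members[k].append(int(i))                # join the nearest cluster
            n = len(members[k])
            centroids[k] += (x - centroids[k]) / n   # running-mean centroid update
    # merge clusters whose centroids are closer than d_min
    merged = True
    while merged:
        merged = False
        for a in range(len(centroids)):
            for b in range(a + 1, len(centroids)):
                if np.linalg.norm(centroids[a] - centroids[b]) < d_min:
                    members[a].extend(members[b])
                    centroids[a] = population[members[a]].mean(axis=0)
                    del centroids[b], members[b]
                    merged = True
                    break
            if merged:
                break
    return centroids, members
```

Each member list then directly provides the cluster size N_c and the distances d_ic needed for the niche count calculation.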

Such a genetic algorithm with sharing and clustering has been applied to standard multimodal and continuous fitness functions by Yin and Germay (1993) and Chopard et al. (1995) with promising results. One example of a more complex application was the determination of the optimum parameters of the business time scale,9 which is used for analyzing price history and computing indicators. In this example, the optimization is quite difficult because we have to optimize 17 parameters simultaneously and the function is nonlinear in some of its parameters. To solve such a problem, it was necessary to add a normalization of the parameter space to the genetic algorithm: each parameter is only allowed to vary in the range [0, 1]. In simple problems, the two clustering parameters are generally set to D_min = 0.05 and D_max = 0.15. But here, because of the large dimensionality of the parameter space, the values of the clustering parameters D_min and D_max must be much larger; in this case, the two parameters are multiplied by √n, where n is the number of parameters to be optimized. The results obtained with this genetic algorithm are very promising, and the sharing and clustering approach clearly increases the speed of convergence compared to the simple genetic algorithm described in the previous section.

9 This is a time scale that contracts and expands time based on seasonal activity or volatility of the time series (see Chapter 6).
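As a small illustration of the normalization described above and of scaling the clustering thresholds with the dimensionality, the snippet below maps each parameter into [0, 1] and multiplies D_min and D_max by √n; the parameter bounds shown are invented for the example, not the actual trading model parameters.

```python
import numpy as np

# Hypothetical bounds for three trading model parameters (illustrative values only)
lower = np.array([1.0, 0.1, 5.0])
upper = np.array([50.0, 2.0, 200.0])

def normalize(params):
    """Map raw parameters into [0, 1] so that all genes vary on a common scale."""
    return (params - lower) / (upper - lower)

def denormalize(genes):
    """Map normalized genes back to the raw parameter space of the trading model."""
    return lower + genes * (upper - lower)

# Clustering thresholds are scaled with the square root of the dimension n
n = len(lower)
d_min = 0.05 * np.sqrt(n)
d_max = 0.15 * np.sqrt(n)
```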





When applied to the indicator optimization problem, the genetic algorithm with sharing and clustering runs into difficulties. If the fitness landscape contains too many sharp peaks of high fitness, all the selected clusters concentrate around these peaks and the genetic algorithm is unable to find robust solutions. In the next section, we propose some modifications to the genetic algorithm to detect clusters in the parameter space that correspond to more general and robust solutions.

Modified Sharing Function for Robust Optimizations We need to find a new genetic algorithm that avoids the concentration of many individuals around sharp peaks of high fitness and instead detects broad regions of the parameter space containing a group of individuals with a high average fitness level and a small variance of the individual fitness values.

To solve this problem, we propose a new sharing function that penalizes clusters with a large variance of the individual fitness values and also penalizes clusters with too many solutions concentrated inside too small a region. The distance metric considered here is the Euclidean distance computed in the real parameter space (phenotypic sharing). In the proposed sharing scheme, all the individuals that belong to a given cluster share the same fitness value, given by Equation 11.28.

In Equation 11.28, N_c is the number of genes in the cluster c and f̄_c is the average fitness value of the cluster; the standard deviation of the individual fitness values, σ(f_c), is defined by

\[ \sigma(f_c) = \sqrt{ \frac{1}{N_c} \sum_{i \in c} \left( f(i) - \bar{f}_c \right)^2 } \qquad (11.29) \]

As the method is based on the distribution of gene fitness inside each cluster, we keep only the clusters that contain at least a minimum number of members; we use a minimum cluster size of two individuals. As we also need to keep enough clusters of reasonable size, we have to limit the size of the largest clusters. The term N_c/N_av in Equation 11.28 is used to control the number of genes inside each cluster: if N_c is smaller than the expected average number of genes per cluster, N_av, the correction is reduced; otherwise it is increased. Here, the constant N_av is chosen as the population size divided by the (preconfigured) expected number of clusters.

The second term, (1 - r_d)/r_d, in Equation 11.28 is used to penalize clusters with too high a concentration of genes around their centroid; the value r_d is defined by Equation 11.30.

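As a rough sketch of these ingredients, the function below computes, for one cluster, the average fitness f̄_c, the spread σ(f_c), the size ratio N_c/N_av, and a concentration ratio r_d, and combines them into a single penalized fitness shared by all cluster members. Because the exact expressions of Equations 11.28 and 11.30 are not reproduced above, both the definition of r_d used here (mean member-to-centroid distance divided by D_max) and the way the terms are combined are assumptions for illustration only.

```python
import numpy as np

def cluster_shared_fitness(population, fitness, centroid, member_idx,
                           n_av, d_max, min_size=2):
    """Illustrative shared fitness for one cluster of a niching GA.
    NOTE: the combination rule and the definition of r_d below are assumptions;
    the exact formulas (Equations 11.28 and 11.30) are not reproduced here."""
    idx = np.asarray(member_idx)
    if idx.size < min_size:
        return None                              # drop clusters below the minimum size
    f_c = fitness[idx].mean()                    # average fitness of the cluster
    sigma_c = fitness[idx].std()                 # spread of fitness values, Equation 11.29
    size_term = idx.size / n_av                  # N_c / N_av: larger clusters get corrected more
    d_ic = np.linalg.norm(population[idx] - centroid, axis=1)
    r_d = max(d_ic.mean() / d_max, 1e-6)         # assumed concentration ratio (stand-in for Eq. 11.30)
    conc_term = (1.0 - r_d) / r_d                # large when genes crowd around the centroid
    # Assumed combination: penalize the average fitness by the fitness spread,
    # amplified for oversized and overly concentrated clusters
    return f_c - sigma_c * size_term * conc_term
```

Under these assumptions, broad clusters of consistently fit genes keep a shared fitness close to f̄_c, whereas tight clusters with widely varying fitness are penalized, which is the behavior the modified sharing scheme aims for.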


