Go to The library page for information on where to get the library and what's included in the package.
Introduction
Differential Evolution (DE) is a population based stochastic function optimizer algorithm developed by Kenneth Price and Rainer Storn in the 1990s. For detailed information on DE consult any of the numerous online or in print resources listed at the DE homepage
http://www.icsi.berkeley.edu/~storn/code.html.
To quote from the Wikipedia page dedicated to DE (
http://en.wikipedia.org/wiki/Differential_evolution):
In computer science, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics such as DE do not guarantee an optimal solution is ever found.
DE is used for multidimensional real-valued functions but does not use the gradient of the problem being optimized, which means DE does not require for the optimization problem to be differentiable as is required by classic optimization methods such as gradient descent and quasi-newton methods. DE can therefore also be used on optimization problems that are not even continuous, are noisy, change over time, etc.
Real use cases
Algorithmic trading systems optimization
This library was initially developed to be used for trading system optimization, and this section shows why.
An algoritmic trading system (TS) can be defined as a set of rules that takes market data as input and generates trading signals as output. These rules are often based on various statistical functions that are applied to stock prices, such as moving averages, standard deviation, which in turn depend on variables like the moving average period etc.
Some believe that a TS may perform better if they are optimized to adapt to current market conditions. To optimize a TS, it will be repeatedly applied to past data using different sets of variable values until the best hypothetical result is obtained for that period. The best parameters then are applied in real time.
Due to the large variables space (total number of distinct variable value sets), exhaustive optimization (EO), which traverses all these sets is usually impractical. For example 4 variables taking 10 distinct values each would require 10000 trading system runs. Other optimization methods may also be hard or impossible to apply due to the nature of the problem and its data.
DE however works well with this type of problem, and could be applied to reduce the explored variable space by orders of magnitude. For example, a DE optimization session could be set for a population of 10 individuals and a maximum number of generations of 20 as termination criteria (please refer to the terminology section for information on these terms). This would require at most 200 system runs, and often less, if other termination criteria is used such as a performance target x% better than a benchmark (S&P500 or other index). One may doubt the fact that DE offers an advantage over EO as the same termination criteria could be used when running exhaustive optimization, which would also limit the number of runs. But the DE algorithm is designed to converge much faster toward an optimium than a method that simply traverses the variables space.
Please note that it is not proven whether this method actually helps improve a trading system performance, the main point being howerver that DE can be used to obtain a good enough value in a reasonable number of tries.
Other uses
A quick Google search will yield many practical application of DE. Here are a few references to such articles just to illustrate the variaty of domains DE is applicable to:
Target audience
DE can be useful to many involved in scientific or engineering areas such as: chemistry, electronics, physics, agriculture, telecommunications, software engineering etc.
Using the library doesn't require prior understanding of its or DE internals. Developers with average C++ skills should be able to use it "out of the box". The library provides default behavior or a selection of several pre-defined policy classes, so that those who are not interested in the DE internals will only have to plug-in their objective function, select the DE runtime parameters, build the code and run it.
Those who need more advanced customization of the DE algorithm or are interested in researching DE itself, can create their own implementations of the various classes that control the algorithm behavior.
Rationale
This component is intended to address the need for a standardized generic, portable and efficient C++ DE library that can be easily customized for a variety of domain specific problems.
Features of this implementation
- Generic - doesn't make assumptions about the OS, environment it's running in or the problem to optimize
- Written in 100% portable C++, with boost and stl as only dependencies
- Header only project
- Modular
- Robust and memory efficient due to extensive use of shared pointers
- Performant thanks to platform independent multi-processing
- Adjustable and extensible through classes that control its behavior
- Easy to integrate with existing objective functions
- Supports several types of variable constraints useful in real life applications
Terminology
Objective Function - the function to optimize.
Cost - numeric output of the objective function for a given set of input variables.
Constraint - defines the domain of acceptable values for a variable. For example variables can be of real, integer type with values limited to a specific range. Other types include set, which can only take one of the several values in a set, or Boolean type, which can be true or false (1 or 0).
Variable - a value limited by a constraint
Individual - member of a population containing a collection of variables (implemented as a vector), each defined by its own constraints, dictated by the nature of the objective function. These values are passed to the objective function during the optimization process in order to calculate the function cost, which is also stored by the individual. Note: DE requires a larger number of variables than used by the objective function. For example if we are optimizing a function in two variables f(x,y), DE may require a vector of 20 total variables for its internal processing.
Population - a collection of individuals (implemented as a vector). During the optimization process, the population changes according to mutation algorithms to attempt to generate the optimium individual.
Mutation - the process of modifying the variable values according to one of the several mutation algorithms. The mutation algorithms can be tuned with the help of two parameters: weight and crossover. Consult the detailed DE documentation online for more information on these parameters.
Selection - the process of selecting the individuals for the next generation, of the best individual for the current generation, and of the best individual for the entire process.
Processor - one objective function processing unit. Each processor runs in its own thread on the same computer, but the processor class can be modified to distribute work across multiple computers. Any number of processors can be created and used during an optimization for parallel processing.
Generation - one iteration during the optimization process, which transforms one population in a new one, by mutating some of its individuals.
Termination - the logic used to determine when the optimization has completed.
The DE Algorithm
The algorithm has been modified somewhat in this implementation to allow for multi-processing and use of policy classes, and this its description.
- DE starts by creating an initial population, with individuals set to random variable values within the limits defined by their constraints, and then calculates the cost (the objective function value) for those variables for each individual.
- DE then proceeds to iterate through the current population and mutate variables for each individual according to the mutation policy. All the mutated individuals are then pushed into the processing queue.
- The processors now start evaluating the objective function for variables associated with individuals in the processing queue. The result (cost value) is set into each corresponding individual. This continues until the queue is empty. All these individuals are now candidates for the next generation.
- DE now applies the selection policy to choose the individuals that will make it into the next generation, and also of the best individual for the current iteration and for the entire optimization process. The new generation becomes the current generation for the next iteration. There are currently two selection policies for the new generation:
- Sort the combined populations and select the first N individuals (where N is the population size) for the new population.
- Compare individuals with the same index from the old and candidate generation and pick the best of each for the new generation.
Practical tests have shown that 1 allows the optimization to converge to an optimum value faster than 2.
- DE applies the termination policy to determine if it should stop the process. The termination policy can be as simple as comparing the generation number with a constant representing the maximum allowed number of generations. Or it can be more complex, for example by analyzing the trend of best values for each generation and stopping when the best value change is at or below a predefined threshold.
- If the termination criterion has been met, the process stops and the best value returned to the user. Otherwise, the process restarts from 2. with the new generation as the current generation.
Note that the the DE efficiency and even ability to converge to an optimum value depend (heavily for some objective functions) on its parameters: number of variables (total, not just the ones required by the objective function), population size, weight and crossover factors, as well as the different policies (mutation, selection, termination). Consult the online resources on this topic to get the most out of DE, and experiment with your own values as well.