Futures trading systems and commodity trading bear a high degree of risk. People can and do lose money. Hypothetical results have many inherent limitations. Past performance does not guarantee future results. Please read the disclosures & disclaimers page.
TradingVisions Systems, Inc. Spokane WA 99217-7737 800-878-1983 : 509-466-8435
One way to achieve systems that are robust and give a reasonable expectation of future success is to test all available data for a market, adopt the "best" rules and parameters, and trade the system in real-time. We can call this approach look-backward optimization.
An improved approach is to not just look at all available data for one market, but to also look at additional markets, and to adopt the rules and parameters that best fit all or most of them. This approach--what I'll call look-beside optimization--is one I have used since I began developing systems over a dozen years ago.
I have been a firm believer that over-curve-fitting is the bane of system development, and to avoid it, I have been very reluctant to re-optimize systems once released. It is just too easy to find something that works on a single market for a few years in backtesting and re-jigger it every time results are poor.
My approach has worked quite well, but for the last several months, I have been intensively testing a way of judiciously using re-optimization that is proving to be preferable. This involves a proprietary approach to walk-forward optimization (WFO), and it helps to solve two critical shortcomings of traditional optimization: 1.) traditional hypothetical performance records are idealized, "in-sample" results that usually vary widely from real-time results, and 2.) traditional optimization has no articulated way to adapt to different markets or to changes within a market.
Walk-forward optimization optimizes a system over a set time period from the past--say three years--and applies the "best" parameters to a succeeding time period, say one year. The performance results of the three years are "in-sample," meaning that they are derived from data (the "study period" data) that was used to formulate the system rules or parameters. These results are curve-fitted, idealized performance that is rarely matched in actual trading. The performance in the one year after the study period is "out-of-sample," meaning that it is the result of trading outside the period from which the parameters were chosen. This succeeding period is also called the "application period." The performance from the application period is much closer to what would have been achieved in actual trading, since we could theoretically have been trading during that time with that set of rules and parameters. In fact, if we use realistic slippage and commissions, we have what can be called "real-time" results, even though they are hypothetical in the sense that the trades were not actually made. These results are categorically more trustworthy than backtested results from a system's study period, and this is the most important advantage of WFO.
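A single optimize-then-apply step of the kind described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy momentum rule, the function names, and the candidate lookback values are hypothetical and are not the TradingVisions system or protocol, and slippage and commissions are ignored.

```python
from typing import Dict, Sequence, Tuple


def momentum_profit(prices: Sequence[float], lookback: int,
                    start: int, end: int) -> float:
    """Toy system (an assumption for illustration only): hold long from
    bar i to bar i + 1 whenever prices[i] > prices[i - lookback].
    Only trades that begin and end inside [start, end) are counted."""
    profit = 0.0
    for i in range(max(start, lookback), end - 1):
        if prices[i] > prices[i - lookback]:
            profit += prices[i + 1] - prices[i]
    return profit


def walk_forward_step(prices: Sequence[float], candidates: Sequence[int],
                      study_start: int, study_end: int,
                      apply_end: int) -> Tuple[int, float, float]:
    """Optimize on the study period, then apply the winner unchanged
    to the application period that follows it."""
    # In-sample: every candidate is scored on the study-period data.
    in_sample: Dict[int, float] = {
        n: momentum_profit(prices, n, study_start, study_end)
        for n in candidates
    }
    best = max(in_sample, key=in_sample.get)
    # Out-of-sample: the chosen parameter is never re-tuned here.
    out_of_sample = momentum_profit(prices, best, study_end, apply_end)
    return best, in_sample[best], out_of_sample


# Synthetic, steadily rising prices (made-up data, not market results).
prices = [100.0 + i for i in range(60)]
best, in_profit, out_profit = walk_forward_step(
    prices, candidates=[2, 5, 10],
    study_start=0, study_end=40, apply_end=60)
```

The point of the separation is that `out_profit` comes from data the parameter search never saw, which is what makes it the more trustworthy of the two numbers.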
The "walk-forward" aspect of this approach comes in the next step. If I used data from 1.1.2001 to 12.31.2003 as the first study period of the analysis and 2004 as my first application year, the 2004 trading results are the first year of my hypothetical performance record. I now walk forward a year and use the data from 1.1.2002 to 12.31.2004 as my next study period, and I apply the "best" parameters from this period to 2005. These 2005 out-of-sample results are added to my 2004 hypothetical performance record. I continue to walk forward to 1.1.2003-12.31.2005, apply the best parameters to 2006, and I now have a 3-year hypothetical out-of-sample track record that should be reasonably close to what I can achieve in actual trading.
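The rolling schedule in this example can be generated mechanically. The sketch below assumes whole calendar years and a step equal to the application period, so that application years tile without overlap; the function name and signature are hypothetical.

```python
from typing import List, Tuple

Window = Tuple[Tuple[int, int], Tuple[int, int]]


def walk_forward_windows(first_study_year: int,
                         last_application_year: int,
                         study_years: int = 3,
                         application_years: int = 1) -> List[Window]:
    """Return ((study_first, study_last), (apply_first, apply_last))
    year pairs, stepping forward by one application period each time."""
    windows: List[Window] = []
    start = first_study_year
    while start + study_years + application_years - 1 <= last_application_year:
        study = (start, start + study_years - 1)
        application = (start + study_years,
                       start + study_years + application_years - 1)
        windows.append((study, application))
        start += application_years
    return windows


windows = walk_forward_windows(2001, 2006)
for study, application in windows:
    print(f"study {study[0]}-{study[1]} -> apply {application[0]}-{application[1]}")
```

With the years used in the text, this yields the three study/application pairs described above: 2001-2003 applied to 2004, 2002-2004 applied to 2005, and 2003-2005 applied to 2006.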
There are several additional benefits to WFO.
In the past, I have worked very hard to have little or no difference in rules/parameters for different markets trading the same system. Optimizing to each market using only in-sample data is very risky because there is no logical point at which to stop doing it. The almost inevitable result is over-curve-fitting and trading losses.
WFO allows for using different parameters for different markets because the backtested out-of-sample performance results for each market provide the validation of the methodology. When only in-sample data is used, the validation is based upon an incestuous and circular relationship between the study data and the rules/parameters adopted: the results justify the rules/parameters, but the rules/parameters have been justified by the results. WFO takes us out of this pernicious loop and puts our expectations on a realistic foundation.
WFO also allows us to make in-flight corrections of our system parameters. In the past I have set an unswerving hoped-for course from hypothetical results to actual profits by carefully constructing a system and then not deviating from it. This straight-line approach ignores the fact that markets continually and of necessity change nature: Markets are moving targets. WFO allows a system to adapt to market changes.
This is both its inherent advantage and its Achilles heel.
The advantage is that when there is a significant change in a market, if a system has rules that are sensitive to that change, the re-optimization will cause parameters that better fit the change to rise to the top of the optimization run.
The Achilles heel is that if the market changes don't have enough persistence, the parameters will be changed just in time to perform poorly during the next phase of the market. And this is why WFO is not the Holy Grail: important decisions are made in doing WFO that leave it susceptible to over-curve-fitting. Among these decisions are how long a study period and application period to use, how many iterations are ideal in an optimization, what makes the "best" measure of performance, what makes the "best" choice of the results in an optimization run, what point in time to use as the start of the first study period, and how much variance in approach is acceptable for different markets.
The answers to these questions are what make TradingVisions' approach to WFO proprietary.
For a WFO protocol to work, it must be robust, i.e. it must be identical either for all markets or at least for every market within a sector (indices vs. currencies) or for every system of a given type (day trading vs. position trading). This helps to minimize the chance of over-curve-fitting. The TradingVisions WFO protocol is the same for all markets.
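One way to picture "the same protocol for all markets" is a single immutable settings object that every market shares. The sketch below is purely illustrative: the field names are hypothetical, the 3-year/1-year/2001 values are borrowed from the worked example earlier in this article, and the remaining values are placeholders, since the actual TradingVisions choices are proprietary and are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WFOProtocol:
    """The decisions listed above, fixed once rather than tuned per market."""
    study_years: int
    application_years: int
    optimization_iterations: int
    performance_measure: str
    selection_rule: str
    first_study_start_year: int


# Placeholder values only; not the author's actual settings.
PROTOCOL = WFOProtocol(
    study_years=3,
    application_years=1,
    optimization_iterations=100,       # placeholder
    performance_measure="net_profit",  # placeholder
    selection_rule="top_result",       # placeholder
    first_study_start_year=2001,
)


def protocol_for(market: str) -> WFOProtocol:
    """Every market gets the identical protocol; the market name is
    deliberately ignored so no per-market tuning can creep in."""
    return PROTOCOL
```

Freezing the dataclass means the settings cannot be changed after the fact, which mirrors the aim of not re-tuning the protocol itself whenever one market disappoints.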
My continuing analysis shows excellent results. Even when WFO performance reports are compared with idealized in-sample reports, WFO is often better. And in over 80% of the cases, walk-forward results are better--usually far better--than the results of simply staying with the "best" parameters from the past. This is a powerful indication that walk-forward optimization is superior to in-sample optimization.
To see the first application of this exciting research, look at Delphi Universal, which has been revised using walk-forward optimization. The results for Delphi ER Day and Delphi EM Swing show the potential of this new approach. The improvement WFO has made in Delphi is most dramatic in a market-change year like 2006. For the e-mini Midcap, original Delphi made $700 in 2006 and is down $3,000 this year, through the end of June (this is using the TradeStation report, with $50 slippage/commission), with a max drawdown of $10,000. Delphi II would have made $9,080 in 2006 and $1,465 this year, with a max drawdown of $6,200. Keep in mind that these results for Delphi II are not idealized, backtest-optimized results, but are walk-forward, using the rules and parameter values determined from testing prior years, so this is a fair and compelling comparison to the original Delphi results.
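For readers who want the combined figures, the arithmetic behind the e-mini Midcap comparison can be laid out as below. The dollar amounts are the ones quoted above; the dictionary layout and key names are merely a convenient assumption, and the period covered is 2006 plus the current year through the end of June.

```python
# Figures quoted in the text (TradeStation report, $50 slippage/commission).
results = {
    "original Delphi": {"2006": 700, "ytd_june": -3000, "max_drawdown": 10000},
    "Delphi II": {"2006": 9080, "ytd_june": 1465, "max_drawdown": 6200},
}

# Net profit over 2006 plus the current year through June.
net = {name: r["2006"] + r["ytd_june"] for name, r in results.items()}

profit_gap = net["Delphi II"] - net["original Delphi"]
drawdown_reduction = (results["original Delphi"]["max_drawdown"]
                      - results["Delphi II"]["max_drawdown"])

for name, value in net.items():
    print(f"{name}: net {value:+,} dollars")
print(f"difference in net profit: {profit_gap:,} dollars")
print(f"reduction in max drawdown: {drawdown_reduction:,} dollars")
```

On the quoted numbers, the original version nets -$2,300 over the period and Delphi II nets +$10,545, a difference of $12,845, with the maximum drawdown $3,800 smaller.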