Out-of-sample and walk-forward testing

Knight, Sheldon

Traders typically fall into two groups, discretionary and systematic. Discretionary traders rely mainly on instinct and experience with help from charts and simple indicators. These traders believe that the human brain is a better computer than the electronic kind. Systematic traders use computers to rigorously analyze past price data and develop mathematical models to signal trade entries and exits.

So, which group is more profitable? Traders report their earnings monthly and it is a fact that professional, systematic traders consistently make more money than discretionary traders do. Even so, many systematic traders lose money. The government constantly reminds us that past performance is no guarantee of future results, yet most traders develop trading strategies against historical data. And commercial testing software has made this fast and easy, even for people with limited computer or mathematical skill. However, most systems developed this way fall apart due to what mathematicians call too few “degrees of freedom.” In trading parlance, it’s known as “curve fitting.”

Unfortunately, it is impossible to trade the past. Every time we test a strategy or a set of parameters on a set of data, the degrees of freedom of the results are reduced. Testing again and again on the same historical data produces better-looking performance, but it reduces the chance that the strategy will be profitable in real-world trading.

Avoid this pitfall by using “out-of-sample” and “walk-forward” testing. First, divide the historical data into two segments covering different time periods. Develop the system using the first or “in-sample” period and then test the system to see how it performs on the second or “out-of-sample” period.
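The split itself is simple to express in code. The following is a minimal sketch, not a prescribed method: the function name split_by_date, the placeholder monthly prices, and the choice of January 2002 as the dividing date are all assumptions made for illustration (the 1995-2004 range is borrowed from the example later in this article).

```python
from datetime import date

def split_by_date(bars, split_date):
    """Split (date, price) bars into in-sample (before split_date)
    and out-of-sample (on or after split_date) segments."""
    in_sample = [bar for bar in bars if bar[0] < split_date]
    out_of_sample = [bar for bar in bars if bar[0] >= split_date]
    return in_sample, out_of_sample

# Placeholder monthly bars for 1995-2004 (made-up prices, not market data).
months = [(year, month) for year in range(1995, 2005) for month in range(1, 13)]
bars = [(date(year, month, 1), 100.0 + i) for i, (year, month) in enumerate(months)]

# Assumed split: develop on 1995-2001, hold back 2002-2004.
in_sample, out_of_sample = split_by_date(bars, date(2002, 1, 1))
print(len(in_sample), "in-sample bars,", len(out_of_sample), "out-of-sample bars")
```

The important point is the discipline rather than the code: all development and optimization touches only in_sample, and out_of_sample is looked at once, at the end.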

In effect, you are performing only one test on the out-of-sample data no matter how many ideas you tried on the in-sample data. The degrees of freedom are preserved, and out-of-sample performance is a much better indication of how the system will perform in real time. Of course, every time you repeat this process with a different system, degrees of freedom are lost. If you test enough times, the out-of-sample data become a part of the in-sample data, and the advantage of this method of testing is lost.
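A small simulation can make the last point concrete. This is an illustrative sketch under stated assumptions: each "system" is nothing more than a stream of random monthly returns with zero average edge, and the function name best_out_of_sample, the 36-month test period, and the 4% monthly volatility are invented for the example.

```python
import random

def best_out_of_sample(n_systems, n_months=36, seed=0):
    """Best total out-of-sample return among n_systems that have no real edge.

    Each system is modeled as n_months of zero-mean random returns
    (an assumption for illustration, not a model of any real strategy).
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_systems):
        totals.append(sum(rng.gauss(0.0, 0.04) for _ in range(n_months)))
    return max(totals)

# The more systems are tried against the same out-of-sample data,
# the better the best one looks, purely by chance.
for n in (1, 10, 100):
    print(n, "systems tried, best out-of-sample total:", round(best_out_of_sample(n), 3))
```

Because the seed is fixed, each larger run contains the smaller runs' systems, so the best result can only stay the same or improve as more systems are tried, even though none of them has any edge.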

Walk-forward testing carries the idea of out-of-sample testing one step further. Think of it as out-of-sample testing on steroids. It works like this. Let’s say you have 10 years of data, 1995-2004, for the markets you want to trade. Let’s also assume that your trading strategy needs a minimum of three years of data for testing and optimization. To begin, develop and optimize your system using only the first three years of data – in this example, 1995-1997. On these three years of data, try as many ideas as you like and optimize parameters in as many ways as you can think of. But do not look at any data after 1997 – don’t cheat! When you finally think that you have found the Holy Grail of trading systems, record the rules for your system and the optimum parameters. You will use these rules and optimized parameters for the final testing with new data starting with 1998.

Slide the three-year window of data forward a little – say one month. Now, the data that you are working with runs from the second month of 1995 through the first month of 1998. Repeat the analysis, including optimization, and record the rules and optimized parameters. In the final pass, use these parameters for the second month of 1998.
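The sliding schedule can be sketched as a generator of windows, where each set of optimized parameters applies to the period immediately following its own three-year window. The function name walk_forward_windows and its parameters are assumptions for illustration; months are represented as (year, month) tuples and all window ends are inclusive.

```python
def walk_forward_windows(first_year, last_year, train_months=36, step_months=1):
    """Yield (train_start, train_end, test_start, test_end) tuples.

    Each element is a (year, month) pair and every end is inclusive.
    The test period is the step_months immediately after the training window.
    """
    def to_year_month(index):
        # Convert a 0-based month index into (year, month).
        return first_year + index // 12, index % 12 + 1

    total_months = (last_year - first_year + 1) * 12
    start = 0
    while start + train_months < total_months:
        test_start = start + train_months
        test_end = min(test_start + step_months, total_months) - 1
        yield (to_year_month(start), to_year_month(test_start - 1),
               to_year_month(test_start), to_year_month(test_end))
        start += step_months

windows = list(walk_forward_windows(1995, 2004))
for window in windows[:2]:
    print(window)
print(len(windows), "windows in total")
```

With the article's numbers, the first window trains on January 1995 through December 1997 and is tested on January 1998; the second trains on February 1995 through January 1998 and is tested on February 1998.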

Continue walking forward and optimizing the three-year data periods. Record the results for use in the first month following the three-year optimization period. When your data finally runs out in 2004, go back and test the system for the entire period from 1998 to 2004. Switch the rules and parameters each month to use the ones that you found and recorded. In effect, you are performing a new out-of-sample test for each month. The system performance for these seven out-of-sample years (84 out-of-sample months) is a much better indication of how a system will perform in real time than the performance of any single time period used for optimization.

There is nothing magic about the assumed periods – three years for system development and one month for the walk-forward interval. Picking these two time parameters is a trade-off between optimization time and statistical validity of the results. In practice, I have found that three years for optimization and up to three months for the walk-forward interval work fairly well.
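Putting the whole procedure together, a walk-forward test is a loop that optimizes on each window and records only the performance of the period that follows it. The sketch below is a toy under loudly stated assumptions: the data are random monthly returns, the "system" is a hypothetical momentum rule with a single lookback parameter, and the names strategy_return, optimize, and walk_forward are invented here. It shows the bookkeeping, not a tradable strategy.

```python
import random

def strategy_return(returns, t, lookback):
    """Return earned in month t by a toy momentum rule (hypothetical):
    long if the prior `lookback` months summed positive, short otherwise."""
    signal = 1.0 if sum(returns[t - lookback:t]) > 0 else -1.0
    return signal * returns[t]

def optimize(returns, start, end, candidates):
    """Pick the lookback with the highest total return on months [start, end)."""
    def score(lookback):
        # Skip the first `lookback` months so signals use only window data.
        return sum(strategy_return(returns, t, lookback)
                   for t in range(start + lookback, end))
    return max(candidates, key=score)

def walk_forward(returns, train_months=36, step_months=1, candidates=(1, 3, 6, 12)):
    """Optimize on each training window, then record out-of-sample results
    for the step_months that immediately follow it."""
    results = []
    start = 0
    while start + train_months < len(returns):
        train_end = start + train_months
        best = optimize(returns, start, train_end, candidates)
        test_end = min(train_end + step_months, len(returns))
        for t in range(train_end, test_end):
            results.append((t, best, strategy_return(returns, t, best)))
        start += step_months
    return results

# 120 months of made-up returns standing in for 1995-2004.
random.seed(0)
monthly_returns = [random.gauss(0.0, 0.04) for _ in range(120)]
oos = walk_forward(monthly_returns)
print(len(oos), "out-of-sample months; total return",
      round(sum(r for _, _, r in oos), 4))
```

Changing step_months to 3 re-optimizes quarterly instead of monthly, which cuts the number of optimizations from 84 to 28 while still covering the same 84 out-of-sample months. That is the trade-off between optimization time and statistical validity described above.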

If the results of all of these out-of-sample months look good, you simply continue the walk-forward process in real time to find the parameters to use with real money. Another advantage to this method of system development and trading is that your system will better adapt to changes in market behavior over time. Markets do change with time – we have all seen systems that have made money for several years and then simply stopped working because the markets have changed.

Sheldon Knight is the president of K-Data Inc. E-mail: sknight@k-data.com.

Copyright Futures Magazine Group Apr 2005

Provided by ProQuest Information and Learning Company. All rights Reserved