
Out-of-Sample Testing: Why Your Backtest Needs It

Sarah Chen · 2/28/2026 (updated 5/3/2026) · 5 min read

Here's a scenario I see weekly: a trader optimizes a strategy on all available data, gets spectacular results, and declares the strategy ready for live trading. Two months later, they're scratching their head wondering why real performance doesn't match the backtest. The answer is almost always the same — they never tested on data the strategy hadn't already seen.

Out-of-sample testing is the absolute minimum bar for strategy validation. Not the gold standard — that's walk-forward analysis. OOS is more like the entrance exam. If your strategy can't pass this basic test, it shouldn't get anywhere near real capital.

The Contamination Problem

When you develop a strategy, every decision you make is influenced by the data you're looking at. You chose RSI over MACD because RSI performed better on your data. You set the period to 14 because that looked optimal. You added a volume filter because it improved results on your data.

Every one of these decisions "uses up" information from your dataset. By the time you've made 20 development decisions, your strategy has been sculpted to fit your specific dataset even if you never explicitly "optimized" anything. This is called implicit overfitting, and it's far more common than the explicit parameter-cranking version.

The only cure is untouched data — a clean sample that had zero influence on any development decision.

How to Implement OOS Testing Correctly

Step 1: Split your data before you start developing. This is critical. If you split after development, you've already been influenced by the full dataset. A common and correct approach:

| Portion | Use | Typical Size | When to Touch |
|---|---|---|---|
| Training set | Strategy development & optimization | 60-70% | Freely during development |
| Validation set | Tuning decisions (optional) | 10-15% | Sparingly during development |
| Test set | Final evaluation | 20-30% | Once only — the final exam |
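As a sketch, the three-way chronological split above can be done in a few lines of pandas. The price series here is synthetic placeholder data, and the variable names are illustrative, not part of any particular API:

```python
import pandas as pd

# Synthetic placeholder data; in practice, load your own OHLCV history.
dates = pd.date_range("2020-01-01", periods=1000, freq="D")
df = pd.DataFrame({"close": range(1000)}, index=dates)

# Chronological 70/15/15 split. Never shuffle time-series data before
# splitting, or future bars leak into the training set.
n = len(df)
train_set = df.iloc[: int(n * 0.70)]                 # development & optimization
val_set = df.iloc[int(n * 0.70): int(n * 0.85)]      # intermediate checks
test_set = df.iloc[int(n * 0.85):]                   # touched exactly once

print(len(train_set), len(val_set), len(test_set))   # 700 150 150
```

The key property is that every timestamp in the training set precedes every timestamp in the validation and test sets, so no development decision can be informed by future data.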

Step 2: Develop your strategy using only the training set. All parameter optimization, indicator selection, filter testing — everything happens on the training data only. Pretend the test set doesn't exist.

Step 3: Use the validation set for intermediate checks. This is optional but valuable. After making significant changes, test on the validation set to check you're not drifting into overfitting. The validation set gets "used up" over time, which is why the final test set remains untouched.

Step 4: Run the test set exactly once. When you're satisfied with your strategy, run it on the test set. Don't modify anything afterward. If the results disappoint you, do not go back and "adjust" the strategy using test set feedback — that contaminates it.

Interpreting OOS Results

The absolute performance on the OOS period matters less than the relative performance compared to in-sample. Here's how to interpret the ratio:

| OOS / IS Ratio | Interpretation |
|---|---|
| > 80% | Excellent — strategy is robust, minimal overfitting |
| 50-80% | Good — some performance degradation but strategy has real edge |
| 30-50% | Concerning — significant overfitting, simplify the strategy |
| < 30% | Failed — strategy is likely overfitted, reject or redesign |
| Negative | Clearly overfitted — the in-sample results were illusory |

Some degradation is expected and normal. Real markets have transaction costs, changing volatility, and evolving microstructure that backtest data may not perfectly capture. A 30-40% degradation is typical for decent strategies. If your OOS performance equals or exceeds IS performance, double-check your implementation — you might have a data leak.
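The ratio thresholds can be encoded as a quick verdict function. This is a minimal sketch (the function name and use of total return as the metric are assumptions; the same logic applies to Sharpe ratio or any other performance measure):

```python
def oos_is_verdict(oos_return: float, is_return: float) -> str:
    """Classify the OOS/IS performance ratio.

    Assumes is_return is positive -- an unprofitable in-sample
    strategy fails before this check is even relevant.
    """
    if oos_return < 0:
        return "failed: negative OOS, in-sample results were illusory"
    ratio = oos_return / is_return
    if ratio > 0.80:
        return "excellent: minimal overfitting"
    if ratio >= 0.50:
        return "good: real edge with some degradation"
    if ratio >= 0.30:
        return "concerning: simplify the strategy"
    return "failed: likely overfitted, reject or redesign"

# A 30% IS return that degrades to 18% OOS is a ratio of 0.60:
print(oos_is_verdict(0.18, 0.30))  # good: real edge with some degradation
```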

Common OOS Mistakes

Peeking at the test set. The most common and most destructive mistake. If you look at how your strategy performs on the test set, then modify the strategy, then test again — you've just turned your test set into a second training set. Every peek contaminates the sample.

Choosing the split point to favor results. If you try multiple split points and use the one where OOS looks best, you've optimized the split point — which is another form of overfitting. Choose your split before development and commit to it.

OOS period too similar to IS period. If your in-sample period is a bull market and your OOS period is the continuation of the same bull market, you're not testing robustness — you're testing within the same regime. Ideally, the OOS period should contain at least one market condition that differs from the IS period.

Not enough trades in OOS. If your OOS period only contains 15 trades, the results are statistically meaningless regardless of whether they're positive or negative. You need at least 50 trades, preferably 100+, for reliable OOS conclusions.
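A back-of-the-envelope way to see why 15 trades is meaningless: under the normal approximation to the binomial, the 95% confidence half-width of an observed win rate is 1.96·sqrt(p(1−p)/n). This sketch uses p = 0.5, the widest (most conservative) case:

```python
import math

def win_rate_ci_halfwidth(n_trades: int, win_rate: float = 0.5) -> float:
    """Approximate 95% confidence half-width for an observed win rate.

    Normal approximation to the binomial; p = 0.5 gives the widest
    interval, so this is a conservative bound.
    """
    return 1.96 * math.sqrt(win_rate * (1 - win_rate) / n_trades)

# 15 trades pins the win rate only to within about +/-25 points;
# 100 trades narrows it to under +/-10 points.
print(round(win_rate_ci_halfwidth(15), 3))   # 0.253
print(round(win_rate_ci_halfwidth(100), 3))  # 0.098
```

In other words, a 60% win rate over 15 OOS trades is statistically indistinguishable from a coin flip.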

"In God we trust. All others must bring data — out-of-sample data." — Adapted from W. Edwards Deming. The original quote applies to manufacturing quality, but it's equally valid for strategy quality.

Beyond Simple OOS: Anchored Walk-Forward

A more sophisticated approach is anchored walk-forward: keep the start date fixed but progressively extend the in-sample period, re-optimizing at each step. This gives you multiple OOS tests while maintaining a growing training dataset.

The progression looks like this: Year 1-2 train / Year 3 test → Year 1-3 train / Year 4 test → Year 1-4 train / Year 5 test. Each test period is completely fresh, and the training set grows with each iteration.
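That progression can be sketched as a simple split generator. The year labels and function name here are illustrative, not tied to any particular dataset or library:

```python
def anchored_walk_forward(start_year: int, end_year: int,
                          min_train_years: int = 2):
    """Yield (train_years, test_year) pairs with a fixed (anchored) start.

    The training window always begins at start_year and grows by one
    year per step; each test year is used exactly once.
    """
    splits = []
    for test_year in range(start_year + min_train_years, end_year + 1):
        train_years = list(range(start_year, test_year))
        splits.append((train_years, test_year))
    return splits

# Five years of data produce three anchored folds, matching the
# Year 1-2/3, Year 1-3/4, Year 1-4/5 progression:
for train_years, test_year in anchored_walk_forward(2021, 2025):
    print(train_years, "->", test_year)
```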

This method is less common than rolling walk-forward but has the advantage of never "forgetting" older data that might contain valuable patterns from rare market events.

For the complete validation toolkit, combine OOS testing with anti-overfitting techniques and results interpretation.

Validate your strategies properly. StratBase.ai makes it easy to split data into in-sample and out-of-sample periods and compare performance — ensuring your backtest results aren't just historical artifacts.


Further Reading

  • RSI on Investopedia
  • MACD on Investopedia
  • Backtesting on Investopedia

About the Author

Sarah Chen

Quantitative researcher with 8+ years in algorithmic trading and strategy backtesting. Specializes in technical indicator analysis and risk-adjusted performance metrics.

FAQ

What is out-of-sample testing?

Out-of-sample (OOS) testing means holding back a portion of historical data that you never use during strategy development. After optimizing your strategy on the in-sample portion, you test it once on the held-out data. This simulates how the strategy would perform on unseen data.

What percentage of data should be out-of-sample?

The standard split is 70% in-sample and 30% out-of-sample. Some practitioners use 60/40 for more conservative validation. The out-of-sample period should contain at least 50-100 trades and ideally cover different market conditions than the in-sample period.

Further reading

  • How to Validate a Trading Strategy Before Going Live
  • Walk-Forward Analysis: The Gold Standard of Strategy Validation
