Paper said +$484. The broker said −$346.

AlgoProven Research · June 2026 · #1 in a series on why backtests lie

We run our own futures bots on prop-firm accounts, and we log everything twice: once from the bot's point of view, once from the broker statement. This spring the two views disagreed by more than $800 on the same account. This post is the autopsy — and the reason we now refuse to trust any backtest that doesn't model fills.

The discrepancy

One of our accounts showed a healthy bot-reported P&L of +$484. The broker statement for the same period said −$346. Same trades, same account, same calendar days. On a single crude-oil session the bot booked +$95.10 while the broker recorded −$240 for the identical position.

Nothing was broken. Both numbers were "correct" — for the price each system believed the trade happened at. The bot anchored its accounting to the signal price. The broker anchored it to the fill.

How a fill artifact works

Most retail algo stacks (and most backtests) work like this: a bar closes, a condition fires, a market order goes out, and the system books the entry at the bar-close price that generated the signal. Live, we measured what actually happens in that gap on a micro crude (MCL) breakout:

                                      Price
Signal / intended entry (bar close)   87.43
Actual broker fill, ~1s later         87.64
Adverse gap                           21 ticks = $21 per contract, entry alone

Momentum entries make this systematically adverse: the same impulse that fires your signal is moving price away from you while your order travels. Pay roughly $21 per contract on entry, often again on exit, and a strategy that nets less than ~$40 per contract per trade on paper is, arithmetically, a loser live. Our paper engine was silently crediting us that gap on every single trade.
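To make the arithmetic explicit, here is a minimal Python sketch of the gap from the MCL example above. It assumes the standard MCL contract spec (a 0.01 tick worth $1.00 per contract), a long entry, and an exit that slips by the same amount as the entry. The function name adverse_gap is ours, for illustration only.

```python
from decimal import Decimal

# Assumed MCL contract spec: 0.01 minimum tick, worth $1.00 per contract.
TICK_SIZE = Decimal("0.01")
TICK_VALUE = Decimal("1.00")


def adverse_gap(signal_price: str, fill_price: str, side: str = "long"):
    """Return (ticks, dollars per contract) lost between signal and fill."""
    signal, fill = Decimal(signal_price), Decimal(fill_price)
    # For a long, a higher fill is worse; for a short, a lower fill is worse.
    diff = fill - signal if side == "long" else signal - fill
    ticks = int(diff / TICK_SIZE)
    return ticks, ticks * TICK_VALUE


# Prices taken from the measured MCL breakout (assumed here to be a long).
ticks, entry_cost = adverse_gap("87.43", "87.64")

# Assumption: the exit slips by the same amount as the entry.
round_trip_cost = 2 * entry_cost

print(f"{ticks} ticks, ${entry_cost} on entry, ${round_trip_cost} round trip")
```

Under those assumptions the round trip costs about $42 per contract, which is where the ~$40 break-even figure comes from. Decimal is used instead of float so that the tick count comes out as an exact integer.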

So we put every strategy behind a gate

We rebuilt our validation on an engine that simulates real order mechanics (market orders filled off the book, not at the signal print) and re-ran everything we traded or considered trading: four intraday strategy families across ES, NQ, CL and GC — 2008–2026, 143M data points.

All four families died. Profit factors that looked like 1.5–4+ on signal-anchored accounting collapsed to ≤1.0 once entries were filled the way a broker fills them. The live confirmation had already arrived by then: 40 live demo trades, −$2,554.88 net, 35% win rate — from systems whose paper curves pointed up and to the right.

Then we went further and brute-forced our own data for replacements: 1,000 condition × direction × holding-window combinations, locked protocol (train 2008–17, test 2018–26, 3× friction hurdle, t ≥ 3). Survivors: zero. Even the best in-sample candidate (+$166/trade, t = 4.9 in training) fell to +$44, t = 1.4 out of sample.
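A gate of this kind can be sketched in a few lines of Python. This is our reading of the protocol, not the exact implementation: we assume the "3× friction hurdle" means average per-trade P&L must be at least three times the per-trade friction cost, that t is the one-sample t-statistic of per-trade P&L against zero, and that a candidate must clear both tests in training and out of sample. The function names and the trade lists are hypothetical, and the P&L values are synthetic.

```python
import math
import statistics


def t_stat(pnls):
    """One-sample t-statistic of per-trade P&L against a mean of zero."""
    n = len(pnls)
    if n < 2:
        raise ValueError("need at least two trades")
    sd = statistics.stdev(pnls)
    if sd == 0:
        raise ValueError("zero variance: t-statistic undefined")
    return statistics.fmean(pnls) / (sd / math.sqrt(n))


def passes_gate(train, test, friction, hurdle=3.0, min_t=3.0):
    """True only if BOTH samples clear the friction hurdle and the t threshold."""
    for sample in (train, test):
        if statistics.fmean(sample) < hurdle * friction:
            return False
        if t_stat(sample) < min_t:
            return False
    return True


# Synthetic per-trade P&L in dollars, for illustration only.
train_pnl = [200, 150, 180, 140, 160, 170, 190, 130]
test_pnl = [300, -250, 200, -150, 120, -60, 180, -20]

# friction=42.0 is an assumed round-trip cost per trade.
print(passes_gate(train_pnl, test_pnl, friction=42.0))
```

The point of locking the protocol before looking at the test period is that a candidate cannot be rescued after the fact by loosening either threshold.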

What actually survived

Two things — both boring, both thin, both real on 18+ years of data with real fills:

Edge                                                 Profit factor   Consistency                       Sample
Overnight session-conditioned long (index futures)   1.30            18 of 19 years positive           N = 3,367
EU-session order-flow direction read                 1.10–1.16       8 of 8 years positive (2019–26)   multi-year, both indices

A profit factor of 1.3 is not a get-rich curve. It is, however, what a real, surviving edge tends to look like once the fill subsidy is removed. And here is the uncomfortable part for prop traders: an edge this size passes or fails an evaluation mostly on rule math — daily loss limits, trailing drawdown, consistency percentages — not on the strategy itself. A thin edge with violated rule math is indistinguishable from no edge at all.
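To see how rule math can decide the outcome, here is a small Python sketch that walks a sequence of daily P&L through two rules. Everything in it is an assumption for illustration: the function name, the dollar limits, the end-of-day trailing drawdown measured from peak closed equity, and the synthetic P&L series. Real prop-firm rules differ by firm (some trail intraday), and consistency percentages are not modeled here.

```python
def first_violation(daily_pnl, daily_loss_limit, trailing_drawdown):
    """Return (day, rule) for the first rule breach, or None if none occurs.

    Assumptions: limits are positive dollar amounts, and the trailing
    drawdown is measured end of day from the highest closed equity.
    """
    equity = 0.0
    peak = 0.0
    for day, pnl in enumerate(daily_pnl, start=1):
        if pnl <= -daily_loss_limit:
            return day, "daily loss limit"
        equity += pnl
        peak = max(peak, equity)
        if peak - equity >= trailing_drawdown:
            return day, "trailing drawdown"
    return None


# Synthetic daily P&L with a positive total (+$150): a thin, noisy edge.
daily = [300, -200, 250, -450, 400, -500, 350]

print(first_violation(daily, daily_loss_limit=1000, trailing_drawdown=800))
print(first_violation(daily, daily_loss_limit=1000, trailing_drawdown=500))
```

The same net-positive sequence survives the looser drawdown and fails the tighter one on day 6, without any change to the strategy.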

If you take one thing from this post: open your backtest and check what price your entries are booked at. If it is the same bar close that generated the signal, your equity curve contains a subsidy the market will not pay you. Re-run with fills anchored to the next tradeable price and watch what happens to the profit factor.
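A rough Python sketch of that check follows. It assumes long-only trades, takes the next bar's open as the "next tradeable price" (a real fill can be worse), uses the same exit for both runs so that only the entry anchor differs, and runs on synthetic bars priced in ticks. The Bar class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    open: int   # prices in ticks, to keep the arithmetic exact
    close: int


def profit_factor(pnls):
    """Gross profit divided by gross loss."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def long_trade_pnls(bars, signal_idx, hold, anchor):
    """P&L per long trade; the signal fires on the close of bar i.

    anchor="signal": entry booked at the close of the signal bar.
    anchor="next":   entry booked at the open of the following bar.
    Both variants exit at the open of the bar after the holding window.
    """
    pnls = []
    for i in signal_idx:
        exit_i = i + 1 + hold
        if exit_i >= len(bars):
            continue  # not enough data to close the trade
        entry = bars[i].close if anchor == "signal" else bars[i + 1].open
        pnls.append(bars[exit_i].open - entry)
    return pnls


# Synthetic bars: after each signal the next bar opens higher (momentum).
bars = [Bar(8700, 8710), Bar(8716, 8725), Bar(8730, 8728),
        Bar(8726, 8740), Bar(8752, 8748), Bar(8745, 8750),
        Bar(8750, 8760), Bar(8768, 8762), Bar(8755, 8750)]
signals = [0, 3, 6]

paper = long_trade_pnls(bars, signals, hold=1, anchor="signal")
real = long_trade_pnls(bars, signals, hold=1, anchor="next")
print(profit_factor(paper), profit_factor(real))
```

On this toy data the signal-anchored profit factor is 5.0 and the next-price version is 0.7. The numbers are synthetic, but the direction of the change is the thing to look for in your own backtest.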

Why we're publishing this

Because we lost real money learning it, the data is ours, and the failure mode is universal. AlgoProven exists to handle the part that kills algo traders after the strategy work: tracking every prop-firm rule in real time, locking in passes before they can be given back, and keeping the accounting anchored to broker truth instead of bot optimism. Our own bots run behind it, in public, right now — including the losing days.