Detectors built on transformer language models report near-perfect accuracy on standard fake-news benchmarks, which suggests that the task is almost solved. We argue that much of this accuracy reflects a confound between writing register and veracity: in common benchmarks the real class is human-written while the fake class is machine-generated or machine-rewritten, so a detector can separate the classes by recognizing AI writing style rather than by judging truth. To test this, we design a two-regime evaluation. Phase 1 is the standard setup, which compares untouched, human-written real articles against laundered fake articles. Phase 2 is register-controlled: the real class is passed through the same cross-LLM laundering chains as the fake class, so both classes share one machine register and only veracity separates them. We train seven detectors on three datasets and evaluate each frozen detector under both regimes. Under Phase 1, detectors appear robust; under Phase 2, AUROC on WELFake collapses from about 99% to about 62%, and the largest models approach chance. The effect is benchmark-dependent (large on WELFake, mild on IFND, and near zero on GossipCop) and is confirmed by bootstrap testing with false discovery rate control. We recommend register-controlled evaluation as standard reporting practice.
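
The two-regime comparison can be illustrated with a minimal sketch. It assumes a frozen detector that outputs one "fake" score per article, a fake class shared by both phases, and a one-sided percentile bootstrap on the AUROC drop followed by Benjamini-Hochberg correction. The function names, the resampling scheme, and the choice of Benjamini-Hochberg are illustrative assumptions, not necessarily the procedure used in this work.

```python
import random

def auroc(fake_scores, real_scores):
    """AUROC as the fraction of (fake, real) pairs ranked correctly; ties count half."""
    if not fake_scores or not real_scores:
        raise ValueError("both classes must be non-empty")
    wins = sum((f > r) + 0.5 * (f == r) for f in fake_scores for r in real_scores)
    return wins / (len(fake_scores) * len(real_scores))

def bootstrap_auroc_drop(real_p1, real_p2, fake, n_boot=1000, seed=0):
    """Return (observed drop, one-sided p-value) for AUROC(Phase 1) - AUROC(Phase 2).

    Assumption: each class is resampled with replacement on its own, and the
    resampled fake class is reused in both phases because the phases share it.
    """
    rng = random.Random(seed)
    observed = auroc(fake, real_p1) - auroc(fake, real_p2)
    not_positive = 0
    for _ in range(n_boot):
        f = rng.choices(fake, k=len(fake))
        r1 = rng.choices(real_p1, k=len(real_p1))
        r2 = rng.choices(real_p2, k=len(real_p2))
        if auroc(f, r1) - auroc(f, r2) <= 0:
            not_positive += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (not_positive + 1) / (n_boot + 1)

def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up FDR control: True where the hypothesis is rejected at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected
```

In this sketch, one p-value would be computed per detector and dataset pair, and the full list passed to `benjamini_hochberg` so that the reported drops are controlled jointly rather than one at a time.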