Preprint
Article

This version is not peer-reviewed.

Multi-Timeframe Feature Engineering for Bitcoin Market Prediction: A Price-Level-Agnostic Machine Learning Approach

Submitted: 11 March 2026

Posted: 12 March 2026


Abstract
Predicting profitable entry signals in Bitcoin markets remains challenging because of price volatility, the absence of fundamental valuation frameworks, and methodological pitfalls that are common in the literature. In this study, we evaluate five machine learning classifiers using a 37-feature hierarchical multi-timeframe pipeline with price-level-agnostic normalisation across four temporal resolutions (15-minute, 4-hour, daily, and 3-day), spanning January 2020 to November 2025. Binary training labels are generated via majority-vote aggregation across 54 stop-loss/take-profit combinations, producing 6,951 approximately balanced samples (48.5% positive class). The five algorithms (Logistic Regression, Decision Tree, Random Forest, XGBoost, and LightGBM) are compared using expanding-window TimeSeriesSplit validation (5 folds). Random Forest achieves the highest cross-validated ROC-AUC (0.6086), and all models show modest but consistent discriminative ability (range 0.57--0.61). Feature importance analysis identifies 4-hour Bollinger Band position and RSI as the dominant predictors, with all timeframes contributing meaningfully. A true out-of-sample holdout on 1,136 independently generated 2025 samples confirms generalisation, with Logistic Regression achieving a ROC-AUC of 0.6087. We also identify and correct a subtle look-ahead bias in higher-timeframe data alignment; before correction, this bias inflated performance by approximately 0.20 ROC-AUC points. Event-driven backtesting on 2025 out-of-sample data yields a gross upper-bound return of +35.97% (185 trades, SL=1%, TP=2%, threshold=0.7, Sharpe=0.14) before transaction costs; after realistic round-trip fees, net returns are likely negligible. The central finding is that models with ROC-AUC $\approx$ 0.60 cannot reliably generate economically significant returns once transaction costs are accounted for.
The methodology provides a reproducible framework for ML-based binary classification studies requiring transparent, bias-corrected validation across diverse market regimes.
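
The look-ahead bias in higher-timeframe alignment described in the abstract can be illustrated with a small sketch. The abstract does not spell out the alignment rule the authors used, so the following is only one plausible form of the problem, under stated assumptions: higher-timeframe bars are indexed by their open time, an indicator value for a bar is only known once that bar has closed, and the 15-minute timestamps are decision times. The function names (`align_closed_bars`, `align_naive`) and the toy RSI values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def align_closed_bars(base_times, htf_open_times, htf_values, htf_duration):
    """For each base timestamp, return the value of the latest higher-timeframe
    bar that had fully closed by that time (None if no bar has closed yet)."""
    # A bar opened at t0 is only fully known at t0 + htf_duration.
    close_times = [t0 + htf_duration for t0 in htf_open_times]
    aligned = []
    for t in base_times:
        # Index of the last bar whose close time is <= t.
        i = bisect_right(close_times, t) - 1
        aligned.append(htf_values[i] if i >= 0 else None)
    return aligned


def align_naive(base_times, htf_open_times, htf_values):
    """Biased variant: matches on bar open time, so each base timestamp
    receives the value of a bar that is still forming (future information)."""
    aligned = []
    for t in base_times:
        i = bisect_right(htf_open_times, t) - 1
        aligned.append(htf_values[i] if i >= 0 else None)
    return aligned


# Toy data (assumed for illustration): three 4-hour bars, decisions every 2 hours.
start = datetime(2025, 1, 1)
four_hours = timedelta(hours=4)
htf_open_times = [start + k * four_hours for k in range(3)]
htf_rsi = [40.0, 55.0, 70.0]  # hypothetical values, each known only at bar close
base_times = [start + timedelta(hours=2 * k) for k in range(6)]

safe = align_closed_bars(base_times, htf_open_times, htf_rsi, four_hours)
naive = align_naive(base_times, htf_open_times, htf_rsi)

print(safe)   # [None, None, 40.0, 40.0, 55.0, 55.0]
print(naive)  # [40.0, 40.0, 55.0, 55.0, 70.0, 70.0]
```

In this toy case the naive alignment hands every decision time the value of the bar that is still in progress, which is exactly one bar ahead of what the bias-free version provides. Whether the paper's bias took precisely this form is an assumption; the sketch only shows why shifting higher-timeframe features to their close time matters.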
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated