Multi-Timeframe Feature Engineering for Bitcoin Market Prediction: A Price-Level-Agnostic Machine Learning Approach

Pedro Sobreiro; Domingos Martinho; Rui Martins; Ricardo Vardasca

doi:10.20944/preprints202603.0994.v1

Submitted: 11 March 2026

Posted: 12 March 2026

Abstract

Predicting profitable entry signals in Bitcoin markets remains challenging because of price volatility, the absence of fundamental valuation frameworks, and methodological pitfalls that are common in the literature. In this study, we evaluate five machine learning classifiers using a 37-feature hierarchical multi-timeframe pipeline with price-level-agnostic normalisation across four temporal resolutions (15-minute, 4-hour, daily, and 3-day), spanning January 2020 to November 2025. Binary training labels are generated via majority-vote aggregation across 54 stop-loss/take-profit combinations, producing 6,951 balanced samples (48.5% positive class). The five algorithms (Logistic Regression, Decision Tree, Random Forest, XGBoost, and LightGBM) are compared using expanding-window TimeSeriesSplit validation (5 folds). Random Forest achieves the highest cross-validated ROC-AUC (0.6086), and all models show modest but consistent discriminative ability (range 0.57 to 0.61). Feature importance analysis identifies 4-hour Bollinger Band position and RSI as the dominant predictors, with all timeframes contributing meaningfully. A true out-of-sample holdout on 1,136 independently generated 2025 samples confirms generalisation, with Logistic Regression achieving a ROC-AUC of 0.6087. We also identify and correct a subtle multi-timeframe look-ahead bias in higher-timeframe data alignment; before correction, this bias inflated performance by approximately 0.20 ROC-AUC points. Event-driven backtesting on 2025 out-of-sample data yields a gross upper-bound return of +35.97% (185 trades, SL = 1%, TP = 2%, threshold = 0.7, Sharpe = 0.14) before transaction costs; after realistic round-trip fees, net returns are likely negligible. The central finding is that models with ROC-AUC ≈ 0.60 cannot reliably generate economically significant returns once transaction costs are accounted for. The methodology provides a reproducible framework for ML-based binary classification studies requiring transparent, bias-corrected validation across diverse market regimes.
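The look-ahead bias mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the timestamps, and the RSI values are hypothetical, and it assumes the bias arises from joining higher-timeframe bars on their open time rather than their close time. A 4-hour bar that opens at 04:00 is only fully known at 08:00, so a 15-minute decision made at 05:00 should see the bar that closed at 04:00, not the one still forming.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def align_closed_bars(base_times, htf_open_times, htf_values, htf_duration):
    """For each base timestamp, return the value of the most recent
    higher-timeframe bar that had fully closed by that timestamp."""
    # A bar opened at t only becomes known at t + duration.
    close_times = [t + htf_duration for t in htf_open_times]
    aligned = []
    for ts in base_times:
        # Number of bars whose close time is <= ts.
        n_closed = bisect_right(close_times, ts)
        aligned.append(htf_values[n_closed - 1] if n_closed else None)
    return aligned


def align_by_open_time(base_times, htf_open_times, htf_values):
    """Leaky variant: joins on bar open time, so the final value of a
    still-forming bar is visible before the bar has closed."""
    aligned = []
    for ts in base_times:
        n_open = bisect_right(htf_open_times, ts)
        aligned.append(htf_values[n_open - 1] if n_open else None)
    return aligned


# Hypothetical data: three 4-hour bars opening at 00:00, 04:00, 08:00.
start = datetime(2025, 1, 1)
four_hours = timedelta(hours=4)
htf_open_times = [start + i * four_hours for i in range(3)]
htf_rsi = [41.0, 55.0, 63.0]  # made-up indicator values, one per bar

# Hypothetical 15-minute decision times: 01:00, 04:00, 05:00, 08:00.
base_times = [start + timedelta(minutes=15 * k) for k in (4, 16, 20, 32)]

safe = align_closed_bars(base_times, htf_open_times, htf_rsi, four_hours)
leaky = align_by_open_time(base_times, htf_open_times, htf_rsi)

print(safe)   # [None, 41.0, 41.0, 55.0]
print(leaky)  # [41.0, 55.0, 55.0, 63.0]
```

In the leaky join, every 15-minute row sees a higher-timeframe value one full bar ahead of what was actually observable at that moment. Under the stated assumption, that is the kind of misalignment that could inflate validation metrics.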

Keywords: bitcoin; machine learning; multi-timeframe feature engineering; temporal cross-validation; gradient boosting; price-agnostic features; look-ahead bias; binary classification

Subject: Business, Economics and Management - Finance

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
